M. Navarro García, V. Guerrero, M. Durbán, A. del Cerro
Sparse feature selection is an active research topic in supervised learning, which strives to build interpretable models without sacrificing accuracy. Especially in the high-dimensional regime, it is desirable to assume that the true active set is sparse. In this work, we address the best subset selection problem in a general setting where each variable may enter the model linearly and/or non-linearly. The regression model is stated as a mixed integer quadratic optimization (MIQP) problem, and we propose a matheuristic approach based on the Akaike Information Criterion of the smooth components. In addition, we introduce a general framework based on the group lasso algorithm that provides solutions which significantly reduce the sizes of the problems the MIQP model must handle, thereby improving its performance. Our approach is compared with other state-of-the-art methodologies and proves competitive in terms of predictive power on both synthetic and real-world data sets.
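As an illustration of the group-lasso component mentioned above, the following is a minimal sketch of group-wise sparse selection via proximal gradient descent on a squared-error loss. It is not the authors' implementation; the solver, groups, penalty `lam`, and step-size choice are all illustrative assumptions. The selected groups could then seed (warm-start) a best-subset MIQP model.

```python
import numpy as np

def prox_group(v, groups, t):
    # Block soft-thresholding: proximal operator of t * sum_g ||v_g||_2,
    # which zeroes out entire groups whose norm falls below t.
    out = v.copy()
    for g in groups:
        norm = np.linalg.norm(v[g])
        out[g] = 0.0 if norm <= t else (1.0 - t / norm) * v[g]
    return out

def group_lasso(X, y, groups, lam, n_iter=500):
    # Proximal gradient descent for
    #   0.5 * ||y - X b||^2 + lam * sum_g ||b_g||_2
    # with a fixed step of 1 / L, L = largest singular value of X squared.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)
        b = prox_group(b - step * grad, groups, step * lam)
    return b

# Toy example (assumed data): 3 groups of 2 features, only group 0 active.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.standard_normal(100)
groups = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 6)]

b = group_lasso(X, y, groups, lam=5.0)
active = [i for i, g in enumerate(groups) if np.linalg.norm(b[g]) > 1e-8]
print(active)
```

In a matheuristic pipeline of this kind, the groups typically correspond to the linear part and the spline basis of each covariate, so zeroing a group drops that component from the candidate set passed to the exact MIQP solver.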
Keywords: Feature selection, Additive models, Mixed-integer programming
Scheduled
GT03.AMC1 Machine Learning
November 7, 2023 6:40 PM
CC2: Conference Room