E. Boj del Val, A. Grané
This work studies the performance of distance-based predictive models in the presence of outliers in moderately large data sets with multivariate heterogeneous predictors. To this end, several metrics on the predictor space, including classical and robust versions of Gower’s distance, are compared, and their effectiveness in predicting responses is evaluated by means of the mean squared error and other goodness-of-fit measures. Computations on real data sets are carried out with the dbstats package for R.
Keywords: db-glm; dbstats; mixed-type data; outliers; robust Gower’s distance; R
Scheduled
GT03.AMC4 Clustering and Classification
November 9, 2023 3:30 PM
CC3: Room 1