Robust clustering and visualization of heterogeneous multivariate data
In this work we propose two new robust metrics for heterogeneous multivariate data and study their performance as auxiliary tools for clustering with the k-medoids algorithm. Multidimensional scaling is additionally used to visualize the resulting clusters. The performance of the new proposals is evaluated on a collection of synthetic and real datasets with outlier contamination, and compared with that of classical metrics by means of adjusted accuracy and the adjusted Rand index. A Python library implementing the new proposals has been developed.
Keywords: Clustering, k-medoids, mixed-type data, robust metrics
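
The pipeline summarized above (a distance for mixed-type data, k-medoids on the resulting distance matrix, and multidimensional scaling for visualization) can be illustrated with a minimal NumPy sketch. It does not reproduce the robust metrics proposed here, nor the interface of the accompanying Python library: a classical Gower-style distance is used as a stand-in, and the function names `gower_distance`, `k_medoids` and `classical_mds`, as well as the toy data, are hypothetical choices made for illustration only.

```python
import numpy as np

def gower_distance(num, cat):
    """Gower-style distance (stand-in metric, not the robust proposals):
    range-scaled absolute difference on numeric columns, simple mismatch
    on categorical columns, averaged over all columns."""
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)
    ranges = num.max(axis=0) - num.min(axis=0)
    ranges[ranges == 0] = 1.0  # avoid division by zero on constant columns
    d_num = np.abs(num[:, None, :] - num[None, :, :]) / ranges
    d_cat = (cat[:, None, :] != cat[None, :, :]).astype(float)
    return np.concatenate([d_num, d_cat], axis=2).mean(axis=2)

def k_medoids(D, k, n_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix D."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(D.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        # Assign every observation to its nearest medoid.
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                # New medoid: member with the smallest total distance
                # to the other members of its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS via double centering of squared distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]
    # Negative eigenvalues can appear for non-Euclidean distances; clip them.
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0, None))

# Toy mixed-type data: two numeric columns and one categorical column.
rng = np.random.default_rng(42)
num = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
cat = np.array([["a"]] * 20 + [["b"]] * 20)

D = gower_distance(num, cat)
medoids, labels = k_medoids(D, k=2)
coords = classical_mds(D)  # 2-D coordinates, e.g. for a scatter plot colored by labels
print(medoids, labels)
```

Because k-medoids only needs the pairwise distance matrix, any other metric for mixed-type data could in principle be substituted for `gower_distance` without changing the rest of the sketch.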