P. Delicado, C. Pachón García

We present a set of algorithms implementing multidimensional scaling (MDS) for large data sets with n individuals. When n is large, classical MDS algorithms are unaffordable because of their extremely large memory and time requirements. We overcome these difficulties by means of three non-standard algorithms based on the central idea of partitioning the data set into small pieces on which classical MDS methods can work. To check the performance of the algorithms and to compare them, we carried out a simulation study. Additionally, we used the algorithms to obtain an MDS configuration for EMNIST, a real large data set with more than 800,000 points. We conclude that all three algorithms are appropriate for obtaining an MDS configuration, but we recommend using either of the two new proposals, since they are fast algorithms with satisfactory statistical properties when working with big data. An R package implementing the algorithms has been created.

Keywords: Computational efficiency, Divide and conquer, Gower’s interpolation formula, Landmark MDS, Procrustes transformation
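
The partition-and-align idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not reflect the API of their R package. It assumes Euclidean distances, a random partition, and a small set of shared anchor points taken from the first piece; the function names and the parameters part_size and n_anchor are hypothetical. Each piece is embedded with classical MDS together with the anchors, then mapped onto the first configuration by a rigid Procrustes transformation.

```python
import numpy as np


def classical_mds(X, k):
    """Classical (Torgerson) MDS from Euclidean distances between rows of X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    J = np.eye(n) - np.ones((n, n)) / n             # centering matrix
    B = -0.5 * J @ D2 @ J                           # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:k]                # k largest eigenvalues
    lam = np.clip(vals[top], 0.0, None)
    return vecs[:, top] * np.sqrt(lam)


def procrustes_fit(source, target):
    """Orthogonal matrix R and shift t minimizing ||source @ R + t - target||."""
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    U, _, Vt = np.linalg.svd((source - mu_s).T @ (target - mu_t))
    R = U @ Vt
    return R, mu_t - mu_s @ R


def divide_conquer_mds(X, k=2, part_size=200, n_anchor=20, seed=0):
    """Hypothetical sketch: embed each piece with shared anchors, then align."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    perm = rng.permutation(n)
    parts = [perm[i:i + part_size] for i in range(0, n, part_size)]
    Y = np.empty((n, k))
    first = parts[0]
    Y[first] = classical_mds(X[first], k)           # reference configuration
    anchors = first[:n_anchor]                      # assumption: anchors from first piece
    m = len(anchors)
    for part in parts[1:]:
        idx = np.concatenate([anchors, part])
        Z = classical_mds(X[idx], k)                # local configuration
        R, t = procrustes_fit(Z[:m], Y[anchors])    # align on the anchors only
        Y[part] = Z[m:] @ R + t
    return Y
```

Only matrices of size at most (part_size + n_anchor) squared are ever formed, which is what makes this kind of scheme affordable when n is large. The quality of the alignment depends on the anchors spanning the k-dimensional configuration.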

Scheduled

GT18.SOFTW1 Invited Session
November 7, 2023  4:50 PM
HC1: Canónigos Room 1



