M. Comas Cufi, J. Palarea Albaladejo, J. A. Martín Fernández, G. Mateu Figueras

Compositional analysis of multivariate count data has grown in popularity in recent years, particularly in the molecular biosciences. The approach recognises that observed abundances represent only a fraction of the actual abundances in the studied environment. Sparsity is a common issue, however, with data sets often containing over 70% zero entries. Crucially, zeros prevent the computation of both logarithms (common in ordinary analysis) and logratios (used in compositional data analysis, CoDA), which has motivated a range of workarounds. We present a zero replacement method based on the logratio-normal-multinomial distribution, which compounds the logratio-normal and multinomial distributions. It offers a model-based, flexible alternative to common, often oversimplistic, practices. However, estimating the model parameters is computationally burdensome. Different formulations that make the method feasible in practice, especially in high-dimensional contexts, are discussed and compared by simulation.
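As an illustration of the compounding idea only, the following minimal sketch simulates counts from a logratio-normal-multinomial model and shows how zeros arise even though every underlying proportion is strictly positive. It is not the authors' estimation or replacement procedure. The additive logratio (alr) parameterisation, the independent normal coordinates (a diagonal covariance), the parameter values, and the function names `alr_inverse` and `sample_lnm` are all assumptions made for this example.

```python
import math
import random


def alr_inverse(h):
    """Map D-1 alr coordinates to a D-part composition (last part is the reference)."""
    z = list(h) + [0.0]          # the reference part has logratio 0
    m = max(z)                   # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def sample_lnm(n_trials, mu, sd, rng):
    """One draw: normal alr coordinates -> composition -> multinomial counts."""
    # Assumption: independent coordinates, i.e. a diagonal covariance matrix.
    h = [rng.gauss(m, s) for m, s in zip(mu, sd)]
    p = alr_inverse(h)
    # Multinomial sampling via n_trials categorical draws with probabilities p.
    draws = rng.choices(range(len(p)), weights=p, k=n_trials)
    counts = [0] * len(p)
    for j in draws:
        counts[j] += 1
    return counts, p


rng = random.Random(0)
mu = [-3.0, -2.0, -1.0, 0.0, 0.5]   # hypothetical alr means (6 parts)
sd = [1.0] * len(mu)                # hypothetical standard deviations
samples = [sample_lnm(30, mu, sd, rng) for _ in range(200)]

n_parts = len(mu) + 1
zero_fraction = sum(c == 0 for counts, _ in samples for c in counts) / (200 * n_parts)
print(f"fraction of zero counts: {zero_fraction:.2f}")
```

In this sketch the zeros are sampling zeros: parts with small proportions are simply not observed at a low total count. That is the situation in which a model-based replacement value is meaningful, since the latent proportion is positive and can in principle be estimated.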

Keywords: Compositional data, multivariate analysis, zeros, imputation

Scheduled

GT03.AMC3 Compositional Data
November 9, 2023, 4:50 PM
CC3: Room 1



