K. Mengersen
Many datasets of interest to statisticians are subject to privacy conditions, which can constrain access, analysis, sharing and the release of results. In this presentation, we will consider two ways in which this issue might be addressed. The first is federated learning, in which the analysis is undertaken in such a way that the data remain in situ and private. The second is synthetic data generation, in which the simulated data retain the salient characteristics of the original data while preserving the required privacy. We provide some extensions to the class of models that can be considered in federated learning, and an overview of synthetic generation of tabular data. The exposition of these ideas will be motivated by the creation of an Australian Cancer Atlas.
This research is a collaboration with QUT colleagues Conor Hassan and Dr Robert Salomone, and is funded by the Australian Research Council and Cancer Council Queensland.
Key reading:
C Hassan, R Salomone, K Mengersen (2023) Federated variational inference methods for structured latent variable models. arXiv preprint arXiv:2302.03314
C Hassan, R Salomone, K Mengersen (2023) Deep generative models, synthetic tabular data and differential privacy: an overview and synthesis. arXiv preprint arXiv:2307.15424
Keywords: Privacy conditions, Federated learning, Synthetic generation of the data, Cancer Atlas, Biostatistics
Scheduled
Statistics Plenary Session: K. Mengersen
8 November 2023, 09:00
CC1: Auditorium