E. M. De-Diego, P. Gordaliza, J. López Fidalgo
Fair learning is an active area of research that aims to ensure that predictive algorithms do not discriminate against individuals on the basis of personal characteristics. Recent preprocessing methods repair the data by mapping the conditional distributions of the sensitive groups to their Wasserstein barycenter through optimal transport plans. However, these methods depend on the data available at training time, which limits their use in machine learning production environments. As time passes, new data becomes available, possibly acquired within a different socio-economic context, providing an opportunity to generate new predictions by retraining the artificial intelligence model. We propose a pipeline that integrates an efficient algorithm for handling online data, based on an interpolation function that computes the repaired version of new data. An efficient open-source implementation is available and illustrates how the gap between continuous and empirical transport theory can be bridged.
Keywords: Algorithmic Fairness, Machine Learning, Demographic Parity, Optimal Transport
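
To make the idea of repairing new data by interpolation concrete, here is a minimal one-dimensional sketch in Python. It is not the authors' implementation. It assumes a single numeric feature, for which the Wasserstein barycenter of the group distributions has a quantile function equal to the weighted average of the group quantile functions. The function names `fit_repair_maps` and `repair`, the grid size `n_levels`, and the toy data are hypothetical choices made for illustration.

```python
import numpy as np


def fit_repair_maps(x_by_group, n_levels=101):
    """Estimate group quantiles and barycenter quantiles on a common grid.

    x_by_group: dict mapping a sensitive-group label to a 1-D array holding
    one numeric training feature (an assumption of this sketch).
    """
    levels = np.linspace(0.0, 1.0, n_levels)
    total = sum(len(x) for x in x_by_group.values())
    # Empirical quantile function of each group, evaluated on the grid.
    quantiles = {
        s: np.quantile(np.asarray(x, dtype=float), levels)
        for s, x in x_by_group.items()
    }
    # In 1-D, the barycenter quantile function is the weighted average of
    # the group quantile functions (weights = group proportions).
    bary = sum((len(x_by_group[s]) / total) * q for s, q in quantiles.items())
    return quantiles, bary


def repair(x_new, group, quantiles, bary):
    """Repair unseen values by piecewise-linear interpolation of the map
    from group quantiles to barycenter quantiles. Values outside the
    training range are clamped to the ends of the barycenter grid."""
    return np.interp(x_new, quantiles[group], bary)


# Toy training data for two hypothetical groups of equal size.
train = {"a": np.linspace(0.0, 1.0, 101), "b": np.linspace(2.0, 4.0, 101)}
quantiles, bary = fit_repair_maps(train)

# New observations arriving after training: the group medians.
repaired_a = repair(np.array([0.5]), "a", quantiles, bary)
repaired_b = repair(np.array([3.0]), "b", quantiles, bary)
```

In this toy example the two group medians are sent to the same repaired value, which is the behaviour a demographic-parity repair aims for. The maps are fitted once on the training data and then applied to new points without recomputing a transport plan. Extending the sketch to several features or to partial repair would require further assumptions.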
Scheduled
Data Analysis
November 7, 2023 6:40 PM
HC2: Canónigos Room 2