An efficient Machine Learning pipeline for online data repair towards demographic parity

E. M. De-Diego, P. Gordaliza, J. López Fidalgo

Fair learning is an active area of research that seeks to ensure that predictive algorithms do not discriminate against any individual on the basis of personal characteristics. Recent preprocessing methods repair data by mapping the conditional distributions of the sensitive groups to their Wasserstein barycenter through optimal transport plans. However, these methods depend on the data available when the repair is computed, which limits their use in Machine Learning production environments. As time passes, new data becomes available, possibly acquired in a different socio-economic context, providing an opportunity to generate new predictions by retraining the Artificial Intelligence model. We propose a pipeline that integrates an efficient algorithm for handling online data, based on an interpolation function that computes the repaired version of new data. An efficient open-source implementation is available, which serves as evidence of how the gap between continuous and empirical transport theory can be bridged.

Keywords: Algorithmic Fairness, Machine Learning, Demographic Parity, Optimal Transport
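
To make the idea in the abstract concrete, the following is a minimal sketch of a one-dimensional repair towards the Wasserstein barycenter, with new (online) points repaired by interpolation. It is not the authors' implementation: the function names `fit_repair_maps` and `repair`, the grid size `n_grid`, and the use of NumPy are assumptions made for illustration. It relies on the known fact that, for a single continuous feature, the quantile function of the 2-Wasserstein barycenter is the weighted average of the group quantile functions.

```python
import numpy as np


def fit_repair_maps(x, s, n_grid=101):
    """Fit one 1-D repair map per sensitive group (hypothetical helper).

    x: feature values of the training sample; s: sensitive group labels.
    Returns {group: (group_quantiles, barycenter_quantiles)}.
    """
    x = np.asarray(x, dtype=float)
    s = np.asarray(s)
    levels = np.linspace(0.0, 1.0, n_grid)
    groups = np.unique(s)
    # Group proportions are used as barycenter weights.
    weights = np.array([np.mean(s == g) for g in groups])
    # Empirical quantile function of each group on a shared grid of levels.
    quantiles = np.array([np.quantile(x[s == g], levels) for g in groups])
    # In 1-D, the barycenter quantile function is the weighted average
    # of the group quantile functions.
    barycenter = weights @ quantiles
    return {g: (quantiles[i], barycenter) for i, g in enumerate(groups)}


def repair(x_new, s_new, maps):
    """Repair new points by interpolating the fitted transport maps."""
    x_new = np.asarray(x_new, dtype=float)
    s_new = np.asarray(s_new)
    # Points from groups not seen during fitting are left unchanged.
    out = x_new.copy()
    for g, (q_group, q_bar) in maps.items():
        mask = s_new == g
        # Piecewise-linear interpolation of the map q_group -> q_bar.
        # Values outside the fitted range are clamped to the end points.
        # Assumes a continuous feature, so q_group is increasing.
        out[mask] = np.interp(x_new[mask], q_group, q_bar)
    return out


# Usage on synthetic data (illustrative only).
rng = np.random.default_rng(0)
s_train = rng.integers(0, 2, size=2000)
x_train = rng.normal(loc=2.0 * s_train, scale=1.0)
maps = fit_repair_maps(x_train, s_train)

# New data arriving later is repaired without refitting the maps.
s_online = rng.integers(0, 2, size=500)
x_online = rng.normal(loc=2.0 * s_online, scale=1.0)
x_online_repaired = repair(x_online, s_online, maps)
```

In this sketch the maps are fitted once on the available sample and then reused for incoming points, which mirrors the online setting described above. A multivariate repair would need a genuine optimal transport solver rather than quantile averaging.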

Scheduled

Data Analysis

November 7, 2023 6:40 PM

HC2: Canónigos Room 2

Other papers in the same session

Topological Data Analysis and Complex Networks of the Spanish Stock Market during Covid-19

A. Mateos Caballero, V. Alcaraz López, A. Domínguez Monterroza, A. Moreno Díaz, A. Jiménez Martín

Adaptive Minimax Classification by Tightly Tracking Underlying Distributions

V. Álvarez, S. Mazuelas, J. A. Lozano

Two characterizations of the dense rank

J. L. García Lapresta, M. Martínez Panero
