Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks
- URL: http://arxiv.org/abs/2507.20708v1
- Date: Mon, 28 Jul 2025 11:01:48 GMT
- Title: Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks
- Authors: Valentin Lafargue, Adriana Laurindo Monteiro, Emmanuelle Claeys, Laurent Risser, Jean-Michel Loubes
- Abstract summary: Regulation-driven audits increasingly rely on global fairness metrics. We show how to manipulate data samples to artificially satisfy fairness criteria. We then study how to detect such manipulation.
- Score: 4.44828379498865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proving the compliance of AI algorithms has become an important challenge with the growing deployment of such algorithms for real-life applications. Inspecting possible biased behaviors is mandatory to satisfy the requirements of the EU Artificial Intelligence Act. Regulation-driven audits increasingly rely on global fairness metrics, with Disparate Impact being the most widely used. Yet such global measures depend highly on the distribution of the sample on which they are computed. We first investigate how to manipulate data samples to artificially satisfy fairness criteria, creating minimally perturbed datasets that remain statistically indistinguishable from the original distribution while satisfying prescribed fairness constraints. We then study how to detect such manipulation. Our analysis (i) introduces mathematically sound methods for modifying empirical distributions under fairness constraints using entropic or optimal transport projections, (ii) examines how an auditee could potentially circumvent fairness inspections, and (iii) offers recommendations to help auditors detect such data manipulations. These results are validated through experiments on classical tabular datasets used in bias detection.
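To make the attack surface concrete, here is a minimal sketch of the kind of manipulation the paper studies: a minimal-KL (entropic) reweighting of an audit sample so that Disparate Impact reaches a prescribed threshold. It assumes binary predictions and a binary sensitive attribute, linearizes the DI constraint by holding group masses fixed, and finds the tilt by bisection; it illustrates the general idea, not the authors' exact projection.

```python
import numpy as np

def disparate_impact(y_hat, s, w=None):
    """DI = P(y_hat = 1 | s = 0) / P(y_hat = 1 | s = 1), optionally weighted."""
    w = np.ones(len(y_hat)) if w is None else w
    p0 = np.average(y_hat[s == 0], weights=w[s == 0])
    p1 = np.average(y_hat[s == 1], weights=w[s == 1])
    return p0 / p1

def entropic_projection(y_hat, s, tau=0.8):
    """Minimal-KL reweighting of the empirical sample so that DI >= tau.

    I-projection under a linearized DI constraint with group masses held
    fixed: within each group, weights are exponentially tilted by a score
    that pushes DI upward, and the tilt strength is found by bisection.
    """
    n = len(y_hat)
    if disparate_impact(y_hat, s) >= tau:
        return np.full(n, 1.0 / n)              # already compliant
    p = {g: np.mean(s == g) for g in (0, 1)}
    # positive score raises P(y_hat=1|s=0), negative lowers P(y_hat=1|s=1)
    a = np.where(s == 0, y_hat / p[0], -tau * y_hat / p[1])

    def weights(lam):
        w = np.exp(lam * a)
        for g in (0, 1):                        # preserve each group's mass
            m = s == g
            w[m] *= p[g] / w[m].sum()
        return w                                # sums to 1 overall

    lo, hi = 0.0, 1.0
    while disparate_impact(y_hat, s, weights(hi)) < tau:
        hi *= 2.0                               # find an upper bracket
    for _ in range(100):                        # bisect on the tilt strength
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if disparate_impact(y_hat, s, weights(mid)) < tau else (lo, mid)
    return weights(hi)

# a biased synthetic audit sample: DI ~ 0.57 before, ~0.80 after reweighting
rng = np.random.default_rng(0)
s = rng.integers(0, 2, 10_000)
y_hat = rng.binomial(1, np.where(s == 0, 0.4, 0.7))
w = entropic_projection(y_hat, s, tau=0.8)
print(disparate_impact(y_hat, s), disparate_impact(y_hat, s, w))
```

The auditor-side question the paper then studies is whether such a minimally tilted sample leaves a statistical signature that distinguishes it from the original distribution.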
Related papers
- On the Interconnections of Calibration, Quantification, and Classifier Accuracy Prediction under Dataset Shift [58.91436551466064]
This paper investigates the interconnections among three fundamental problems, namely calibration, quantification, and classifier accuracy prediction, under dataset shift conditions. We show that access to an oracle for any one of these tasks enables the resolution of the other two. We propose new methods for each problem based on direct adaptations of well-established methods borrowed from the other disciplines.
arXiv Detail & Related papers (2025-05-16T15:42:55Z)
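As a toy numeric illustration of that oracle argument, here is a standard black-box shift estimation step (a generic construction, not taken from the paper): quantification recovers the shifted class prior from a source confusion matrix, which in turn yields a classifier accuracy prediction on the target.

```python
import numpy as np

# C[i, j] = P(y_hat = i | y = j), estimated on labeled source validation data
C = np.array([[0.9, 0.2],
              [0.1, 0.8]])
p_pred = np.array([0.55, 0.45])   # distribution of predictions on the target

# quantification: under label shift, p_t(y_hat) = C @ p_t(y); solve for the prior
q = np.linalg.solve(C, p_pred)
q = np.clip(q, 0, None); q /= q.sum()

# classifier accuracy prediction: sum_j p_t(y = j) * P(y_hat = j | y = j)
print(q, float(np.diag(C) @ q))   # prior [0.5, 0.5], predicted accuracy 0.85
```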
- Noise-Adaptive Conformal Classification with Marginal Coverage [53.74125453366155]
We introduce an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise. We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets.
arXiv Detail & Related papers (2025-01-29T23:55:23Z)
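For context, the exchangeable baseline that the noise-adaptive method relaxes is ordinary split conformal classification; the following is the textbook construction, not the paper's algorithm.

```python
import numpy as np

def split_conformal_sets(cal_scores, test_scores, alpha=0.1):
    """cal_scores: nonconformity of the true label for each calibration point.
    test_scores: (n_test, n_classes) nonconformity of every candidate label.
    Returns boolean prediction-set masks with >= 1 - alpha marginal coverage
    under exchangeability."""
    n = len(cal_scores)
    level = np.ceil((n + 1) * (1 - alpha)) / n
    q = np.quantile(cal_scores, min(level, 1.0), method="higher")
    return test_scores <= q
```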
- Conformal Validity Guarantees Exist for Any Data Distribution (and How to Find Them) [14.396431159723297]
We show that conformal prediction can theoretically be extended to any joint data distribution.
Although the most general case is exceedingly impractical to compute, for concrete practical applications we outline a procedure for deriving specific conformal algorithms.
arXiv Detail & Related papers (2024-05-10T17:40:24Z)
- A Brief Tutorial on Sample Size Calculations for Fairness Audits [6.66743248310448]
This tutorial provides guidance on how to determine the required subgroup sample sizes for a fairness audit.
Our findings are applicable to audits of binary classification models and multiple fairness metrics derived as summaries of the confusion matrix.
arXiv Detail & Related papers (2023-12-07T22:59:12Z)
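A back-of-the-envelope version of such a calculation, using the usual normal approximation for a binomial rate (a generic bound, not necessarily the tutorial's method):

```python
from math import ceil
from scipy.stats import norm

def audit_subgroup_sample_size(margin, alpha=0.05, p=0.5):
    """Per-subgroup n so that a confusion-matrix rate (e.g., a group's TPR)
    is estimated within +/- margin at confidence 1 - alpha; p = 0.5 is the
    worst case for the binomial variance p * (1 - p)."""
    z = norm.ppf(1 - alpha / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

# estimating each group's TPR to within 5 points at 95% confidence
print(audit_subgroup_sample_size(0.05))   # 385 samples per subgroup
```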
- Assumption violations in causal discovery and the robustness of score matching [38.60630271550033]
This paper extensively benchmarks the empirical performance of recent causal discovery methods on observational i.i.d. data.
We show that score matching-based methods achieve surprisingly good false positive and false negative rates on the inferred graph.
We hope this paper will set a new standard for the evaluation of causal discovery methods.
arXiv Detail & Related papers (2023-10-20T09:56:07Z)
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework that exploits the different responses of normal and adversarial samples to universal adversarial perturbations (UAPs).
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
- Auditing Fairness by Betting [43.515287900510934]
We provide practical, efficient, and nonparametric methods for auditing the fairness of deployed classification and regression models. Our methods are sequential and allow for the continuous monitoring of incoming data. We demonstrate the efficacy of our approach on three benchmark fairness datasets.
arXiv Detail & Related papers (2023-05-27T20:14:11Z)
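In the test-by-betting spirit of this line of work, here is a toy sequential audit of demographic parity; the pairing scheme and the running-mean betting strategy are illustrative simplifications, not the paper's method.

```python
import numpy as np

def betting_audit(d, alpha=0.05):
    """Sequentially test H0: E[d_t] = 0 for a bounded stream d_t in [-1, 1],
    e.g. d_t = prediction for a group-0 arrival minus prediction for a paired
    group-1 arrival, so H0 is demographic parity. Under H0 the wealth process
    is a nonnegative martingale; by Ville's inequality it exceeds 1 / alpha
    with probability at most alpha, so crossing that line is a valid
    anytime-valid rejection."""
    wealth, lam, mean = 1.0, 0.0, 0.0
    for t, d_t in enumerate(d, start=1):
        wealth *= 1.0 + lam * d_t            # lam was chosen before seeing d_t
        if wealth >= 1.0 / alpha:
            return t, wealth                 # disparity detected at time t
        mean += (d_t - mean) / t             # running mean drives the next bet
        lam = float(np.clip(mean, -0.5, 0.5))
    return None, wealth                      # audit passes so far

# a stream with a true disparity of 0.2 is flagged within a few hundred pairs
rng = np.random.default_rng(1)
d = rng.binomial(1, 0.6, 5000) - rng.binomial(1, 0.4, 5000)
print(betting_audit(d))
```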
- Mitigating Algorithmic Bias with Limited Annotations [65.060639928772]
When sensitive attributes are not disclosed or available, a small part of the training data must be manually annotated in order to mitigate bias.
We propose Active Penalization Of Discrimination (APOD), an interactive framework to guide the limited annotations towards maximally eliminating the effect of algorithmic bias.
APOD shows comparable performance to fully annotated bias mitigation, which demonstrates that APOD could benefit real-world applications when sensitive information is limited.
arXiv Detail & Related papers (2022-07-20T16:31:19Z)
- A Sandbox Tool to Bias(Stress)-Test Fairness Algorithms [19.86635585740634]
We present the conceptual idea and a first implementation of a bias-injection sandbox tool to investigate fairness consequences of various biases.
Unlike existing toolkits, ours provides a controlled environment to counterfactually inject biases in the ML pipeline.
In particular, we can test whether a given remedy alleviates the injected bias by comparing the predictions made after the intervention against the true labels from the unbiased regime, that is, before any bias injection.
arXiv Detail & Related papers (2022-04-21T16:12:19Z)
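A minimal sketch of that counterfactual workflow on synthetic data, assuming a label-flipping bias and a simple reweighing remedy (the toolkit itself is more general):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
s = rng.integers(0, 2, n)                        # sensitive attribute
x = rng.normal(size=(n, 3)) + 0.5 * s[:, None]   # features correlated with s
y_true = (x.sum(axis=1) + rng.normal(size=n) > 1.0).astype(int)

# counterfactually inject label bias: flip 30% of group-1 positives to 0
y_biased = y_true.copy()
flip = (s == 1) & (y_true == 1) & (rng.random(n) < 0.3)
y_biased[flip] = 0

# candidate remedy: reweigh (group, label) cells toward independence
w = np.ones(n)
for g in (0, 1):
    for lbl in (0, 1):
        cell = (s == g) & (y_biased == lbl)
        w[cell] = np.mean(s == g) * np.mean(y_biased == lbl) / cell.mean()

plain = LogisticRegression(max_iter=1000).fit(x, y_biased)
fixed = LogisticRegression(max_iter=1000).fit(x, y_biased, sample_weight=w)

# judge each model against the unbiased regime, i.e. the pre-injection labels
for name, model in [("no remedy", plain), ("reweighed", fixed)]:
    print(name, (model.predict(x) == y_true).mean())
```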
- Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against algorithmic bias that incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
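A sketch of the general shape of such a training objective, task loss plus a weighted bias regularizer; the squared-covariance surrogate below merely stands in for the paper's information-theoretic measure.

```python
import torch
import torch.nn.functional as F

def debiased_loss(logits, y, s, lam=1.0):
    """Binary task loss plus a bias regularizer. Here the regularizer is the
    squared covariance between logits and the sensitive attribute, a simple
    stand-in for the paper's information-theoretic bias measure."""
    task = F.binary_cross_entropy_with_logits(logits, y.float())
    z = logits - logits.mean()
    a = s.float() - s.float().mean()
    return task + lam * (z * a).mean().pow(2)

logits = torch.randn(64, requires_grad=True)   # stand-in for model output
y = torch.randint(0, 2, (64,))                 # task labels
s = torch.randint(0, 2, (64,))                 # sensitive attribute
debiased_loss(logits, y, s).backward()         # gradients flow to the model
```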
- Adaptive Data Debiasing through Bounded Exploration and Fairness [19.082622108240585]
Biases in existing datasets used to train algorithmic decision rules can raise ethical, societal, and economic concerns.
We propose an algorithm for sequentially debiasing such datasets through adaptive and bounded exploration.
arXiv Detail & Related papers (2021-10-25T15:50:10Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)