On Explaining Proxy Discrimination and Unfairness in Individual Decisions Made by AI Systems
- URL: http://arxiv.org/abs/2509.25662v1
- Date: Tue, 30 Sep 2025 01:58:59 GMT
- Title: On Explaining Proxy Discrimination and Unfairness in Individual Decisions Made by AI Systems
- Authors: Belona Sonna, Alban Grastien
- Abstract summary: We propose a novel framework using formal abductive explanations to explain proxy discrimination in individual AI decisions. Our method identifies which features act as unjustified proxies for protected attributes, revealing hidden structural biases. As a proof of concept, we showcase the framework with examples taken from the German credit dataset.
- Score: 5.220940151628734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence (AI) systems in high-stakes domains raise concerns about proxy discrimination, unfairness, and explainability. Existing audits often fail to reveal why unfairness arises, particularly when rooted in structural bias. We propose a novel framework using formal abductive explanations to explain proxy discrimination in individual AI decisions. Leveraging background knowledge, our method identifies which features act as unjustified proxies for protected attributes, revealing hidden structural biases. Central to our approach is the concept of aptitude, a task-relevant property independent of group membership, with a mapping function aligning individuals of equivalent aptitude across groups to assess fairness substantively. As a proof of concept, we showcase the framework with examples taken from the German credit dataset, demonstrating its applicability in real-world cases.
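To make the mechanics concrete, the sketch below brute-forces abductive explanations for a toy credit model. The feature names, domains, and decision rule are hypothetical stand-ins, not the authors' implementation: an abductive explanation is a subset-minimal set of feature values that entails the model's decision however the remaining features are completed.

```python
from itertools import chain, combinations, product

# Hypothetical feature domains standing in for German credit attributes.
DOMAINS = {
    "housing": ["own", "rent", "free"],
    "employment_years": [0, 1, 5, 10],
    "credit_amount": [1000, 5000, 20000],
}

def model(x):
    # Toy stand-in for the audited classifier.
    return x["housing"] == "own" or x["employment_years"] >= 5

def entails(fixed, decision):
    # True iff every completion of the unfixed features preserves the decision.
    free = [f for f in DOMAINS if f not in fixed]
    return all(
        model({**fixed, **dict(zip(free, vals))}) == decision
        for vals in product(*(DOMAINS[f] for f in free))
    )

def abductive_explanations(x):
    decision = model(x)
    subsets = chain.from_iterable(
        combinations(DOMAINS, r) for r in range(len(DOMAINS) + 1)
    )
    minimal = []
    for s in subsets:  # enumerated in order of increasing size
        if any(set(m) <= set(s) for m in minimal):
            continue  # a smaller explanation already covers this subset
        if entails({f: x[f] for f in s}, decision):
            minimal.append(s)
    return minimal

# The decision below is entailed by housing == "own" alone; if "housing" is
# an unjustified proxy for a protected attribute, the explanation exposes
# the structural bias.
print(abductive_explanations(
    {"housing": "own", "employment_years": 1, "credit_amount": 5000}
))
```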
Related papers
- Partial Identification Approach to Counterfactual Fairness Assessment [50.88100567472179]
We introduce a Bayesian approach to bound unknown counterfactual fairness measures with high confidence. Our results reveal a positive (spurious) effect on the COMPAS score when changing race to African-American (from all others) and a negative (direct causal) effect when transitioning from young to old age.
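The paper's estimator isn't reproduced in the summary; as a generic illustration of the two ingredients, assumption-free partial-identification bounds combined with a Bayesian posterior over the observed distribution, one might write (counts are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed counts n[a, y] for attribute A and outcome Y.
counts = np.array([[400.0, 100.0],   # A=0: (Y=0, Y=1)
                   [300.0, 200.0]])  # A=1: (Y=0, Y=1)

def manski_bounds(p):
    # Assumption-free bounds on the counterfactual P(Y_{A=1} = 1):
    #   P(Y=1, A=1) <= P(Y_{A=1}=1) <= P(Y=1, A=1) + P(A=0)
    return p[1, 1], p[1, 1] + p[0].sum()

# Dirichlet posterior over the joint distribution, pushed through the bounds
# to obtain a high-confidence (95%) credible interval on the target quantity.
posterior = rng.dirichlet(counts.ravel() + 1.0, size=10_000).reshape(-1, 2, 2)
bounds = np.array([manski_bounds(p) for p in posterior])
print("95% credible bounds:",
      np.quantile(bounds[:, 0], 0.025), np.quantile(bounds[:, 1], 0.975))
```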
arXiv Detail & Related papers (2025-09-30T18:35:08Z) - Underrepresentation, Label Bias, and Proxies: Towards Data Bias Profiles for the EU AI Act and Beyond [42.710392315326104]
We present three common data biases and study their individual and joint effect on algorithmic discrimination. We develop dedicated mechanisms to detect specific types of bias and combine them into a preliminary construct we refer to as the Data Bias Profile (DBP). This initial formulation serves as a proof of concept for how different bias signals can be systematically documented.
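As a loose illustration only (the paper's detection mechanisms are not specified in the summary), two of the named biases can be screened with simple summary statistics and bundled into a profile; the column names and data are hypothetical:

```python
import pandas as pd

# Hypothetical dataset with a protected attribute and a binary label.
df = pd.DataFrame({
    "group": ["a"] * 80 + ["b"] * 20,
    "label": [1] * 48 + [0] * 32 + [1] * 4 + [0] * 16,
})

def data_bias_profile(df, group_col="group", label_col="label"):
    rep = df[group_col].value_counts(normalize=True)  # group representation
    pos = df.groupby(group_col)[label_col].mean()     # positive-label rates
    return {
        # Underrepresentation: smallest group share in the data.
        "underrepresentation": float(rep.min()),
        # Label-bias proxy: gap in positive-label rates across groups.
        "label_rate_gap": float(pos.max() - pos.min()),
    }

print(data_bias_profile(df))
```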
arXiv Detail & Related papers (2025-07-09T15:52:11Z) - Am I Being Treated Fairly? A Conceptual Framework for Individuals to Ascertain Fairness [0.7783262415147651]
We argue for the reification of fairness as a property of Automatic Decision Making (ADM) systems. We propose a conceptual framework to ascertain fairness by combining different tools that empower the end-users of ADM systems.
arXiv Detail & Related papers (2025-04-03T10:28:19Z) - AI Fairness in Practice [0.46671368497079174]
There is a broad spectrum of views across society on what the concept of fairness means and how it should be put into practice.
This workbook explores how a context-based approach to understanding AI Fairness can help project teams better identify, mitigate, and manage the many ways that unfair bias and discrimination can crop up across the AI project workflow.
arXiv Detail & Related papers (2024-02-19T23:02:56Z) - Evaluating the Fairness of Discriminative Foundation Models in Computer Vision [51.176061115977774]
We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP).
We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy.
Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval and image captioning.
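As an example of such a probe, zero-shot classification with the public CLIP checkpoint via the Hugging Face transformers API can be compared across images of different demographic groups; the prompts and image path below are placeholders:

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["a photo of a doctor", "a photo of a nurse"]  # probe prompts
image = Image.open("face.jpg")  # placeholder image path

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
# Repeating this over images grouped by a protected attribute and comparing
# the resulting label distributions gives a simple zero-shot disparity probe.
print(dict(zip(labels, probs[0].tolist())))
```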
arXiv Detail & Related papers (2023-10-18T10:32:39Z) - The Impact of Explanations on Fairness in Human-AI Decision-Making: Protected vs Proxy Features [25.752072910748716]
Explanations may help human-AI teams address biases for fairer decision-making.
We study the effect of the presence of protected and proxy features on participants' perception of model fairness.
We find that explanations help people detect direct but not indirect biases.
arXiv Detail & Related papers (2023-10-12T16:00:16Z) - Fairness Explainability using Optimal Transport with Applications in Image Classification [0.46040036610482665]
We propose a comprehensive approach to uncover the causes of discrimination in Machine Learning applications.
We leverage Wasserstein barycenters to achieve fair predictions and introduce an extension to pinpoint bias-associated regions.
This allows us to derive a cohesive system which uses the enforced fairness to measure each feature's influence on the bias.
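In one dimension the Wasserstein barycenter reduces to a weighted average of quantile functions, so the repair step can be sketched compactly. This is the standard 1-D construction with synthetic scores, not the paper's full system:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic model scores for two demographic groups.
scores = {0: rng.normal(0.4, 0.10, 600), 1: rng.normal(0.6, 0.15, 400)}
total = sum(len(s) for s in scores.values())
weights = {g: len(s) / total for g, s in scores.items()}

def fair_repair(score, group):
    # Quantile level of the score within its own group ...
    q = (scores[group] <= score).mean()
    # ... evaluated on the weighted average of group quantile functions,
    # i.e. the 1-D Wasserstein barycenter of the score distributions.
    return sum(w * np.quantile(scores[g], q) for g, w in weights.items())

# Individuals at the same within-group rank receive the same repaired score.
print(fair_repair(0.45, 0), fair_repair(0.67, 1))
```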
arXiv Detail & Related papers (2023-08-22T00:10:23Z) - Fair Decision-making Under Uncertainty [1.5688552250473473]
We study a longitudinal censored learning problem subject to fairness constraints.
We show how the newly devised fairness notions involving censored information, and the general framework for fair predictions in the presence of censorship, allow us to measure and mitigate discrimination under uncertainty.
arXiv Detail & Related papers (2023-01-29T05:42:39Z) - Causal Fairness Analysis [68.12191782657437]
We introduce a framework for understanding, modeling, and possibly solving issues of fairness in decision-making settings.
The main insight of our approach will be to link the quantification of the disparities present on the observed data with the underlying, and often unobserved, collection of causal mechanisms.
Our effort culminates in the Fairness Map, which is the first systematic attempt to organize and explain the relationship between different criteria found in the literature.
arXiv Detail & Related papers (2022-07-23T01:06:34Z) - Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and the agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z) - Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model.
We evaluate our framework on a large-scale publicly available skin lesion dataset.
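The summary doesn't spell out the modules, but a common way to realize such an adversarial multi-task setup is a shared encoder with a task head and a bias head trained through a gradient-reversal layer; a minimal PyTorch sketch with arbitrary sizes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity on the forward pass; reversed, scaled gradient on the backward pass.
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lamb * grad, None

class FairClassifier(nn.Module):
    def __init__(self, d_in, d_hid=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hid), nn.ReLU())
        self.task_head = nn.Linear(d_hid, 2)  # main prediction
        self.bias_head = nn.Linear(d_hid, 2)  # adversary predicting the protected attribute
    def forward(self, x, lamb=1.0):
        z = self.encoder(x)
        return self.task_head(z), self.bias_head(GradReverse.apply(z, lamb))

model = FairClassifier(d_in=10)
x = torch.randn(8, 10)
y = torch.randint(0, 2, (8,))   # task labels
a = torch.randint(0, 2, (8,))   # protected attribute
task_logits, bias_logits = model(x)
loss = F.cross_entropy(task_logits, y) + F.cross_entropy(bias_logits, a)
loss.backward()  # the encoder receives reversed gradients from the bias head
```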
arXiv Detail & Related papers (2021-03-07T03:10:32Z) - Towards Robust Fine-grained Recognition by Maximal Separation of Discriminative Features [72.72840552588134]
We identify the proximity of the latent representations of different classes in fine-grained recognition networks as a key factor in the success of adversarial attacks.
We introduce an attention-based regularization mechanism that maximally separates the discriminative latent features of different classes.
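The exact attention-based regularizer isn't given in the summary; a hedged approximation of the idea is a margin loss that pushes class prototypes (per-class feature means) apart:

```python
import torch

def separation_loss(features, labels, margin=5.0):
    # Mean feature vector (prototype) per class present in the batch.
    classes = labels.unique()
    protos = torch.stack([features[labels == c].mean(dim=0) for c in classes])
    # Pairwise prototype distances; penalize pairs closer than the margin.
    dists = torch.cdist(protos, protos)
    off_diag = dists[~torch.eye(len(classes), dtype=torch.bool)]
    return torch.clamp(margin - off_diag, min=0).mean()

features = torch.randn(16, 64, requires_grad=True)  # e.g. penultimate-layer features
labels = torch.arange(16) % 4                       # 4 classes, balanced batch
loss = separation_loss(features, labels)
loss.backward()  # gradients push class prototypes apart in feature space
```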
arXiv Detail & Related papers (2020-06-10T18:34:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content on this site (including all information) and is not responsible for any consequences of its use.