Argumentative Debates for Transparent Bias Detection [Technical Report]
- URL: http://arxiv.org/abs/2508.04511v1
- Date: Wed, 06 Aug 2025 14:56:08 GMT
- Title: Argumentative Debates for Transparent Bias Detection [Technical Report]
- Authors: Hamed Ayoobi, Nico Potyka, Anna Rapberger, Francesca Toni
- Abstract summary: We propose a novel interpretable, explainable method for bias detection relying on debates about the presence of bias against individuals. Our method builds upon techniques from formal and computational argumentation, whereby debates result from arguing about biases within and across neighbourhoods. We provide formal, quantitative, and qualitative evaluations of our method, highlighting its strengths as well as its interpretability and explainability.
- Score: 18.27485896306961
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the use of AI systems in society grows, addressing potential biases that emerge from data or are learned by models is essential to prevent systematic disadvantages against specific groups. Several notions of (un)fairness have been proposed in the literature, alongside corresponding algorithmic methods for detecting and mitigating unfairness, but, with very few exceptions, these tend to ignore transparency. Yet interpretability and explainability are core requirements for algorithmic fairness, even more so than for other algorithmic solutions, given the human-oriented nature of fairness. In this paper, we contribute a novel interpretable, explainable method for bias detection relying on debates about the presence of bias against individuals, based on the values of protected features for the individuals and others in their neighbourhoods. Our method builds upon techniques from formal and computational argumentation, whereby debates result from arguing about biases within and across neighbourhoods. We provide formal, quantitative, and qualitative evaluations of our method, highlighting its strengths in performance against baselines, as well as its interpretability and explainability.
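To make the neighbourhood intuition concrete, here is a minimal sketch assuming a k-nearest-neighbour notion of neighbourhood and a simple tally of pro/con arguments; the function name `bias_debate` and the tally semantics are hypothetical simplifications, not the paper's formal argumentation semantics.

```python
# Minimal sketch of neighbourhood-based bias arguments (hypothetical names;
# the paper's actual argumentation semantics are richer than this tally).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def bias_debate(X, y_pred, protected, i, k=5):
    """Collect pro/con arguments for 'individual i is treated unfairly'.

    X         : (n, d) matrix of non-protected features
    y_pred    : (n,) model decisions (1 = favourable outcome)
    protected : (n,) protected-group membership
    i         : index of the individual under debate
    """
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X[i:i + 1])
    neighbours = [j for j in idx[0] if j != i][:k]

    pro, con = [], []  # arguments for / against the presence of bias
    for j in neighbours:
        if protected[j] != protected[i] and y_pred[j] != y_pred[i]:
            # a similar individual with a different protected value received
            # a different decision: evidence of bias
            pro.append(j)
        elif y_pred[j] == y_pred[i]:
            # a similar individual received the same decision: evidence against
            con.append(j)
    return {"pro": pro, "con": con, "biased": len(pro) > len(con)}
```

In the paper itself, such within- and across-neighbourhood arguments are organised into formal debates rather than a simple majority tally; the sketch only gestures at that structure.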
Related papers
- Explanations as Bias Detectors: A Critical Study of Local Post-hoc XAI Methods for Fairness Exploration [5.113545724516812]
This paper explores how explainability methods can be leveraged to detect and interpret unfairness. We propose a pipeline that integrates local post-hoc explanation methods to derive fairness-related insights.
arXiv Detail & Related papers (2025-05-01T19:03:18Z)
- On the Fairness, Diversity and Reliability of Text-to-Image Generative Models [68.62012304574012]
Multimodal generative models have sparked critical discussions on their reliability, fairness and potential for misuse. We propose an evaluation framework to assess model reliability by analyzing responses to global and local perturbations in the embedding space. Our method lays the groundwork for detecting unreliable, bias-injected models and tracing the provenance of embedded biases.
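A minimal sketch of the perturbation idea, with `generate` standing in for any black-box mapping from a prompt embedding to an output feature vector (the name and API are hypothetical; the paper's framework also distinguishes global from local perturbations):

```python
import numpy as np

def stability(generate, emb, sigma=0.05, n_trials=10, rng=None):
    """Mean cosine similarity between the output for an embedding and the
    outputs for randomly perturbed copies; values near 1 suggest a model
    that responds stably (reliably) to small embedding perturbations."""
    rng = rng or np.random.default_rng(0)
    base = generate(emb)
    sims = []
    for _ in range(n_trials):
        noisy = emb + sigma * rng.standard_normal(emb.shape)  # global perturbation
        out = generate(noisy)
        sims.append(out @ base / (np.linalg.norm(out) * np.linalg.norm(base)))
    return float(np.mean(sims))
```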
arXiv Detail & Related papers (2024-11-21T09:46:55Z)
- Advancing Fairness in Natural Language Processing: From Traditional Methods to Explainability [0.9065034043031668]
The thesis addresses the need for equity and transparency in NLP systems.
It introduces an innovative algorithm to mitigate biases in high-risk NLP applications.
It also presents a model-agnostic explainability method that identifies and ranks concepts in Transformer models.
arXiv Detail & Related papers (2024-10-16T12:38:58Z)
- Peer-induced Fairness: A Causal Approach for Algorithmic Fairness Auditing [0.0]
The European Union's Artificial Intelligence Act takes effect on 1 August 2024.
High-risk AI applications must adhere to stringent transparency and fairness standards.
We propose a novel framework that combines the strengths of counterfactual fairness and a peer-comparison strategy.
arXiv Detail & Related papers (2024-08-05T15:35:34Z)
- Parametric Fairness with Statistical Guarantees [0.46040036610482665]
We extend the concept of Demographic Parity to incorporate distributional properties in predictions, allowing expert knowledge to be used in the fair solution.
We illustrate the use of this new metric through a practical example of wages, and develop a parametric method that efficiently addresses practical challenges.
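Classical demographic parity, which this work extends with distributional properties, asks that positive-decision rates match across groups; a minimal check of the gap (the parametric extension itself is not reproduced):

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-decision rates between two groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

# Example: a 75% positive rate for group 0 vs 25% for group 1 gives a 0.5 gap.
print(demographic_parity_gap([1, 1, 0, 1, 0, 0, 0, 1], [0, 0, 0, 0, 1, 1, 1, 1]))
```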
arXiv Detail & Related papers (2023-10-31T14:52:39Z)
- Identifying Reasons for Bias: An Argumentation-Based Approach [2.9465623430708905]
We propose a novel model-agnostic argumentation-based method to determine why an individual is classified differently in comparison to similar individuals.
We evaluate our method on two datasets commonly used in the fairness literature and illustrate its effectiveness in the identification of bias.
arXiv Detail & Related papers (2023-10-25T09:47:15Z)
- Fairness Explainability using Optimal Transport with Applications in Image Classification [0.46040036610482665]
We propose a comprehensive approach to uncover the causes of discrimination in Machine Learning applications.
We leverage Wasserstein barycenters to achieve fair predictions and introduce an extension to pinpoint bias-associated regions.
This allows us to derive a cohesive system which uses the enforced fairness to measure each feature's influence on the bias.
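In one dimension the 2-Wasserstein barycenter has a closed form: average the groups' quantile functions. The sketch below uses that fact to "repair" model scores so the per-group distributions coincide; it is a generic barycentric repair under that assumption, not the paper's full pipeline (which extends the idea to localise bias-associated regions in images).

```python
import numpy as np

def barycenter_1d(samples_per_group, n_quantiles=100):
    """W2 barycenter of 1-D empirical distributions: the average of the
    groups' quantile functions (a standard closed form in one dimension)."""
    qs = np.linspace(0, 1, n_quantiles)
    return np.mean([np.quantile(s, qs) for s in samples_per_group], axis=0)

def repair_scores(scores, group):
    """Map each group's scores onto the barycenter so the per-group score
    distributions (hence positive rates at any threshold) coincide."""
    scores, group = np.asarray(scores, float), np.asarray(group)
    groups = np.unique(group)
    bary = barycenter_1d([scores[group == g] for g in groups])
    fair = np.empty_like(scores)
    for g in groups:
        mask = group == g
        ranks = np.argsort(np.argsort(scores[mask]))
        qs = (ranks + 0.5) / mask.sum()  # within-group quantile of each score
        fair[mask] = np.interp(qs, np.linspace(0, 1, len(bary)), bary)
    return fair
```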
arXiv Detail & Related papers (2023-08-22T00:10:23Z)
- Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP [64.45845091719002]
Modern NLP systems exhibit a range of biases, which a growing literature on model debiasing attempts to correct.
This paper seeks to clarify the current situation and plot a course for meaningful progress in fair learning.
arXiv Detail & Related papers (2023-02-11T14:54:00Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, such as weakening or deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
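D-BIAS's simulation method is its own contribution and is not reproduced here; below is a toy illustration, assuming a linear Gaussian structural causal model with hypothetical variables, of what deleting a biased edge and resimulating can look like:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Toy linear SCM (hypothetical structure): protected -> income -> loan,
# plus a direct biased edge protected -> loan with weight 0.5.
protected = rng.integers(0, 2, n).astype(float)
income = 1.0 * protected + rng.normal(0, 1, n)
loan = 0.8 * income + 0.5 * protected + rng.normal(0, 0.5, n)

# User intervention: delete the biased edge protected -> loan, then
# resimulate only the affected variable, keeping the rest of the SCM fixed.
loan_debiased = 0.8 * income + rng.normal(0, 0.5, n)

print(np.corrcoef(protected, loan)[0, 1])           # direct + indirect effect
print(np.corrcoef(protected, loan_debiased)[0, 1])  # indirect effect only
```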
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Conditional Supervised Contrastive Learning for Fair Text Classification [59.813422435604025]
We study learning fair representations that satisfy a notion of fairness known as equalized odds for text classification via contrastive learning.
Specifically, we first theoretically analyze the connections between learning representations with a fairness constraint and conditional supervised contrastive objectives.
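Equalized odds requires the true- and false-positive rates to match across groups; a minimal check of both gaps (the contrastive training objective itself is not reproduced):

```python
import numpy as np

def equalized_odds_gaps(y_true, y_pred, group):
    """Per-outcome gaps in positive-prediction rates across two groups:
    the gap at y=1 is the TPR difference, at y=0 the FPR difference.
    Assumes both groups contain examples of both outcomes."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    gaps = {}
    for y in (0, 1):
        rates = [y_pred[(group == g) & (y_true == y)].mean() for g in (0, 1)]
        gaps[y] = abs(rates[0] - rates[1])
    return gaps  # equalized odds holds when both gaps are 0
```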
arXiv Detail & Related papers (2022-05-23T17:38:30Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
The proposed metric, ACCUMULATED PREDICTION SENSITIVITY, measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
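A simplified, finite-difference stand-in for the idea of prediction sensitivity (the paper's accumulated metric and its weighting scheme are not reproduced; `predict_proba` is a hypothetical scalar-valued model):

```python
import numpy as np

def prediction_sensitivity(predict_proba, x, eps=1e-3, weights=None):
    """Finite-difference sensitivity of a model's positive-class probability
    to small perturbations of each input feature, optionally weighted
    (e.g. to emphasise protected features)."""
    x = np.asarray(x, dtype=float)
    base = predict_proba(x)
    grads = np.array([
        (predict_proba(x + eps * np.eye(len(x))[j]) - base) / eps
        for j in range(len(x))
    ])
    w = np.ones_like(grads) if weights is None else np.asarray(weights)
    return np.abs(w * grads).sum()
```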
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Towards causal benchmarking of bias in face analysis algorithms [54.19499274513654]
We develop an experimental method for measuring algorithmic bias of face analysis algorithms.
Our proposed method is based on generating synthetic "transects" of matched sample images.
We validate our method by comparing it to a study that employs the traditional observational method for analyzing bias in gender classification algorithms.
arXiv Detail & Related papers (2020-07-13T17:10:34Z)