Empowering Affected Individuals to Shape AI Fairness Assessments: Processes, Criteria, and Tools
- URL: http://arxiv.org/abs/2602.06984v1
- Date: Tue, 27 Jan 2026 12:51:01 GMT
- Title: Empowering Affected Individuals to Shape AI Fairness Assessments: Processes, Criteria, and Tools
- Authors: Lin Luo, Satwik Ghanta, Yuri Nakao, Mathieu Chollet, Simone Stumpf
- Abstract summary: Existing fairness assessments are typically conducted by AI experts or regulators using protected attributes and metrics.
Recent work has called for involving affected individuals in fairness assessment, yet little empirical evidence exists on how they create their own fairness criteria.
- Score: 5.72357951997548
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI systems are increasingly used in high-stakes domains such as credit rating, where fairness concerns are critical. Existing fairness assessments are typically conducted by AI experts or regulators using predefined protected attributes and metrics, which often fail to capture the diversity and nuance of fairness notions held by the individuals affected by these systems' decisions, such as decision subjects. Recent work has therefore called for involving affected individuals in fairness assessment, yet little empirical evidence exists on how they create their own fairness criteria or what kinds of criteria they produce - knowledge that could not only inform experts' fairness evaluation and mitigation, but also guide the design of AI assessment tools. We address this gap through a qualitative user study with 18 participants in a credit rating scenario. Participants first articulated their fairness notions in their own words, then turned these notions into concrete, quantified, and operationalized fairness criteria through an interactive prototype we designed. Our findings provide empirical evidence of the process through which people's fairness notions emerge via grounding in model features, and uncover a diverse set of individuals' custom-defined criteria for both outcome and procedural fairness. We provide design implications for processes and tools that support more inclusive and value-sensitive AI fairness assessment.
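To make "quantified and operationalized" concrete, here is a minimal Python sketch (not the paper's prototype) of two participant-style criteria for a credit-rating model; the feature names, the 0.10 threshold, and both criteria are illustrative assumptions.

```python
# A minimal sketch (not the paper's prototype) of operationalizing two
# participant-style fairness criteria for a credit-rating model.
# Feature names, the threshold, and the criteria are illustrative assumptions.
import numpy as np
import pandas as pd

def approval_rate_gap(df: pd.DataFrame, group_col: str, decision_col: str) -> float:
    """Outcome criterion: largest gap in approval rates between the
    groups a participant chose to compare."""
    rates = df.groupby(group_col)[decision_col].mean()
    return float(rates.max() - rates.min())

def monotone_in_feature(df: pd.DataFrame, feature_col: str, score_col: str) -> bool:
    """Procedural criterion: a higher value of a feature the participant
    considers legitimate (e.g. income) should never lower the score."""
    ordered = df.sort_values(feature_col)[score_col].to_numpy()
    return bool(np.all(np.diff(ordered) >= 0))

# Hypothetical decisions of a credit-rating model:
df = pd.DataFrame({
    "employment_type": ["salaried", "salaried", "self_employed", "self_employed"],
    "income": [30_000, 55_000, 32_000, 60_000],
    "score": [0.41, 0.68, 0.39, 0.71],
    "approved": [0, 1, 0, 1],
})

gap = approval_rate_gap(df, "employment_type", "approved")
print(f"approval-rate gap: {gap:.2f} (participant's threshold: 0.10) -> met: {gap <= 0.10}")
print(f"score monotone in income -> met: {monotone_in_feature(df, 'income', 'score')}")
```

The first function covers an outcome-fairness criterion, the second a procedural one, mirroring the two kinds of custom criteria the study reports.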
Related papers
- Partial Identification Approach to Counterfactual Fairness Assessment [50.88100567472179]
We introduce a Bayesian approach to bound unknown counterfactual fairness measures with high confidence.
Our results reveal a positive (spurious) effect on the COMPAS score when changing race to African-American (from all others) and a negative (direct causal) effect when transitioning from young to old age.
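For intuition only, a simplified frequentist sketch of why partial identification yields an interval rather than a point (the paper derives Bayesian bounds; the data-generating process below is a synthetic assumption):

```python
# A simplified illustration of partial identification via Manski-style
# bounds; the paper's Bayesian machinery is not reproduced here, and the
# data below are synthetic assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.integers(0, 2, n)             # hypothetical binary attribute
y = rng.binomial(1, 0.3 + 0.2 * x)    # hypothetical observed outcome

# Without assumptions linking the two "worlds", P(Y_{x=1}=1) satisfies
#   P(Y=1, X=1) <= P(Y_{x=1}=1) <= P(Y=1, X=1) + P(X=0)
lower = np.mean((y == 1) & (x == 1))
upper = lower + np.mean(x == 0)
print(f"P(Y_(x=1)=1) is only identified up to [{lower:.3f}, {upper:.3f}]")
```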
arXiv Detail & Related papers (2025-09-30T18:35:08Z)
- "I think this is fair": Uncovering the Complexities of Stakeholder Decision-Making in AI Fairness Assessment [5.919313327612488]
We show that stakeholders' fairness decisions are more complex than typical AI expert practices account for.
Our results extend the understanding of how stakeholders can meaningfully contribute to AI fairness governance and mitigation.
arXiv Detail & Related papers (2025-09-22T16:12:12Z)
- AI Judges in Design: Statistical Perspectives on Achieving Human Expert Equivalence With Vision-Language Models [3.092385483349516]
This paper introduces a rigorous statistical framework to determine whether an AI judge's ratings match those of human experts.
We apply this framework in a case study evaluating four VLM-based judges on key design metrics.
Results show that the top-performing AI judge achieves expert-level agreement for uniqueness and drawing quality.
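One standard statistical route to "expert equivalence" is an equivalence test such as TOST (two one-sided tests); whether the paper's framework uses TOST is an assumption here, and the ratings below are synthetic.

```python
# A hedged sketch of testing AI-judge vs. human-expert equivalence with
# TOST. The equivalence margin and the ratings are illustrative
# assumptions, not the paper's data or exact framework.
import numpy as np
from scipy import stats

def tost_paired(ai: np.ndarray, human: np.ndarray, margin: float) -> float:
    """p-value for equivalence of paired ratings within +/- margin."""
    d = ai - human
    p_low = stats.ttest_1samp(d, -margin, alternative="greater").pvalue
    p_high = stats.ttest_1samp(d, margin, alternative="less").pvalue
    return max(p_low, p_high)  # rejecting both one-sided nulls => equivalent

rng = np.random.default_rng(1)
human = rng.normal(7.0, 1.0, 40)          # hypothetical expert ratings
ai = human + rng.normal(0.05, 0.3, 40)    # hypothetical AI-judge ratings
print(f"TOST p-value (margin = 0.5 points): {tost_paired(ai, human, 0.5):.4g}")
```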
arXiv Detail & Related papers (2025-04-01T16:20:29Z)
- On the Fairness, Diversity and Reliability of Text-to-Image Generative Models [68.62012304574012]
Multimodal generative models have sparked critical discussions on their reliability, fairness and potential for misuse.
We propose an evaluation framework to assess model reliability by analyzing responses to global and local perturbations in the embedding space.
Our method lays the groundwork for detecting unreliable, bias-injected models and tracing the provenance of embedded biases.
arXiv Detail & Related papers (2024-11-21T09:46:55Z)
- Fairness Evaluation with Item Response Theory [10.871079276188649]
This paper proposes a novel Fair-IRT framework to evaluate fairness in Machine Learning (ML) models.
Detailed explanations for item characteristic curves (ICCs) are provided for particular individuals.
Experiments demonstrate the effectiveness of this framework as a fairness evaluation tool.
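For readers unfamiliar with IRT, an item characteristic curve in its common two-parameter-logistic form looks like the sketch below; the parameter values are illustrative assumptions, not taken from the paper.

```python
# A minimal two-parameter-logistic (2PL) item characteristic curve, the
# kind of ICC the Fair-IRT framework explains for particular individuals.
# The discrimination (a) and difficulty (b) values are assumptions.
import numpy as np

def icc_2pl(theta: np.ndarray, a: float, b: float) -> np.ndarray:
    """P(positive response | latent trait theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
print(np.round(icc_2pl(theta, a=1.2, b=0.5), 3))
```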
arXiv Detail & Related papers (2024-10-20T22:25:20Z)
- EARN Fairness: Explaining, Asking, Reviewing, and Negotiating Artificial Intelligence Fairness Metrics Among Stakeholders [5.216732191267959]
We propose a new framework, EARN Fairness, which facilitates collective metric decisions among stakeholders without requiring AI expertise.
The framework features an adaptable interactive system and a stakeholder-centered EARN Fairness process to Explain fairness metrics, Ask stakeholders' personal metric preferences, Review metrics collectively, and Negotiate a consensus on metric selection.
Our work shows that the EARN Fairness framework enables stakeholders to express personal preferences and reach consensus, providing practical guidance for implementing human-centered AI fairness in high-risk contexts.
arXiv Detail & Related papers (2024-07-16T07:20:30Z)
- Evaluating the Fairness of Discriminative Foundation Models in Computer Vision [51.176061115977774]
We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP).
We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy.
Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval and image captioning.
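A hedged sketch of the zero-shot setup such an evaluation builds on, using the Hugging Face transformers CLIP API; the image path, prompt set, and bias probe are illustrative assumptions, not the paper's protocol.

```python
# A minimal zero-shot CLIP probe: compare how strongly one image associates
# with each text prompt. The image path and prompts are hypothetical; a
# bias evaluation would repeat this across images of different groups.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("portrait.jpg")  # hypothetical input image
prompts = ["a photo of a doctor", "a photo of a nurse"]

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]

for prompt, p in zip(prompts, probs.tolist()):
    print(f"{prompt}: {p:.3f}")
```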
arXiv Detail & Related papers (2023-10-18T10:32:39Z)
- Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness [34.071324739205096]
We run a study with Machine Learning practitioners to understand the strategies they use to evaluate models.
We uncover fairness assessment strategies that draw on personal experiences, as well as practitioners forming groups of identity tokens to test model fairness.
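A minimal sketch of the identity-token strategy just described: swap tokens in a template and check how much the classifier's score moves. The template, token list, and toy scorer are illustrative assumptions.

```python
# A hedged sketch of identity-token substitution testing. The template,
# token list, and the toy scorer below are illustrative stand-ins for a
# real text classifier.
TEMPLATE = "The {} engineer presented the results."
IDENTITY_TOKENS = ["young", "elderly", "immigrant", "local"]

def token_sensitivity(score_fn, template: str, tokens: list[str]) -> float:
    """Largest score gap the classifier shows across identity tokens."""
    scores = [score_fn(template.format(tok)) for tok in tokens]
    return max(scores) - min(scores)

def toy_scorer(text: str) -> float:  # stand-in for model.predict_proba
    return 0.8 - (0.1 if "elderly" in text else 0.0)

print(f"max score gap across identity tokens: "
      f"{token_sensitivity(toy_scorer, TEMPLATE, IDENTITY_TOKENS):.2f}")
```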
arXiv Detail & Related papers (2023-03-01T17:12:49Z)
- Towards a multi-stakeholder value-based assessment framework for algorithmic systems [76.79703106646967]
We develop a value-based assessment framework that visualizes closeness and tensions between values.
We give guidelines on how to operationalize them, while opening up the evaluation and deliberation process to a wide range of stakeholders.
arXiv Detail & Related papers (2022-05-09T19:28:32Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
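In the spirit of that metric, a minimal perturbation-based sensitivity score might look as follows; the exact weighting used by ACCUMULATED PREDICTION SENSITIVITY is not reproduced, and the toy model is an assumption.

```python
# A hedged sketch of a perturbation-based prediction-sensitivity score.
# This is a plain finite-difference version, not the paper's exact
# weighted metric; the toy logistic model is an assumption.
import numpy as np

def prediction_sensitivity(predict, X: np.ndarray, eps: float = 1e-3) -> np.ndarray:
    """Mean absolute change in the positive-class score per unit
    perturbation of each input feature."""
    base = predict(X)
    sens = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] += eps
        sens[j] = np.mean(np.abs(predict(Xp) - base)) / eps
    return sens

w = np.array([2.0, -0.5, 0.1])                    # toy logistic weights
predict = lambda X: 1.0 / (1.0 + np.exp(-X @ w))
X = np.random.default_rng(0).normal(size=(200, 3))
print(np.round(prediction_sensitivity(predict, X), 3))
```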
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Measuring Fairness Under Unawareness of Sensitive Attributes: A Quantification-Based Approach [131.20444904674494]
We tackle the problem of measuring group fairness under unawareness of sensitive attributes.
We show that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem.
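To make the quantification idea concrete, a minimal sketch using Adjusted Classify & Count, one standard quantifier; whether the paper uses this exact quantifier is an assumption, and the data below are synthetic.

```python
# A hedged sketch of quantification under unawareness: estimate a group's
# prevalence from an imperfect attribute classifier with Adjusted Classify
# & Count. The error rates and data below are synthetic assumptions.
import numpy as np

def adjusted_classify_and_count(preds: np.ndarray, tpr: float, fpr: float) -> float:
    """Correct the raw predicted prevalence with the attribute
    classifier's tpr/fpr: p = (cc - fpr) / (tpr - fpr)."""
    cc = preds.mean()
    return float(np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0))

rng = np.random.default_rng(0)
true_group = rng.binomial(1, 0.30, 5_000)                      # unobserved attribute
pred_group = np.where(rng.random(5_000) < 0.85, true_group, 1 - true_group)

print(f"raw prevalence:      {pred_group.mean():.3f}")   # biased toward fpr
print(f"adjusted prevalence: "
      f"{adjusted_classify_and_count(pred_group, tpr=0.85, fpr=0.15):.3f}")
```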
arXiv Detail & Related papers (2021-09-17T13:45:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and accepts no responsibility for any consequences arising from its use.