Related papers: Practitioner Insights on Fairness Requirements in the AI Development Life Cycle: An Interview Study

URL: http://arxiv.org/abs/2512.13830v1
Date: Mon, 15 Dec 2025 19:12:34 GMT
Title: Practitioner Insights on Fairness Requirements in the AI Development Life Cycle: An Interview Study
Authors: Chaima Boufaied, Thanh Nguyen, Ronnie de Souza Santos
Abstract summary: We conducted research on fairness requirements in AI from a software engineering perspective. Our study assesses the participants' awareness of fairness in AI / ML software and its application within the Software Development Life Cycle (SDLC). Findings show that while our participants recognize the aforementioned AI fairness dimensions, practices are inconsistent, and fairness is often deprioritized with noticeable knowledge gaps.
Score: 3.5429774642987915
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Nowadays, Artificial Intelligence (AI), particularly Machine Learning (ML) and Large Language Models (LLMs), is widely applied across various contexts. However, the corresponding models often operate as black boxes, leading them to unintentionally act unfairly towards different demographic groups. This has led to a growing focus on fairness in AI software recently, alongside the traditional focus on the effectiveness of AI models. Through 26 semi-structured interviews with practitioners from different application domains and with varied backgrounds across 23 countries, we conducted research on fairness requirements in AI from a software engineering perspective. Our study assesses the participants' awareness of fairness in AI / ML software and its application within the Software Development Life Cycle (SDLC), from translating fairness concerns into requirements to assessing their arising early in the SDLC. It also examines fairness through the key assessment dimensions of implementation, validation, evaluation, and how it is balanced with trade-offs involving other priorities, such as addressing all the software functionalities and meeting critical delivery deadlines. Findings of our thematic qualitative analysis show that while our participants recognize the aforementioned AI fairness dimensions, practices are inconsistent, and fairness is often deprioritized with noticeable knowledge gaps. This highlights the need for agreement with relevant stakeholders on well-defined, contextually appropriate fairness definitions, the corresponding evaluation metrics, and formalized processes to better integrate fairness into AI/ML projects.

Related papers

A Gray Literature Study on Fairness Requirements in AI-enabled Software Engineering [3.5429774642987915]
This paper presents a review of existing gray literature, examining fairness requirements in AI context. Our gray literature investigation shows various definitions of fairness requirements in AI systems. Fairness requirement violations are frequently linked, but not limited, to data representation bias, algorithmic and model design bias, human judgment, and evaluation and transparency gaps.
arXiv Detail & Related papers (2025-12-08T19:22:01Z)
An Approach to Grounding AI Model Evaluations in Human-derived Criteria [0.0]
We propose a novel approach to augment existing benchmarks with human-derived evaluation criteria. Grounding our study in the Perception Test and OpenEQA benchmarks, we conducted in-depth interviews and large-scale surveys. Our findings reveal that participants perceive AI as lacking in interpretive and empathetic skills yet hold high expectations for AI performance.
arXiv Detail & Related papers (2025-09-04T21:40:32Z)
Software Fairness Testing in Practice [0.21427777919040417]
This study investigates how software professionals test AI-powered systems for fairness through interviews with 22 practitioners working on AI and ML projects. Our findings highlight a significant gap between theoretical fairness concepts and industry practice. Key challenges include data quality and diversity, time constraints, defining effective metrics, and ensuring model interoperability.
arXiv Detail & Related papers (2025-06-20T16:03:02Z)
The AI Imperative: Scaling High-Quality Peer Review in Machine Learning [49.87236114682497]
We argue that AI-assisted peer review must become an urgent research and infrastructure priority. We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting ACs in decision-making.
arXiv Detail & Related papers (2025-06-09T18:37:14Z)
An Overview of Large Language Models for Statisticians [109.38601458831545]
Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI). This paper explores potential areas where statisticians can make important contributions to the development of LLMs. We focus on issues such as uncertainty quantification, interpretability, fairness, privacy, watermarking and model adaptation.
arXiv Detail & Related papers (2025-02-25T03:40:36Z)
AI-generated Image Quality Assessment in Visual Communication [72.11144790293086]
AIGI-VC is a quality assessment database for AI-generated images in visual communication. The dataset consists of 2,500 images spanning 14 advertisement topics and 8 emotion types. It provides coarse-grained human preference annotations and fine-grained preference descriptions, benchmarking the abilities of IQA methods in preference prediction, interpretation, and reasoning.
arXiv Detail & Related papers (2024-12-20T08:47:07Z)
Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters. We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
The Impossibility of Fair LLMs [17.812295963158714]
We analyze a variety of technical fairness frameworks and find inherent challenges in each that make the development of a fair language model intractable. We show that each framework either does not extend to the general-purpose AI context or is infeasible in practice. These inherent challenges would persist for general-purpose AI, including LLMs, even if empirical challenges, such as limited participatory input and limited measurement methods, were overcome.
arXiv Detail & Related papers (2024-05-28T04:36:15Z)
Guideline for Trustworthy Artificial Intelligence -- AI Assessment Catalog [0.0]
It is clear that AI and business models based on it can only reach their full potential if AI applications are developed according to high quality standards. The issue of the trustworthiness of AI applications is crucial and is the subject of numerous major publications. This AI assessment catalog addresses exactly this point and is intended for two target groups.
arXiv Detail & Related papers (2023-06-20T08:07:18Z)
Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness. We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks. Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach. Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes. We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.