Related papers: Applications and Challenges of Fairness APIs in Machine Learning Software

URL: http://arxiv.org/abs/2508.16377v1
Date: Fri, 22 Aug 2025 13:33:37 GMT
Title: Applications and Challenges of Fairness APIs in Machine Learning Software
Authors: Ajoy Das, Gias Uddin, Shaiful Chowdhury, Mostafijur Rahman Akhond, Hadi Hemmati
Abstract summary: Bias detection and mitigation open-source software libraries (aka API libraries) are being developed and used. In this paper, we conduct a qualitative study to understand in what scenarios these open-source fairness APIs are used in the wild. We analyzed 204 GitHub repositories which used 13 APIs that are developed to address bias in ML software.
Score: 3.3383488302533997
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine Learning software systems are frequently used in our day-to-day lives. Some of these systems are used in various sensitive environments to make life-changing decisions. Therefore, it is crucial to ensure that these AI/ML systems do not make any discriminatory decisions for any specific groups or populations. In that vein, different bias detection and mitigation open-source software libraries (aka API libraries) are being developed and used. In this paper, we conduct a qualitative study to understand in what scenarios these open-source fairness APIs are used in the wild, how they are used, and what challenges the developers of these APIs face while developing and adopting these libraries. We have analyzed 204 GitHub repositories (from a list of 1885 candidate repositories) which used 13 APIs that are developed to address bias in ML software. We found that these APIs are used for two primary purposes (i.e., learning and solving real-world problems), targeting 17 unique use-cases. Our study suggests that developers are not well-versed in bias detection and mitigation; they face lots of troubleshooting issues, and frequently ask for opinions and resources. Our findings can be instrumental for future bias-related software engineering research, and for guiding educators in developing more state-of-the-art curricula.

Related papers

Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS [52.483888557864326]
APIKG4SYN is a framework designed to exploit API knowledge graphs for the construction of API-oriented question-code pairs. We build the first benchmark for HarmonyOS code generation using APIKG4SYN.
arXiv Detail & Related papers (2025-11-29T08:13:54Z)
Towards Supporting Open Source Library Maintainers with Community-Based Analytics [1.4078020083560923]
We propose the use of community-based analytics to analyze how an OSS library is used across its dependent ecosystem. Our results reveal that while library developers offer a wide range of API methods, only 16% are actively used by their dependent ecosystem. We propose two metrics to help developers evaluate their test suite according to the APIs used by their community.
arXiv Detail & Related papers (2025-10-17T16:15:59Z)
Rethinking Technology Stack Selection with AI Coding Proficiency [49.617080246389605]
Large language models (LLMs) are now an integral part of software development. We propose the concept of AI coding proficiency: the degree to which LLMs can utilize a given technology to generate high-quality code snippets. We conduct the first comprehensive empirical study examining AI proficiency across 170 third-party libraries and 61 task scenarios.
arXiv Detail & Related papers (2025-09-14T06:56:47Z)
Identifying and Mitigating API Misuse in Large Language Models [26.4403427473915]
API misuse in code generated by large language models (LLMs) represents a serious emerging challenge in software development. This paper presents the first comprehensive study of API misuse patterns in LLM-generated code, analyzing both method selection and parameter usage across Python and Java. We propose Dr.Fix, a novel LLM-based automatic program repair approach for API misuse based on the aforementioned taxonomy.
arXiv Detail & Related papers (2025-03-28T18:43:12Z)
Your Fix Is My Exploit: Enabling Comprehensive DL Library API Fuzzing with Large Language Models [49.214291813478695]
Deep learning (DL) libraries, widely used in AI applications, often contain vulnerabilities like buffer overflows and use-after-free errors. Traditional fuzzing struggles with the complexity and API diversity of DL libraries. We propose DFUZZ, an LLM-driven fuzzing approach for DL libraries.
arXiv Detail & Related papers (2025-01-08T07:07:22Z)
ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solutions. Experimental results demonstrate that ExploraCoder significantly improves performance for models lacking prior API knowledge.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models [14.665460257371164]
Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation. We propose AutoAPIEval, a framework designed to evaluate the capabilities of LLMs in API-oriented code generation.
arXiv Detail & Related papers (2024-09-23T17:22:09Z)
A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development. Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
Retrieval-Augmented Test Generation: How Far Are We? [8.84734567720785]
Retrieval Augmented Generation (RAG) has shown notable advancements in software engineering tasks. To bridge this gap, we take the initiative to investigate the efficacy of RAG-based LLMs in test generation. Specifically, we examine RAG built upon three types of domain knowledge: 1) API documentation, 2) GitHub issues, and 3) StackOverflow Q&As.
arXiv Detail & Related papers (2024-09-19T11:48:29Z)
An Empirical Study of API Misuses of Data-Centric Libraries [9.667988837321943]
This paper contributes an empirical study of API misuses of five data-centric libraries that cover areas such as data processing, numerical computation, machine learning, and visualization. We identify misuses of these libraries by analyzing data from both Stack Overflow and GitHub.
arXiv Detail & Related papers (2024-08-28T15:15:52Z)
Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries. We propose a novel framework that emulates the process of programmers writing private code. We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)
Do Not Take It for Granted: Comparing Open-Source Libraries for Software Development Effort Estimation [9.224578642189023]
This paper aims at raising awareness of the differences incurred when using different Machine Learning (ML) libraries for software development effort estimation (SEE). We investigate 4 deterministic machine learners as provided by 3 of the most popular ML open-source libraries written in different languages (namely, Scikit-Learn, Caret and Weka). The results of our study reveal that the predictions provided by the 3 libraries differ in 95% of the cases on average across a total of 105 cases studied.
arXiv Detail & Related papers (2022-07-04T20:06:40Z)
Federated and continual learning for classification tasks in a society of devices [59.45414406974091]
Light Federated and Continual Consensus (LFedCon2) is a new federated and continual architecture that uses light, traditional learners. Our method allows powerless devices (such as smartphones or robots) to learn in real time, locally, continuously, autonomously and from users. In order to test our proposal, we have applied it in a heterogeneous community of smartphone users to solve the problem of walking recognition.
arXiv Detail & Related papers (2020-06-12T12:37:03Z)
Interpreting Cloud Computer Vision Pain-Points: A Mining Study of Stack Overflow [5.975695375814528]
This study investigates developers' frustrations with computer vision services. We find that, unlike mature fields like mobile development, there is a contrast in the types of questions asked by developers. These indicate a shallow understanding of the technology that empowers such systems.
arXiv Detail & Related papers (2020-01-28T00:56:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.