Measuring Bias in AI Models: A Statistical Approach Introducing N-Sigma
- URL: http://arxiv.org/abs/2304.13680v2
- Date: Wed, 24 May 2023 10:48:56 GMT
- Title: Measuring Bias in AI Models: A Statistical Approach Introducing N-Sigma
- Authors: Daniel DeAlcala, Ignacio Serna, Aythami Morales, Julian Fierrez,
Javier Ortega-Garcia
- Abstract summary: We analyze statistical approaches to measure biases in automatic decision-making systems.
We propose a novel way to measure the biases in machine learning models using a statistical approach based on the N-Sigma method.
- Score: 19.072543709069087
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The new regulatory framework proposal on Artificial Intelligence (AI)
published by the European Commission establishes a new risk-based legal
approach. The proposal highlights the need to develop adequate risk assessments
for the different uses of AI. This risk assessment should address, among
others, the detection and mitigation of bias in AI. In this work we analyze
statistical approaches to measure biases in automatic decision-making systems.
We focus our experiments in face recognition technologies. We propose a novel
way to measure the biases in machine learning models using a statistical
approach based on the N-Sigma method. We focus our experiments on face
recognition technologies. N-Sigma is a popular statistical approach used to
validate hypotheses in sciences such as physics and the social sciences, but
its application to machine learning is as yet unexplored. In this work we
study how to apply this methodology to develop new risk assessment frameworks
based on bias analysis and we discuss the main advantages and drawbacks with
respect to other popular statistical tests.
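To make the idea concrete, here is a minimal sketch of an N-Sigma-style bias measure. It assumes per-sample accuracy arrays for two demographic groups and uses a two-sample z-statistic; this is an illustration under those assumptions, not the paper's exact protocol:

```python
import numpy as np

def n_sigma(scores_a: np.ndarray, scores_b: np.ndarray) -> float:
    """Gap between two groups' mean performance, expressed in standard errors.

    Values above the usual 2-3 sigma thresholds suggest the performance gap
    between groups is unlikely to be a sampling artifact.
    """
    # Standard error of the difference of means (unequal variances)
    se = np.sqrt(scores_a.var(ddof=1) / len(scores_a)
                 + scores_b.var(ddof=1) / len(scores_b))
    return abs(scores_a.mean() - scores_b.mean()) / se

# Hypothetical per-sample accuracies (1 = correct match) for two
# demographic groups in a face recognition evaluation
rng = np.random.default_rng(0)
group_a = rng.binomial(1, 0.97, size=5000)
group_b = rng.binomial(1, 0.94, size=5000)
print(f"bias gap: {n_sigma(group_a, group_b):.1f} sigma")
```

With these toy numbers the gap lands well above a 3-sigma threshold, i.e. the measured difference between groups would be flagged as a statistically meaningful bias.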
Related papers
- Adapting Probabilistic Risk Assessment for AI [0.0]
General-purpose artificial intelligence (AI) systems present an urgent risk management challenge.
Current methods often rely on selective testing and undocumented assumptions about risk priorities.
This paper introduces the probabilistic risk assessment (PRA) for AI framework.
arXiv Detail & Related papers (2025-04-25T17:59:14Z)
- Mapping AI Benchmark Data to Quantitative Risk Estimates Through Expert Elicitation [0.7889270818022226]
We show how existing AI benchmarks can be used to facilitate the creation of risk estimates.
We describe the results of a pilot study in which experts use information from Cybench, an AI benchmark, to generate probability estimates.
arXiv Detail & Related papers (2025-03-06T10:39:47Z)
- Statistical Scenario Modelling and Lookalike Distributions for Multi-Variate AI Risk [0.6526824510982799]
We show how scenario modelling can be used to model AI risk holistically.
We show how lookalike distributions from phenomena analogous to AI can be used to estimate AI impacts in the absence of directly observable data.
arXiv Detail & Related papers (2025-02-20T12:14:54Z)
- Socio-Economic Consequences of Generative AI: A Review of Methodological Approaches [0.0]
We identify the primary methodologies that may be used to help predict the economic and social impacts of generative AI adoption.
Through a comprehensive literature review, we uncover a range of methodologies poised to assess the multifaceted impacts of this technological revolution.
arXiv Detail & Related papers (2024-11-14T09:40:25Z)
- EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models serving as the "brain" of EAI agents have shown promising results in high-level task planning.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- An evidence-based methodology for human rights impact assessment (HRIA) in the development of AI data-intensive systems [49.1574468325115]
We show that human rights already underpin the decisions in the field of data use.
This work presents a methodology and a model for a Human Rights Impact Assessment (HRIA)
The proposed methodology is tested in concrete case studies to demonstrate its feasibility and effectiveness.
arXiv Detail & Related papers (2024-07-30T16:27:52Z)
- Assessing AI Utility: The Random Guesser Test for Sequential Decision-Making Systems [5.62395683551121]
We propose a general approach to assessing the risk and vulnerability of artificial intelligence (AI) systems to biased decisions.
The guiding principle of the proposed approach is that any AI algorithm must outperform a random guesser.
We highlight that modern recommender systems may exhibit a similar tendency to favor overly low-risk options.
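As a rough, hedged illustration of that principle (the reward table and function names here are hypothetical, not the paper's benchmark), the sketch below estimates the mean reward of a uniformly random policy, which is the floor any useful sequential decision-maker should exceed:

```python
import numpy as np

def random_guesser_reward(reward_table: np.ndarray, trials: int = 100_000,
                          seed: int = 0) -> float:
    """Mean reward of a policy that picks actions uniformly at random.

    reward_table[s, a] holds the (hypothetical) reward of action a in state s.
    """
    rng = np.random.default_rng(seed)
    states = rng.integers(reward_table.shape[0], size=trials)
    actions = rng.integers(reward_table.shape[1], size=trials)
    return reward_table[states, actions].mean()

# A system demonstrates utility only if its mean reward clears this bar:
#   mean_reward(agent) > random_guesser_reward(reward_table)
rewards = np.array([[1.0, 0.0, 0.2],
                    [0.1, 0.9, 0.3]])
print(random_guesser_reward(rewards))  # approx. the table's overall mean
```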
arXiv Detail & Related papers (2024-07-25T13:44:22Z)
- Evaluating the Effectiveness of Index-Based Treatment Allocation [42.040099398176665]
When resources are scarce, an allocation policy is needed to decide who receives a resource.
This paper introduces methods to evaluate index-based allocation policies using data from a randomized control trial.
arXiv Detail & Related papers (2024-02-19T01:55:55Z)
- AI in Supply Chain Risk Assessment: A Systematic Literature Review and Bibliometric Analysis [0.0]
This study examines 1,903 articles from Google Scholar and Web of Science, with 54 studies selected through PRISMA guidelines.
Our findings reveal that ML models, including Random Forest, XGBoost, and hybrid approaches, significantly enhance risk prediction accuracy and adaptability in post-pandemic contexts.
The study underscores the necessity of dynamic strategies, interdisciplinary collaboration, and continuous model evaluation to address challenges such as data quality and interpretability.
arXiv Detail & Related papers (2023-12-12T17:47:51Z)
- Unmasking Bias in AI: A Systematic Review of Bias Detection and Mitigation Strategies in Electronic Health Record-based Models [6.300835344100545]
Leveraging artificial intelligence in conjunction with electronic health records holds transformative potential to improve healthcare.
Yet, addressing bias in AI, which risks worsening healthcare disparities, cannot be overlooked.
This study reviews methods to detect and mitigate diverse forms of bias in AI models developed using EHR data.
arXiv Detail & Related papers (2023-10-30T18:29:15Z)
- Distribution-free risk assessment of regression-based machine learning algorithms [6.507711025292814]
We focus on regression algorithms and the risk-assessment task of computing the probability of the true label lying inside an interval defined around the model's prediction.
We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability.
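A minimal split-conformal sketch of that guarantee, assuming absolute residuals on a held-out calibration set (the exact variant used in the paper may differ):

```python
import numpy as np

def conformal_interval(cal_residuals: np.ndarray, y_pred: float,
                       alpha: float = 0.1) -> tuple[float, float]:
    """Split conformal prediction interval around a point prediction.

    cal_residuals: |y - y_hat| on a calibration set the model never saw.
    Under exchangeability, the returned interval contains the true label
    with probability at least 1 - alpha (marginally).
    """
    n = len(cal_residuals)
    # Finite-sample-corrected quantile of the calibration residuals
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(cal_residuals, level, method="higher")
    return y_pred - q, y_pred + q

# e.g. 90% interval around a single prediction y_hat (names hypothetical):
#   lo, hi = conformal_interval(np.abs(y_cal - model.predict(X_cal)), y_hat)
```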
arXiv Detail & Related papers (2023-10-05T13:57:24Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Deep Feature Statistics Mapping for Generalized Screen Content Image Quality Assessment [60.88265569998563]
We make the first attempt to learn the statistics of screen content images (SCIs), based upon which the quality of SCIs can be effectively determined.
We empirically show that the statistics deviation could be effectively leveraged in quality assessment.
arXiv Detail & Related papers (2022-09-12T15:26:13Z)
- Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning [90.59143158534849]
The recent emergence of reinforcement learning has created a demand for robust statistical inference methods.
Existing methods for statistical inference in online learning are restricted to settings involving independently sampled observations.
The online bootstrap is a flexible and efficient approach for statistical inference in linear approximation algorithms, but its efficacy in settings involving Markov noise has yet to be explored.
arXiv Detail & Related papers (2021-08-08T18:26:35Z)
- Multi Agent System for Machine Learning Under Uncertainty in Cyber Physical Manufacturing System [78.60415450507706]
Recent advancements in predictive machine learning have led to its application in various use cases in manufacturing.
Most research has focused on maximising predictive accuracy without addressing the uncertainty associated with it.
In this paper, we determine the sources of uncertainty in machine learning and establish the success criteria of a machine learning system to function well under uncertainty.
arXiv Detail & Related papers (2021-07-28T10:28:05Z)
- SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)