Related papers: Assessing AI Utility: The Random Guesser Test for Sequential Decision-Making Systems

Assessing AI Utility: The Random Guesser Test for Sequential Decision-Making Systems

URL: http://arxiv.org/abs/2407.20276v2
Date: Sun, 11 Aug 2024 13:56:58 GMT
Title: Assessing AI Utility: The Random Guesser Test for Sequential Decision-Making Systems
Authors: Shun Ide, Allison Blunt, Djallel Bouneffouf,
Abstract summary: We propose a general approach to assessing the risk and vulnerability of artificial intelligence (AI) systems to biased decisions. The guiding principle of the proposed approach is that any AI algorithm must outperform a random guesser. We highlight that modern recommender systems may exhibit a similar tendency to favor overly low-risk options.
Score: 5.62395683551121
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a general approach to quantitatively assessing the risk and vulnerability of artificial intelligence (AI) systems to biased decisions. The guiding principle of the proposed approach is that any AI algorithm must outperform a random guesser. This may appear trivial, but empirical results from a simplistic sequential decision-making scenario involving roulette games show that sophisticated AI-based approaches often underperform the random guesser by a significant margin. We highlight that modern recommender systems may exhibit a similar tendency to favor overly low-risk options. We argue that this "random guesser test" can serve as a useful tool for evaluating the utility of AI actions, and also points towards increasing exploration as a potential improvement to such systems.

Related papers

Can you trust your explanations? A robustness test for feature attribution methods [42.36530107262305]
The field of Explainable AI (XAI) has seen a rapid growth but the usage of its techniques has at times led to unexpected results. We will show how leveraging manifold hypothesis and ensemble approaches can be beneficial to an in-depth analysis of the robustness.
arXiv Detail & Related papers (2024-06-20T14:17:57Z)
Does AI help humans make better decisions? A statistical evaluation framework for experimental and observational studies [0.43981305860983716]
We show how to compare the performance of three alternative decision-making systems--human-alone, human-with-AI, and AI-alone. We find that the risk assessment recommendations do not improve the classification accuracy of a judge's decision to impose cash bail.
arXiv Detail & Related papers (2024-03-18T01:04:52Z)
Beyond Expectations: Learning with Stochastic Dominance Made Practical [88.06211893690964]
dominance models risk-averse preferences for decision making with uncertain outcomes. Despite theoretically appealing, the application of dominance in machine learning has been scarce. We first generalize the dominance concept to enable feasible comparisons between any arbitrary pair of random variables. We then develop a simple and efficient approach for finding the optimal solution in terms of dominance.
arXiv Detail & Related papers (2024-02-05T03:21:23Z)
Risk-Sensitive Stochastic Optimal Control as Rao-Blackwellized Markovian Score Climbing [3.9410617513331863]
optimal control of dynamical systems is a crucial challenge in sequential decision-making. Control-as-inference approaches have had considerable success, providing a viable risk-sensitive framework to address the exploration-exploitation dilemma. This paper introduces a novel perspective by framing risk-sensitive control as Markovian reinforcement score climbing under samples drawn from a conditional particle filter.
arXiv Detail & Related papers (2023-12-21T16:34:03Z)
Measuring Bias in AI Models: An Statistical Approach Introducing N-Sigma [19.072543709069087]
We analyze statistical approaches to measure biases in automatic decision-making systems. We propose a novel way to measure the biases in machine learning models using a statistical approach based on the N-Sigma method.
arXiv Detail & Related papers (2023-04-26T16:49:25Z)
Model Predictive Control with Gaussian-Process-Supported Dynamical Constraints for Autonomous Vehicles [82.65261980827594]
We propose a model predictive control approach for autonomous vehicles that exploits learned Gaussian processes for predicting human driving behavior. A multi-mode predictive control approach considers the possible intentions of the human drivers.
arXiv Detail & Related papers (2023-03-08T17:14:57Z)
Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of an automatic algorithms that provide personalized ranking by adapting to the current conditions. For the former, we propose novel algorithm called SAROS that take into account both kinds of feedback for learning over the sequence of interactions. The proposed idea of taking into account the neighbour lines shows statistically significant results in comparison with the initial approach for faults detection in power grid.
arXiv Detail & Related papers (2022-05-13T21:09:41Z)
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty aware classifier can solve challenging reinforcement learning problems. We propose a novel method for computing the normalized maximum likelihood (NML) distribution. We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
Policy Gradient Bayesian Robust Optimization for Imitation Learning [49.881386773269746]
We derive a novel policy gradient-style robust optimization approach, PG-BROIL, to balance expected performance and risk. Results suggest PG-BROIL can produce a family of behaviors ranging from risk-neutral to risk-averse.
arXiv Detail & Related papers (2021-06-11T16:49:15Z)
Efficient falsification approach for autonomous vehicle validation using a parameter optimisation technique based on reinforcement learning [6.198523595657983]
The widescale deployment of Autonomous Vehicles (AV) appears to be imminent despite many safety challenges that are yet to be resolved. The uncertainties in the behaviour of the traffic participants and the dynamic world cause reactions in advanced autonomous systems. This paper presents an efficient falsification method to evaluate the System Under Test.
arXiv Detail & Related papers (2020-11-16T02:56:13Z)
Multimodal Safety-Critical Scenarios Generation for Decision-Making Algorithms Evaluation [23.43175124406634]
Existing neural network-based autonomous systems are shown to be vulnerable against adversarial attacks. We propose a flow-based multimodal safety-critical scenario generator for evaluating decisionmaking algorithms. We evaluate six Reinforcement Learning algorithms with our generated traffic scenarios and provide empirical conclusions about their robustness.
arXiv Detail & Related papers (2020-09-16T15:16:43Z)
A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores [85.12096045419686]
We study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions. We first show that humans do alter their behavior when the tool is deployed. We show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk.
arXiv Detail & Related papers (2020-02-19T07:27:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.