From Evidence to Decision: Exploring Evaluative AI
- URL: http://arxiv.org/abs/2402.01292v4
- Date: Wed, 27 Aug 2025 10:49:13 GMT
- Title: From Evidence to Decision: Exploring Evaluative AI
- Authors: Thao Le, Tim Miller, Liz Sonenberg, Ronal Singh, H. Peter Soyer
- Abstract summary: We propose an implementation of Evaluative AI by extending the Weight of Evidence framework. We demonstrate the application of the new decision-support approach in two domains: housing price prediction and skin cancer diagnosis.
- Score: 6.460500772980468
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a hypothesis-driven approach to improve AI-supported decision-making that is based on the Evaluative AI paradigm - a conceptual framework that proposes providing users with evidence for or against a given hypothesis. We propose an implementation of Evaluative AI by extending the Weight of Evidence framework, leading to hypothesis-driven models that support both tabular and image data. We demonstrate the application of the new decision-support approach in two domains: housing price prediction and skin cancer diagnosis. The findings show promising results in improving human decisions, as well as providing insights on the strengths and weaknesses of different decision-support approaches.
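The abstract names the Weight of Evidence framework without defining it. As a rough illustration only, and not the authors' implementation, the sketch below uses the classical definition of weight of evidence as a log-likelihood ratio, woe(h : e) = log(P(e | h) / P(e | not h)), where positive values count as evidence for the hypothesis and negative values as evidence against it. The feature names and probabilities are hypothetical, and summing the per-feature weights assumes the features are conditionally independent given the hypothesis.

```python
import math


def weight_of_evidence(p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Log-likelihood ratio: > 0 supports the hypothesis, < 0 opposes it."""
    return math.log(p_e_given_h / p_e_given_not_h)


def posterior_probability(prior_h: float, evidence: dict) -> float:
    """Combine prior log-odds with summed weights of evidence.

    Assumes conditional independence of the evidence items (naive Bayes).
    """
    log_odds = math.log(prior_h / (1.0 - prior_h))
    log_odds += sum(weight_of_evidence(p, q) for p, q in evidence.values())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical evidence: (P(e | h), P(e | not h)) for each observed feature.
evidence = {
    "asymmetry": (0.8, 0.2),       # more likely under the hypothesis
    "regular_border": (0.3, 0.6),  # less likely under the hypothesis
}

woe = {name: weight_of_evidence(p, q) for name, (p, q) in evidence.items()}
posterior = posterior_probability(prior_h=0.5, evidence=evidence)

for name, value in woe.items():
    side = "for" if value > 0 else "against"
    print(f"{name}: {value:+.3f} ({side} the hypothesis)")
print(f"posterior probability: {posterior:.3f}")
```

Showing each feature's signed weight, rather than only the final probability, mirrors the idea of presenting evidence for and against a hypothesis and leaving the decision to the user.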
Related papers
- 2-Step Agent: A Framework for the Interaction of a Decision Maker with AI Decision Support [0.254890465057467]
We introduce a general computational framework, the 2-Step Agent, which models the effects of AI-assisted decision making. Our results reveal several potential pitfalls of AI-driven decision support and highlight the need for thorough model documentation and proper user training.
arXiv Detail & Related papers (2026-02-25T13:11:12Z) - Towards a Cascaded LLM Framework for Cost-effective Human-AI Decision-Making [55.2480439325792]
We present a cascaded LLM decision framework that adaptively delegates tasks across multiple tiers of expertise. First, a deferral policy determines whether to accept the base model's answer or regenerate it with the large model. Second, an abstention policy decides whether the cascade model response is sufficiently certain or requires human intervention.
arXiv Detail & Related papers (2025-06-13T15:36:22Z) - When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration [79.69935257008467]
We introduce Knowledge Integration and Transfer Evaluation (KITE), a conceptual and experimental framework for Human-AI knowledge transfer capabilities. We conduct the first large-scale human study (N=118) explicitly designed to measure it. In our two-phase setup, humans first ideate with an AI on problem-solving strategies, then independently implement solutions, isolating model explanations' influence on human understanding.
arXiv Detail & Related papers (2025-06-05T20:48:16Z) - Supporting Data-Frame Dynamics in AI-assisted Decision Making [6.4219774981192455]
High-stakes decision-making requires continuous interplay between evolving evidence and shifting hypotheses.
We introduce a mixed-initiative framework for AI-assisted decision making that is grounded in the data-frame theory of sensemaking and the evaluative AI paradigm.
arXiv Detail & Related papers (2025-04-22T13:36:06Z) - Evaluating Prediction-based Interventions with Human Decision Makers In Mind [1.192656186481075]
We formalize and investigate various models of human decision-making in the presence of a predictive model aid. We show that each of these behavioural models produces dependencies across decision subjects and results in the violation of existing assumptions.
arXiv Detail & Related papers (2025-02-12T20:35:52Z) - An Empirical Examination of the Evaluative AI Framework [0.0]
This study empirically examines the "Evaluative AI" framework, which aims to enhance the decision-making process for AI users.
Rather than offering direct recommendations, this framework presents users with pro and con evidence for hypotheses to support more informed decisions.
arXiv Detail & Related papers (2024-11-13T13:03:49Z) - Explainable and Human-Grounded AI for Decision Support Systems: The Theory of Epistemic Quasi-Partnerships [0.0]
We argue that meeting the demands of ethical and explainable AI (XAI) is about developing AI-DSS to provide human decision-makers with three types of human-grounded explanations.
We demonstrate how current theories about what constitutes good human-grounded reasons either do not adequately explain this evidence or do not offer sound ethical advice for development.
arXiv Detail & Related papers (2024-09-23T09:14:25Z) - An evidence-based methodology for human rights impact assessment (HRIA) in the development of AI data-intensive systems [49.1574468325115]
We show that human rights already underpin the decisions in the field of data use.
This work presents a methodology and a model for a Human Rights Impact Assessment (HRIA).
The proposed methodology is tested in concrete case-studies to prove its feasibility and effectiveness.
arXiv Detail & Related papers (2024-07-30T16:27:52Z) - A Survey of Models for Cognitive Diagnosis: New Developments and Future Directions [66.40362209055023]
This paper aims to provide a survey of current models for cognitive diagnosis, with more attention on new developments using machine learning-based methods.
By comparing the model structures, parameter estimation algorithms, model evaluation methods and applications, we provide a relatively comprehensive review of the recent trends in cognitive diagnosis models.
arXiv Detail & Related papers (2024-07-07T18:02:00Z) - Interventional Imbalanced Multi-Modal Representation Learning via $β$-Generalization Front-Door Criterion [17.702549833449435]
Multi-modal methods establish comprehensive superiority over uni-modal methods.
However, imbalanced contributions of different modalities to task-dependent predictions constantly degrade the discriminative performance of canonical multi-modal methods.
Benchmark methods raise a tractable solution: augmenting the auxiliary modality with a minor contribution during training.
arXiv Detail & Related papers (2024-06-17T12:55:56Z) - Visual Evaluative AI: A Hypothesis-Driven Tool with Concept-Based Explanations and Weight of Evidence [6.144558727925986]
This paper presents Visual Evaluative AI, a decision aid that provides positive and negative evidence from image data for a given hypothesis. We apply and evaluate this tool in the skin cancer domain by building a web-based application that allows users to upload a dermatoscopic image.
arXiv Detail & Related papers (2024-05-13T12:09:01Z) - Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making [47.33241893184721]
In AI-assisted decision-making, humans often passively review AI's suggestion and decide whether to accept or reject it as a whole. We propose Human-AI Deliberation, a novel framework to promote human reflection and discussion on conflicting human-AI opinions in decision-making. Based on theories in human deliberation, this framework engages humans and AI in dimension-level opinion elicitation, deliberative discussion, and decision updates.
arXiv Detail & Related papers (2024-03-25T14:34:06Z) - Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning [72.80902932543474]
Understanding human behavior from observed data is critical for transparency and accountability in decision-making.
Consider real-world settings such as healthcare, in which modeling a decision-maker's policy is challenging.
We propose a data-driven representation of decision-making behavior that inheres transparency by design, accommodates partial observability, and operates completely offline.
arXiv Detail & Related papers (2023-10-28T13:06:14Z) - An Experimental Investigation into the Evaluation of Explainability Methods [60.54170260771932]
This work compares 14 different metrics when applied to nine state-of-the-art XAI methods and three dummy methods (e.g., random saliency maps) used as references.
Experimental results show which of these metrics produces highly correlated results, indicating potential redundancy.
arXiv Detail & Related papers (2023-05-25T08:07:07Z) - A System's Approach Taxonomy for User-Centred XAI: A Survey [0.6882042556551609]
We propose a unified, inclusive and user-centred taxonomy for XAI based on the principles of General System's Theory.
This provides a basis for evaluating the appropriateness of XAI approaches for all user types, including both developers and end users.
arXiv Detail & Related papers (2023-03-06T00:50:23Z) - Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation [54.72195809248172]
We present a new estimator leveraging our proposed novel concept that involves retrospective reshuffling of participants across experimental arms at the end of an RCT.
We prove theoretically that such an estimator is more accurate than common estimators based on sample means.
arXiv Detail & Related papers (2023-02-06T05:17:22Z) - Excess risk analysis for epistemic uncertainty with application to variational inference [110.4676591819618]
We present a novel EU analysis in the frequentist setting, where data is generated from an unknown distribution.
We show a relation between the generalization ability and the widely used EU measurements, such as the variance and entropy of the predictive distribution.
We propose new variational inference that directly controls the prediction and EU evaluation performances based on the PAC-Bayesian theory.
arXiv Detail & Related papers (2022-06-02T12:12:24Z) - On the Relationship Between Explanations, Fairness Perceptions, and Decisions [2.5372245630249632]
It is known that recommendations of AI-based systems can be incorrect or unfair.
It is often proposed that a human be the final decision-maker.
Prior work has argued that explanations are an essential pathway to help human decision-makers enhance decision quality.
arXiv Detail & Related papers (2022-04-27T19:33:36Z) - Reinforcement Learning with Heterogeneous Data: Estimation and Inference [84.72174994749305]
We introduce the K-Heterogeneous Markov Decision Process (K-Hetero MDP) to address sequential decision problems with population heterogeneity.
We propose the Auto-Clustered Policy Evaluation (ACPE) for estimating the value of a given policy, and the Auto-Clustered Policy Iteration (ACPI) for estimating the optimal policy in a given policy class.
We present simulations to support our theoretical findings, and we conduct an empirical study on the standard MIMIC-III dataset.
arXiv Detail & Related papers (2022-01-31T20:58:47Z) - Learning the Truth From Only One Side of the Story [58.65439277460011]
We focus on generalized linear models and show that without adjusting for this sampling bias, the model may converge suboptimally or even fail to converge to the optimal solution.
We propose an adaptive approach that comes with theoretical guarantees and show that it outperforms several existing methods empirically.
arXiv Detail & Related papers (2020-06-08T18:20:28Z)