Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
- URL: http://arxiv.org/abs/2403.16527v2
- Date: Tue, 11 Feb 2025 17:40:41 GMT
- Title: Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
- Authors: Neeloy Chakraborty, Melkior Ornik, Katherine Driggs-Campbell
- Abstract summary: We discuss the current use cases of foundation models for decision-making tasks.
We argue there is a need to step back and simultaneously design systems that can quantify the certainty of a model's decision.
- Score: 7.072820266877787
- License:
- Abstract: Autonomous systems are soon to be ubiquitous, spanning manufacturing, agriculture, healthcare, entertainment, and other industries. Most of these systems are developed with modular sub-components for decision-making, planning, and control that may be hand-engineered or learning-based. While these approaches perform well under the situations they were specifically designed for, they can perform especially poorly in out-of-distribution scenarios that will undoubtedly arise at test-time. The rise of foundation models trained on multiple tasks with impressively large datasets has led researchers to believe that these models may provide "common sense" reasoning that existing planners are missing, bridging the gap between algorithm development and deployment. While researchers have shown promising results in deploying foundation models to decision-making tasks, these models are known to hallucinate and generate decisions that may sound reasonable, but are in fact poor. We argue there is a need to step back and simultaneously design systems that can quantify the certainty of a model's decision, and detect when it may be hallucinating. In this work, we discuss the current use cases of foundation models for decision-making tasks, provide a general definition for hallucinations with examples, discuss existing approaches to hallucination detection and mitigation with a focus on decision problems, present guidelines, and explore areas for further research in this exciting field.
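The paper's core argument is that decision-making pipelines should expose a measure of the model's confidence alongside its output so that possible hallucinations can be flagged. As a purely illustrative sketch (not a method from the paper), one simple self-consistency heuristic is to sample the model several times and treat low agreement among the samples as a hallucination warning; `query_model` below is a hypothetical stand-in for any foundation-model call.

```python
# Illustrative sketch only: sample-agreement as a rough confidence signal.
# `query_model` is a hypothetical callable wrapping any foundation-model API.
from collections import Counter
from typing import Callable, List, Tuple


def agreement_score(prompt: str,
                    query_model: Callable[[str], str],
                    n_samples: int = 10) -> Tuple[str, float]:
    """Sample the model n_samples times; return (majority answer, agreement rate)."""
    answers: List[str] = [query_model(prompt) for _ in range(n_samples)]
    majority, count = Counter(answers).most_common(1)[0]
    return majority, count / n_samples


def flag_possible_hallucination(prompt: str,
                                query_model: Callable[[str], str],
                                threshold: float = 0.6) -> bool:
    """Flag the decision when sampled answers agree less often than the threshold."""
    _, agreement = agreement_score(prompt, query_model)
    return agreement < threshold
```

In a deployed system, a low agreement score could trigger a fallback controller or a request for human input rather than executing the model's decision directly; the threshold here is an assumed tuning parameter, not a value from the paper.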
Related papers
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage.
Models may behave unreliably due to poorly explored failure modes.
Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Inverse decision-making using neural amortized Bayesian actors [19.128377007314317]
We amortize the Bayesian actor using a neural network trained on a wide range of parameter settings in an unsupervised fashion.
We show how our method allows for principled model comparison and how it can be used to disentangle factors that may lead to unidentifiabilities between priors and costs.
arXiv Detail & Related papers (2024-09-04T10:31:35Z)
- Explaining Relation Classification Models with Semantic Extents [1.7604348079019634]
A lack of explainability is currently a complicating factor in many real-world applications.
We introduce semantic extents, a concept to analyze decision patterns for the relation classification task.
We provide an annotation tool and a software framework to determine semantic extents for humans and models.
arXiv Detail & Related papers (2023-08-04T08:17:52Z)
- Foundation Models for Decision Making: Problems, Methods, and Opportunities [124.79381732197649]
Foundation models pretrained on diverse data at scale have demonstrated extraordinary capabilities in a wide range of vision and language tasks.
New paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning.
Research at the intersection of foundation models and decision making holds tremendous promise for creating powerful new systems.
arXiv Detail & Related papers (2023-03-07T18:44:07Z)
- Gradient Optimization for Single-State RMDPs [0.0]
Modern problems such as autonomous driving, control of robotic components, and medical diagnostics have become increasingly difficult to solve analytically.
Data-driven solutions are a strong option for problems with more dimensions of complexity than people can readily grasp.
Unfortunately, data-driven models often come with uncertainty in how they will perform in the worst of scenarios.
In fields such as autonomous driving and medicine, the consequences of these failures could be catastrophic.
arXiv Detail & Related papers (2022-09-25T18:50:02Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Learning-Driven Decision Mechanisms in Physical Layer: Facts, Challenges, and Remedies [23.446736654473753]
This paper introduces the common assumptions in the physical layer to highlight their discrepancies with practical systems.
As a solution, learning algorithms are examined by considering implementation steps and challenges.
arXiv Detail & Related papers (2021-02-14T22:26:44Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)