Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
- URL: http://arxiv.org/abs/2403.16527v1
- Date: Mon, 25 Mar 2024 08:11:02 GMT
- Title: Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
- Authors: Neeloy Chakraborty, Melkior Ornik, Katherine Driggs-Campbell
- Abstract summary: We discuss the current use cases of foundation models for decision-making tasks.
We argue there is a need to step back and simultaneously design systems that can quantify the certainty of a model's decision.
- Score: 7.072820266877787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous systems are soon to be ubiquitous, from manufacturing autonomy to agricultural field robots, and from health care assistants to the entertainment industry. The majority of these systems are developed with modular sub-components for decision-making, planning, and control that may be hand-engineered or learning-based. While these existing approaches have been shown to perform well under the situations they were specifically designed for, they can perform especially poorly in rare, out-of-distribution scenarios that will undoubtedly arise at test-time. The rise of foundation models trained on multiple tasks with impressively large datasets from a variety of fields has led researchers to believe that these models may provide common sense reasoning that existing planners are missing. Researchers posit that this common sense reasoning will bridge the gap between algorithm development and deployment to out-of-distribution tasks, like how humans adapt to unexpected scenarios. Large language models have already penetrated the robotics and autonomous systems domains as researchers are scrambling to showcase their potential use cases in deployment. While this application direction is very promising empirically, foundation models are known to hallucinate and generate decisions that may sound reasonable, but are in fact poor. We argue there is a need to step back and simultaneously design systems that can quantify the certainty of a model's decision, and detect when it may be hallucinating. In this work, we discuss the current use cases of foundation models for decision-making tasks, provide a general definition for hallucinations with examples, discuss existing approaches to hallucination detection and mitigation with a focus on decision problems, and explore areas for further research in this exciting field.
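One common detection strategy in this literature is to sample the model several times and treat disagreement among the samples as a warning sign. Below is a minimal sketch of that self-consistency idea; `query_model` is a hypothetical stand-in for any stochastic LLM sampling call, and the 0.8 threshold is an arbitrary illustration, not a value from the paper.
```python
import random
from collections import Counter

def query_model(prompt: str) -> str:
    """Hypothetical stochastic planner; replace with a real LLM call."""
    return random.choice(["pick up the cup", "pick up the cup", "open the fridge"])

def decision_confidence(prompt: str, n_samples: int = 20):
    """Sample the model repeatedly and use answer agreement
    (self-consistency) as a crude confidence score."""
    votes = Counter(query_model(prompt) for _ in range(n_samples))
    answer, count = votes.most_common(1)[0]
    return answer, count / n_samples

answer, conf = decision_confidence("The user asked for coffee. Next action?")
if conf < 0.8:  # arbitrary threshold for illustration; tune per task
    print(f"Low agreement ({conf:.2f}): flag '{answer}' as a possible hallucination.")
else:
    print(f"Agreement {conf:.2f}: proceed with '{answer}'.")
```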
Related papers
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Inverse decision-making using neural amortized Bayesian actors [19.128377007314317]
We amortize the Bayesian actor using a neural network trained on a wide range of different parameter settings in an unsupervised fashion.
We show that the inferred posterior distributions are in close alignment with those obtained using analytical solutions where they exist.
We then show that identifiability problems between priors and costs can arise with more complex cost functions.
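As a rough illustration of amortization (not the paper's actual training setup, which is unsupervised), the sketch below fits a small network to reproduce the Bayes-optimal action of a Gaussian actor across many parameter settings, so that later inference only needs cheap forward passes; all sizes and distributions here are invented.
```python
import torch

torch.manual_seed(0)

def optimal_action(stimulus, prior_mean, prior_sd, noise_sd=1.0):
    # For a Gaussian prior/likelihood and squared-error cost, the optimal
    # action is the posterior mean (a known closed form used as supervision).
    w = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    return w * stimulus + (1 - w) * prior_mean

net = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.ReLU(),
                          torch.nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(1000):  # train across a wide range of parameter settings
    stim = torch.randn(256, 1) * 3
    mu = torch.randn(256, 1) * 2
    sd = torch.rand(256, 1) * 2 + 0.1
    x = torch.cat([stim, mu, sd], dim=1)
    loss = ((net(x) - optimal_action(stim, mu, sd)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("amortized vs analytic:",
      net(torch.tensor([[1.0, 0.0, 1.0]])).item(),
      optimal_action(1.0, 0.0, 1.0))
```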
arXiv Detail & Related papers (2024-09-04T10:31:35Z)
- Foundation Models for Autonomous Robots in Unstructured Environments [15.517532442044962]
The study systematically reviews applications of foundation models in two fields: robotics and unstructured environments.
Findings show that the linguistic capabilities of LLMs have been utilized more than their other features to improve perception in human-robot interaction.
LLMs saw more applications in project management and safety within construction, and in natural hazard detection within disaster management.
arXiv Detail & Related papers (2024-07-19T13:26:52Z)
- A Reliable Framework for Human-in-the-Loop Anomaly Detection in Time Series [17.08674819906415]
We introduce HILAD, a novel framework designed to foster a dynamic and bidirectional collaboration between humans and AI.
Through our visual interface, HILAD empowers domain experts to detect, interpret, and correct unexpected model behaviors at scale.
arXiv Detail & Related papers (2024-05-06T07:44:07Z)
- Multi-Agent Verification and Control with Probabilistic Model Checking [4.56877715768796]
Probabilistic model checking is a technique for formal automated reasoning about software or hardware systems.
It builds upon ideas and techniques from a diverse range of fields, from logic, automata and graph theory, to optimisation, numerical methods and control.
In recent years, probabilistic model checking has also been extended to integrate ideas from game theory.
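At its core, probabilistic model checking computes quantities such as the maximum probability of reaching a target state. The toy sketch below runs value iteration on a hand-made MDP to illustrate that core routine; real checkers such as PRISM or Storm operate on dedicated modeling languages, and the transition structure here is invented.
```python
# mdp[state][action] = list of (next_state, probability)
mdp = {
    "s0": {"a": [("s1", 0.9), ("bad", 0.1)], "b": [("s0", 1.0)]},
    "s1": {"a": [("goal", 0.8), ("bad", 0.2)]},
    "goal": {}, "bad": {},
}
target = {"goal"}

p = {s: (1.0 if s in target else 0.0) for s in mdp}
for _ in range(100):  # iterate to (approximate) convergence
    p = {
        s: (1.0 if s in target else max(
            (sum(pr * p[t] for t, pr in succ) for succ in mdp[s].values()),
            default=0.0))
        for s in mdp
    }
print(p)  # p["s0"] = 0.72: max probability of eventually reaching "goal"
```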
arXiv Detail & Related papers (2023-08-05T09:31:32Z)
- Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners [85.03486419424647]
KnowNo is a framework for measuring and aligning the uncertainty of large language models.
KnowNo builds on the theory of conformal prediction to provide statistical guarantees on task completion.
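The conformal-prediction step behind this style of planner can be sketched compactly: calibrate a score threshold on held-out data, then ask for help whenever more than one candidate action survives the threshold. The probabilities and calibration data below are synthetic stand-ins, not KnowNo's actual scores.
```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1  # target: prediction sets contain the true option >= 90% of the time

# Calibration: nonconformity = 1 - model probability of the true option
cal_true_probs = rng.beta(2, 2, size=500)  # synthetic calibration data
scores = 1.0 - cal_true_probs
n = len(scores)
qhat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Test time: keep every candidate action whose nonconformity is below qhat
option_probs = {"pick up the red cup": 0.55, "pick up the blue cup": 0.40,
                "do nothing": 0.05}
prediction_set = [a for a, p in option_probs.items() if 1.0 - p <= qhat]

if len(prediction_set) == 1:
    print("Confident; execute:", prediction_set[0])
else:
    print("Uncertain; ask the human to choose among:", prediction_set)
```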
arXiv Detail & Related papers (2023-07-04T21:25:12Z)
- Foundation Models for Decision Making: Problems, Methods, and Opportunities [124.79381732197649]
Foundation models pretrained on diverse data at scale have demonstrated extraordinary capabilities in a wide range of vision and language tasks.
New paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning.
Research at the intersection of foundation models and decision making holds tremendous promise for creating powerful new systems.
arXiv Detail & Related papers (2023-03-07T18:44:07Z)
- Gradient Optimization for Single-State RMDPs [0.0]
Modern problems such as autonomous driving, control of robotic components, and medical diagnostics have become increasingly difficult to solve analytically.
Data-driven solutions are a strong option for problems with more dimensions of complexity than people can reason about.
Unfortunately, data-driven models often come with uncertainty about how they will perform in worst-case scenarios.
In fields such as autonomous driving and medicine, the consequences of these failures could be catastrophic.
arXiv Detail & Related papers (2022-09-25T18:50:02Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
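A concrete instance of such informed training-set design is to sample trajectories that already lie on the attractor, by discarding an initial transient rather than starting from arbitrary states. The sketch below does this for the Lorenz system; the system, step size, and burn-in length are illustrative choices, not the paper's experimental setup.
```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # One explicit-Euler step of the Lorenz equations.
    x, y, z = state
    return state + dt * np.array([sigma * (y - x),
                                  x * (rho - z) - y,
                                  x * y - beta * z])

def trajectory(x0, n_steps, burn_in=0):
    """Integrate the system, keeping only post-burn-in samples."""
    s = np.asarray(x0, dtype=float)
    out = []
    for i in range(n_steps + burn_in):
        s = lorenz_step(s)
        if i >= burn_in:
            out.append(s.copy())
    return np.array(out)

naive_data = trajectory([20.0, 20.0, 20.0], 1000)                   # includes transient
informed_data = trajectory([20.0, 20.0, 20.0], 1000, burn_in=2000)  # on-attractor
print(naive_data.shape, informed_data.shape)  # (1000, 3) (1000, 3)
```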
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning [68.74447489372037]
We present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning.
A core component of our work is to introduce agency, such that it is simple to define and create complex scenarios.
We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment.
arXiv Detail & Related papers (2021-06-25T00:21:41Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
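The core idea can be sketched as a latent-space search: optimize a GAN latent code until the audited classifier changes its label, which keeps the counterfactual on the generator's manifold of realistic samples. The toy generator, classifier, and single-objective loss below are stand-ins; the paper itself uses multi-objective optimization on real face models.
```python
import torch

torch.manual_seed(0)
G = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.Tanh(),
                        torch.nn.Linear(32, 16))   # toy generator
clf = torch.nn.Sequential(torch.nn.Linear(16, 2))  # audited classifier

z = torch.randn(1, 8, requires_grad=True)
target = torch.tensor([1])  # desired counterfactual class
opt = torch.optim.Adam([z], lr=0.05)

for step in range(200):
    opt.zero_grad()
    x = G(z)  # candidate stays on the generator's manifold
    loss = torch.nn.functional.cross_entropy(clf(x), target)
    loss.backward()
    opt.step()

print("counterfactual class:", clf(G(z)).argmax(dim=1).item())
```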
arXiv Detail & Related papers (2020-03-25T11:08:56Z)