Teaming LLMs to Detect and Mitigate Hallucinations
- URL: http://arxiv.org/abs/2510.19507v2
- Date: Thu, 23 Oct 2025 15:12:10 GMT
- Title: Teaming LLMs to Detect and Mitigate Hallucinations
- Authors: Demian Till, John Smeaton, Peter Haubrick, Gouse Saheb, Florian Graef, David Berman,
- Abstract summary: We show that extending single-model consistency methods can result in substantial further improvements in hallucination detection and mitigation capabilities.<n>We evaluate this "consortium consistency" approach across many model teams from a pool of 15 model teams.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has demonstrated state-of-the-art results in large language model (LLM) hallucination detection and mitigation through consistency-based approaches which involve aggregating multiple responses sampled from a single LLM for a given prompt. These approaches help offset limitations stemming from the imperfect data on which LLMs are trained, which includes biases and under-representation of information required at deployment time among other limitations which can lead to hallucinations. We show that extending these single-model consistency methods to combine responses from multiple LLMs with different training data, training schemes and model architectures can result in substantial further improvements in hallucination detection and mitigation capabilities beyond their single-model consistency counterparts. We evaluate this "consortium consistency" approach across many model teams from a pool of 15 LLMs and explore under what conditions it is beneficial to team together different LLMs in this manner. Further, we show that these performance improvements often come with reduced inference costs, offsetting a significant drawback with single-model consistency methods.
Related papers
- A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models [85.30893355216486]
We study how visual token redundancy evolves with different dMLLM architectures and tasks.<n>Our study reveals that visual redundancy emerges only in from-scratch dMLLMs while handling long-answer tasks.<n>Layer-skipping is promising for accelerating AR-to-diffusion dMLLMs, whereas progressive or late-step pruning is more effective for from-scratch dMLLMs.
arXiv Detail & Related papers (2025-11-19T04:13:36Z) - Reasoning with Preference Constraints: A Benchmark for Language Models in Many-to-One Matching Markets [13.111181135818184]
Large language models (LLMs) have shown strong performance on complex mathematical tasks, including optimization.<n>Applying LLMs to matching problems, which require reasoning under preferential and structural constraints, remains underexplored.<n>We employ a novel benchmark of 369 instances of the College Admission Problem to evaluate LLMs across key dimensions: feasibility, stability, and optimality.
arXiv Detail & Related papers (2025-09-16T14:48:46Z) - Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute [54.22256089592864]
This paper presents a simple, effective, and cost-efficient strategy to improve LLM performance by scaling test-time compute.<n>Our strategy builds upon the repeated-sampling-then-voting framework, with a novel twist: incorporating multiple models, even weaker ones, to leverage their complementary strengths.
arXiv Detail & Related papers (2025-04-01T13:13:43Z) - Uncertainty-Aware Fusion: An Ensemble Framework for Mitigating Hallucinations in Large Language Models [2.98260857963929]
Large Language Models (LLMs) are known to hallucinate and generate non-factual outputs which can undermine user trust.<n>Traditional methods to directly mitigate hallucinations, such as representation editing and contrastive decoding, often require additional training data and involve high implementation complexity.<n>We propose Uncertainty-Aware Fusion (UAF), an ensemble framework to reduce hallucinations by strategically combining multiple LLM based on their accuracy and self-assessment abilities.
arXiv Detail & Related papers (2025-02-22T10:48:18Z) - Preference Leakage: A Contamination Problem in LLM-as-a-judge [69.96778498636071]
Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods.<n>In this work, we expose preference leakage, a contamination problem in LLM-as-a-judge caused by the relatedness between the synthetic data generators and LLM-based evaluators.
arXiv Detail & Related papers (2025-02-03T17:13:03Z) - The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities [51.594836904623534]
We investigate whether instruction-tuned models possess fundamentally different capabilities from base models that are prompted using in-context examples.<n>We show that the performance of instruction-tuned models is significantly correlated with the in-context performance of their base counterparts.<n>Specifically, we extend this understanding to instruction-tuned models, suggesting that their pretraining data similarly sets a limiting boundary on the tasks they can solve.
arXiv Detail & Related papers (2025-01-15T10:57:55Z) - Enhancing Multi-Agent Consensus through Third-Party LLM Integration: Analyzing Uncertainty and Mitigating Hallucinations in Large Language Models [1.4582633500696451]
Large Language Models (LLMs) still face challenges when dealing with complex reasoning tasks.
This paper proposes a new method that integrates different LLMs to expand the knowledge boundary.
Experiments on arithmetic datasets have validated the effectiveness of the method.
arXiv Detail & Related papers (2024-11-25T08:42:33Z) - Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement [51.601916604301685]
Large language models (LLMs) generate content that can undermine trust in online discourse.<n>Current methods often focus on binary classification, failing to address the complexities of real-world scenarios like human-LLM collaboration.<n>To move beyond binary classification and address these challenges, we propose a new paradigm for detecting LLM-generated content.
arXiv Detail & Related papers (2024-10-18T08:14:10Z) - CoMMIT: Coordinated Multimodal Instruction Tuning [90.1532838391285]
Multimodal large language models (MLLMs) generally involve cooperative learning between a backbone LLM and a feature encoder of non-text input modalities.<n>In this paper, we analyze the MLLM instruction tuning from both theoretical and empirical perspectives.<n>We propose a Multimodal Balance Coefficient that enables quantitative measurement of the balance of learning.
arXiv Detail & Related papers (2024-07-29T23:18:55Z) - OPDAI at SemEval-2024 Task 6: Small LLMs can Accelerate Hallucination
Detection with Weakly Supervised Data [1.3981625092173873]
This paper describes a unified system for hallucination detection of LLMs.
It wins the second prize in the model-agnostic track of the SemEval-2024 Task 6.
arXiv Detail & Related papers (2024-02-20T11:01:39Z) - Constituency Parsing using LLMs [47.17239291933248]
Constituency parsing is a fundamental yet unsolved challenge in natural language processing.<n>We evaluate the performance of recent large language models (LLMs) under zero-shot, few-shot, and supervised fine-tuning learning paradigms.<n>Motivated by this observation, we propose two strategies to guide LLMs to generate more accurate constituent trees.
arXiv Detail & Related papers (2023-10-30T11:39:11Z) - ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks [91.55895047448249]
This paper presents ReEval, an LLM-based framework using prompt chaining to perturb the original evidence for generating new test cases.
We implement ReEval using ChatGPT and evaluate the resulting variants of two popular open-domain QA datasets.
Our generated data is human-readable and useful to trigger hallucination in large language models.
arXiv Detail & Related papers (2023-10-19T06:37:32Z) - On Learning to Summarize with Large Language Models as References [101.79795027550959]
Large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets.
We study an LLM-as-reference learning setting for smaller text summarization models to investigate whether their performance can be substantially improved.
arXiv Detail & Related papers (2023-05-23T16:56:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.