Related papers: SelfCheckAgent: Zero-Resource Hallucination Detection in Generative Large Language Models

Related papers

AgentHallu: Benchmarking Automated Hallucination Attribution of LLM-based Agents [30.66751974860931]
hallucination detection in single-turn responses requires identifying which step causes the initial divergence.<n>We propose a new research task, automated hallucination attribution of LLM-based agents, aiming to identify the step responsible for the hallucination and explain why.<n>We introduce AgentHallu, a comprehensive benchmark with 693 high-quality trajectories spanning 7 agent frameworks and 5 domains.<n>The best-performing model achieves only 41.1% step localization accuracy, where tool-use hallucinations are the most challenging at just 11.6%.
arXiv Detail & Related papers (2026-01-11T09:04:26Z)
DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection [1.7024685699333262]
DeepAgent is a framework that simultaneously incorporates both visual and audio modalities for the effective detection of deepfakes.<n>Agent-1 examines each video with a streamlined AlexNet-based CNN to identify the symbols of deepfake manipulation.<n>Agent-2 detects audio-visual inconsistencies by combining acoustic features, audio transcriptions from Whisper, and frame-reading sequences of images through EasyOCR.
arXiv Detail & Related papers (2025-12-08T09:43:30Z)
MIRAGE: Agentic Framework for Multimodal Misinformation Detection with Web-Grounded Reasoning [0.6475163438744868]
We present MIRAGE, an inference-time, model-pluggable agentic framework that decomposes multimodal verification into four sequential modules.<n> visual veracity assessment detects AI-generated images, cross-modal consistency analysis identifies out-of-context repurposing, retrieval-augmented factual checking grounds claims in web evidence.<n>MIRAGE orchestrates vision-language model reasoning with targeted web retrieval, outputs structured and citation-linked rationales.
arXiv Detail & Related papers (2025-10-20T14:40:26Z)
LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions [80.12078194093013]
We present the first comprehensive survey of hallucinations in LLM-based agents.<n>We propose a new taxonomy that identifies different types of agent hallucinations occurring at different stages.<n>We conduct an in-depth examination of eighteen triggering causes underlying the emergence of agent hallucinations.
arXiv Detail & Related papers (2025-09-23T13:24:48Z)
VulAgent: Hypothesis-Validation based Multi-Agent Vulnerability Detection [55.957275374847484]
VulAgent is a multi-agent vulnerability detection framework based on hypothesis validation.<n>It implements a semantics-sensitive, multi-view detection pipeline, each aligned to a specific analysis perspective.<n>On average, VulAgent improves overall accuracy by 6.6%, increases the correct identification rate of vulnerable--fixed code pairs by up to 450%, and reduces the false positive rate by about 36%.
arXiv Detail & Related papers (2025-09-15T02:25:38Z)
MCP-Orchestrated Multi-Agent System for Automated Disinformation Detection [84.75972919995398]
This paper presents a multi-agent system that uses relation extraction to detect disinformation in news articles.<n>The proposed Agentic AI system combines four agents: (i) a machine learning agent (logistic regression), (ii) a Wikipedia knowledge check agent, and (iv) a web-scraped data analyzer.<n>Results demonstrate that the multi-agent ensemble achieves 95.3% accuracy with an F1 score of 0.964, significantly outperforming individual agents and traditional approaches.
arXiv Detail & Related papers (2025-08-13T19:14:48Z)
ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs [50.18087419133284]
hallucination detection methods leveraging hidden states predominantly focus on static and isolated representations.<n>We introduce a novel metric, the ICR Score, which quantifies the contribution of modules to the hidden states' update.<n>We propose a hallucination detection method, the ICR Probe, which captures the cross-layer evolution of hidden states.
arXiv Detail & Related papers (2025-07-22T11:44:26Z)
Learning Auxiliary Tasks Improves Reference-Free Hallucination Detection in Open-Domain Long-Form Generation [78.78421340836915]
We systematically investigate reference-free hallucination detection in open-domain long-form responses.<n>Our findings reveal that internal states are insufficient for reliably distinguishing between factual and hallucinated content.<n>We introduce a new paradigm, named RATE-FT, that augments fine-tuning with an auxiliary task for the model to jointly learn with the main task of hallucination detection.
arXiv Detail & Related papers (2025-05-18T07:10:03Z)
SINdex: Semantic INconsistency Index for Hallucination Detection in LLMs [2.805517909463769]
Large language models (LLMs) are increasingly deployed across diverse domains, yet they are prone to generating factually incorrect outputs. We introduce a novel and scalable uncertainty-based semantic clustering framework for automated hallucination detection.
arXiv Detail & Related papers (2025-03-07T23:25:19Z)
SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMs [12.12344696058016]
We propose SAFE, a novel method for detecting and mitigating hallucinations by leveraging Sparse Autoencoders (SAEs) Empirical results demonstrate that SAFE consistently improves query generation accuracy and mitigates hallucinations across all datasets, achieving accuracy improvements of up to 29.45%.
arXiv Detail & Related papers (2025-03-04T22:19:52Z)
Good Parenting is all you need -- Multi-agentic LLM Hallucination Mitigation [0.0]
This study explores the ability of Large Language Model (LLM) agents to detect and correct hallucinations in AI-generated content. Across 4,900 test runs involving various combinations of primary and reviewing agents, advanced AI models demonstrated near-perfect accuracy in identifying hallucinations.
arXiv Detail & Related papers (2024-10-18T08:18:18Z)
Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation [96.78845113346809]
Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks. This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decoding dynamics to detect unfaithful sentences. We also introduce FOD, a faithfulness-oriented decoding algorithm guided by beam search for long-form retrieval-augmented generation.
arXiv Detail & Related papers (2024-06-19T16:42:57Z)
Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector [114.88975874411142]
Hallucination detection is a challenging task for large language models (LLMs) We propose an autonomous LLM-based agent framework, called HaluAgent. In HaluAgent, we integrate the LLM, multi-functional toolbox, and design a fine-grained three-stage detection framework.
arXiv Detail & Related papers (2024-06-17T07:30:05Z)
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models [56.00992369295851]
Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks, however, they are still far inferior to API-based models when acting as agents. This paper delivers three key observations: (1) the current agent training corpus is entangled with both formats following and agent reasoning, which significantly shifts from the distribution of its pre-training data; (2) LLMs exhibit different learning speeds on the capabilities required by agent tasks; and (3) current approaches have side-effects when improving agent abilities by introducing hallucinations. We propose Agent-FLAN to effectively Fine-tune LANguage models for Agents.
arXiv Detail & Related papers (2024-03-19T16:26:10Z)
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs [0.0]
Hallucinations pose a significant challenge to the reliability and alignment of Large Language Models (LLMs) This paper introduces an automated scalable framework that combines benchmarking LLMs' hallucination tendencies with efficient hallucination detection. The framework is domain-agnostic, allowing the use of any language model for benchmark creation or evaluation in any domain.
arXiv Detail & Related papers (2024-02-25T22:23:37Z)
"Knowing When You Don't Know": A Multilingual Relevance Assessment Dataset for Robust Retrieval-Augmented Generation [90.09260023184932]
Retrieval-Augmented Generation (RAG) grounds Large Language Model (LLM) output by leveraging external knowledge sources to reduce factual hallucinations. NoMIRACL is a human-annotated dataset for evaluating LLM robustness in RAG across 18 typologically diverse languages. We measure relevance assessment using: (i) hallucination rate, measuring model tendency to hallucinate, when the answer is not present in passages in the non-relevant subset, and (ii) error rate, measuring model inaccuracy to recognize relevant passages in the relevant subset.
arXiv Detail & Related papers (2023-12-18T17:18:04Z)
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus [99.33091772494751]
Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields. LLMs are prone to hallucinate untruthful or nonsensical outputs that fail to meet user expectations. We propose a novel reference-free, uncertainty-based method for detecting hallucinations in LLMs.
arXiv Detail & Related papers (2023-11-22T08:39:17Z)
Weakly Supervised Veracity Classification with LLM-Predicted Credibility Signals [4.895830603263421]
Pastel is a weakly supervised approach that leverages large language models to extract credibility signals from web content. We study the association between credibility signals and veracity, and perform a study showing the impact of each signal on model performance.
arXiv Detail & Related papers (2023-09-14T11:06:51Z)
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation [76.34411067299331]
Large language models often tend to 'hallucinate' which critically hampers their reliability. We propose an approach that actively detects and mitigates hallucinations during the generation process. We show that the proposed active detection and mitigation approach successfully reduces the hallucinations of the GPT-3.5 model from 47.5% to 14.5% on average.
arXiv Detail & Related papers (2023-07-08T14:25:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.