Taming Object Hallucinations with Verified Atomic Confidence Estimation
- URL: http://arxiv.org/abs/2511.09228v1
- Date: Thu, 13 Nov 2025 01:41:44 GMT
- Title: Taming Object Hallucinations with Verified Atomic Confidence Estimation
- Authors: Jiarui Liu, Weihao Xuan, Zhijing Jin, Mona Diab
- Abstract summary: We introduce TACO, a framework that mitigates hallucinations through self-verification and confidence calibration without relying on external vision experts. TACO decomposes responses into atomic queries, paraphrases them to reduce sensitivity to wording, and estimates confidence using self-consistency (black-box) or self-confidence (gray-box) aggregation.
- Score: 18.54793854791167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal Large Language Models (MLLMs) often suffer from hallucinations, particularly errors in object existence, attributes, or relations, which undermine their reliability. We introduce TACO (Verified Atomic Confidence Estimation), a simple framework that mitigates hallucinations through self-verification and confidence calibration without relying on external vision experts. TACO decomposes responses into atomic queries, paraphrases them to reduce sensitivity to wording, and estimates confidence using self-consistency (black-box) or self-confidence (gray-box) aggregation, before refining answers with a language model. Experiments on five benchmarks (POPE, MME, HallusionBench, AMBER, and MM-Hal Bench) with two MLLMs (LLaVA-1.5-7B and CogVLM2) show that TACO consistently outperforms direct prompting and Visual Contrastive Decoding, reduces systematic biases, and improves confidence calibration, demonstrating its effectiveness in enhancing the faithfulness of MLLMs.
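The two aggregation modes named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the listing does not give the exact formulas, so the majority-vote share (black-box) and the averaged "yes" probability (gray-box) used here are assumptions, and the names `self_consistency`, `self_confidence`, `verify_claims`, `paraphrase`, and `ask` are hypothetical. The model calls are replaced by plain callables so the example runs on its own.

```python
from collections import Counter
from typing import Callable, Dict, Sequence, Tuple


def self_consistency(answers: Sequence[str]) -> Tuple[str, float]:
    """Black-box aggregation (assumed form): majority answer and its vote share."""
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


def self_confidence(yes_probs: Sequence[float]) -> Tuple[str, float]:
    """Gray-box aggregation (assumed form): average P("yes") over paraphrases."""
    p_yes = sum(yes_probs) / len(yes_probs)
    return ("yes", p_yes) if p_yes >= 0.5 else ("no", 1.0 - p_yes)


def verify_claims(
    claims: Sequence[str],
    paraphrase: Callable[[str], Sequence[str]],  # hypothetical paraphraser
    ask: Callable[[str], str],                   # hypothetical yes/no model call
) -> Dict[str, Tuple[str, float]]:
    """Ask every paraphrase of every atomic claim, then aggregate per claim."""
    results = {}
    for claim in claims:
        answers = [ask(question) for question in paraphrase(claim)]
        results[claim] = self_consistency(answers)
    return results


# Stand-ins for the model: three fixed paraphrases and a keyword-based answerer.
def demo_paraphrase(claim: str) -> Sequence[str]:
    return [f"Is there {claim}?", f"Does the image show {claim}?", f"Can you see {claim}?"]


def demo_ask(question: str) -> str:
    return "yes" if "dog" in question else "no"


report = verify_claims(["a dog", "a frisbee"], demo_paraphrase, demo_ask)
```

In this toy run both claims receive unanimous answers, so each gets confidence 1.0. With a real model, disagreement across paraphrases lowers the score, and a low-confidence claim would be a candidate for revision in the refinement step the abstract describes.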
Related papers
- BrowseConf: Confidence-Guided Test-Time Scaling for Web Agents [58.05949210993854]
We investigate whether search agents have the ability to communicate their own confidence through verbalized confidence scores after long sequences of actions. We propose Test-Time Scaling (TTS) methods that use confidence scores to determine answer quality, encouraging the model to try again until it reaches a satisfactory confidence level.
arXiv Detail & Related papers (2025-10-27T15:58:51Z) - Calibrating Verbalized Confidence with Self-Generated Distractors [24.56911906044891]
We introduce Distractor-Normalized Coherence (DINCO). DINCO estimates and accounts for an LLM's suggestibility bias by having the model verbalize its confidence independently across several self-generated distractors. We frame the popular approach of self-consistency as leveraging coherence across sampled generations, and normalized verbalized confidence as leveraging coherence across validations on incompatible claims.
arXiv Detail & Related papers (2025-09-29T21:41:22Z) - Self-Consistency as a Free Lunch: Reducing Hallucinations in Vision-Language Models via Self-Reflection [71.8243083897721]
Vision-language models often hallucinate details, generating non-existent objects or inaccurate attributes that compromise output reliability. We present a novel framework that leverages the model's self-consistency between long responses and short answers to generate preference pairs for training.
arXiv Detail & Related papers (2025-09-27T10:37:11Z) - ConfTuner: Training Large Language Models to Express Their Confidence Verbally [58.63318088243125]
Large Language Models (LLMs) are increasingly deployed in high-stakes domains such as science, law, and healthcare. LLMs are often observed to generate incorrect answers with high confidence, a phenomenon known as "overconfidence".
arXiv Detail & Related papers (2025-08-26T09:25:32Z) - How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models [28.62988505317048]
Large language models (LLMs) exhibit strikingly conflicting behaviors. LLMs can appear steadfastly overconfident in their initial answers whilst being prone to excessive doubt when challenged. We show that LLMs exhibit a pronounced choice-supportive bias that reinforces and boosts their estimate of confidence in their answer.
arXiv Detail & Related papers (2025-07-03T18:57:43Z) - Verbalized Confidence Triggers Self-Verification: Emergent Behavior Without Explicit Reasoning Supervision [12.287123198288079]
Uncertainty calibration is essential for the safe deployment of large language models (LLMs). We find that supervised fine-tuning with scalar confidence labels alone suffices to elicit self-verification behavior of language models. We propose a simple rethinking method that boosts performance via test-time scaling based on calibrated uncertainty.
arXiv Detail & Related papers (2025-06-04T08:56:24Z) - MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs [66.14178164421794]
We introduce MetaFaith, a novel prompt-based calibration approach inspired by human metacognition. We show that MetaFaith robustly improves faithful calibration across diverse models and task domains, enabling up to 61% improvement in faithfulness.
arXiv Detail & Related papers (2025-05-30T17:54:08Z) - Improving the Reliability of LLMs: Combining CoT, RAG, Self-Consistency, and Self-Verification [1.5095869543963976]
Large language models (LLMs) generate confident but incorrect or irrelevant information. Hallucination is a key limitation in their application to complex, open-ended tasks. We investigate how combining Chain-of-thought (CoT) with retrieval-augmented generation (RAG) can reduce hallucinations.
arXiv Detail & Related papers (2025-05-13T23:57:02Z) - Aligning Large Language Models for Faithful Integrity Against Opposing Argument [71.33552795870544]
Large Language Models (LLMs) have demonstrated impressive capabilities in complex reasoning tasks. They can be easily misled by unfaithful arguments during conversations, even when their original statements are correct. We propose a novel framework, named Alignment for Faithful Integrity with Confidence Estimation.
arXiv Detail & Related papers (2025-01-02T16:38:21Z) - On Verbalized Confidence Scores for LLMs [25.160810008907397]
Uncertainty quantification for large language models (LLMs) can establish more human trust in their responses. This work focuses on asking the LLM itself to verbalize its uncertainty with a confidence score as part of its output tokens. We assess the reliability of verbalized confidence scores with respect to different datasets, models, and prompt methods.
arXiv Detail & Related papers (2024-12-19T11:10:36Z) - Fact-Level Confidence Calibration and Self-Correction [64.40105513819272]
We propose a Fact-Level framework that calibrates confidence to relevance-weighted correctness at the fact level.
We also develop Confidence-Guided Fact-level Self-Correction (ConFix), which uses high-confidence facts within a response as additional knowledge to improve low-confidence ones.
arXiv Detail & Related papers (2024-11-20T14:15:18Z) - Mirror-Consistency: Harnessing Inconsistency in Majority Voting [37.707204723835765]
We present Mirror-Consistency, an enhancement of the standard Self-Consistency approach. Mirror-Consistency incorporates a 'reflective mirror' into the self-ensemble decoding process. We show that Mirror-Consistency yields superior performance in both reasoning accuracy and confidence calibration compared to Self-Consistency.
arXiv Detail & Related papers (2024-10-07T03:41:08Z) - Think Twice Before Trusting: Self-Detection for Large Language Models through Comprehensive Answer Reflection [90.71323430635593]
We propose a novel self-detection paradigm that considers the comprehensive answer space beyond LLM-generated answers.
Building upon this paradigm, we introduce a two-step framework, which first instructs the LLM to reflect and provide justifications for each candidate answer.
This framework can be seamlessly integrated with existing approaches for superior self-detection.
arXiv Detail & Related papers (2024-03-15T02:38:26Z) - INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection [39.52923659121416]
We propose to explore the dense semantic information retained within INternal States for hallucInation DEtection (INSIDE).
A simple yet effective EigenScore metric is proposed to better evaluate responses' self-consistency.
A test time feature clipping approach is explored to truncate extreme activations in the internal states.
arXiv Detail & Related papers (2024-02-06T06:23:12Z)