HIME: Mitigating Object Hallucinations in LVLMs via Hallucination Insensitivity Model Editing
- URL: http://arxiv.org/abs/2602.18711v1
- Date: Sat, 21 Feb 2026 04:16:17 GMT
- Title: HIME: Mitigating Object Hallucinations in LVLMs via Hallucination Insensitivity Model Editing
- Authors: Ahmed Akl, Abdelwahed Khamis, Ali Cheraghian, Zhe Wang, Sara Khalifa, Kewen Wang
- Abstract summary: Large Vision-Language Models (LVLMs) have demonstrated impressive multimodal understanding capabilities. LVLMs are prone to object hallucination, where models describe non-existent objects or attribute incorrect factual information. We propose Hallucination Insensitivity Model Editing (HIME), a layer-adaptive weight editing approach that selectively modifies latent features to suppress hallucinations.
- Score: 6.021803204524807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Vision-Language Models (LVLMs) have demonstrated impressive multimodal understanding capabilities, yet they remain prone to object hallucination, where models describe non-existent objects or attribute incorrect factual information, raising serious concerns for reliable real-world deployment. While fine-tuning is a commonly adopted mitigation strategy, its high computational cost and practical difficulty motivate the need for training-free alternatives, among which model editing has recently emerged as a promising direction. However, indiscriminate editing risks disrupting the rich implicit knowledge encoded in pre-trained LVLMs, leading to a fundamental question: how much intervention is necessary at each layer to suppress hallucinations while preserving pre-trained knowledge? To address this question, we present a systematic analysis of LVLM decoders built on three widely used large language model backbones (Qwen, LLaMA, and Vicuna), revealing clear layer-wise differences in susceptibility to object hallucination. Building on these insights, we introduce the Hallucination Insensitivity Score (HIS), a principled metric that quantifies each layer's sensitivity to hallucination and provides guidance for targeted intervention. Leveraging HIS, we propose Hallucination Insensitivity Model Editing (HIME), a simple yet effective layer-adaptive weight editing approach that selectively modifies latent features to suppress hallucinations while preserving pre-trained knowledge. Extensive experiments demonstrate that HIME reduces hallucinations by an average of 61.8% across open-ended generation benchmarks, including CHAIR, MME, and GPT-4V-aided evaluation, without introducing additional parameters, inference-time latency, or computational overhead.
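The abstract reports results on CHAIR, a benchmark that measures object hallucination in generated captions. As a rough illustration of what such a score counts, the sketch below computes the two commonly reported CHAIR ratios: the share of mentioned objects that are not in the image (instance level) and the share of captions containing at least one such object (caption level). This is a minimal sketch under stated assumptions, not the evaluation code used in the paper: the function name `chair_scores` and the input layout (one list of mentioned objects and one set of ground-truth objects per image) are hypothetical, and object extraction from raw caption text is assumed to have been done already.

```python
from typing import List, Set, Tuple


def chair_scores(
    mentioned: List[List[str]],
    present: List[Set[str]],
) -> Tuple[float, float]:
    """Return (instance-level, caption-level) hallucination ratios.

    mentioned[i] lists the object names found in caption i (hypothetical
    input layout); present[i] is the set of objects actually in image i.
    """
    total_mentions = 0
    hallucinated_mentions = 0
    hallucinated_captions = 0

    for objects, truth in zip(mentioned, present):
        # An object mention counts as hallucinated if it is absent from the image.
        wrong = [obj for obj in objects if obj not in truth]
        total_mentions += len(objects)
        hallucinated_mentions += len(wrong)
        if wrong:
            hallucinated_captions += 1

    # Guard against empty inputs to avoid division by zero.
    instance_ratio = hallucinated_mentions / total_mentions if total_mentions else 0.0
    caption_ratio = hallucinated_captions / len(mentioned) if mentioned else 0.0
    return instance_ratio, caption_ratio


if __name__ == "__main__":
    captions = [["dog", "frisbee"], ["cat", "sofa", "remote"]]
    images = [{"dog"}, {"cat", "sofa", "remote"}]
    print(chair_scores(captions, images))
```

In the usage example, one of the five object mentions ("frisbee") is not in its image, and one of the two captions contains a hallucinated object. Lower values on both ratios indicate fewer hallucinations.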
Related papers
- Mitigating Hallucinations in Large Vision-Language Models by Self-Injecting Hallucinations [73.37711261605271]
Hallucination mitigation methods are mainly based on preference alignment and require external human annotations or auxiliary models for preference data collection. We propose Autonomous Preference Alignment via Self-Injection (APASI), a novel and generalizable method that mitigates hallucinations without external dependencies. APASI leverages the target LVLM to self-inject hallucinations into a generated response, creating a pair of responses with varying preference levels.
arXiv Detail & Related papers (2025-09-14T14:26:53Z)
- SHALE: A Scalable Benchmark for Fine-grained Hallucination Evaluation in LVLMs [52.03164192840023]
Large Vision-Language Models (LVLMs) still suffer from hallucinations, i.e., generating content inconsistent with input or established world knowledge. We propose an automated data construction pipeline that produces scalable, controllable, and diverse evaluation data. We construct SHALE, a benchmark designed to assess both faithfulness and factuality hallucinations.
arXiv Detail & Related papers (2025-08-13T07:58:01Z)
- Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering [83.63437999696954]
Hallucination in multimodal large language models (MLLMs) persists as a significant and under-addressed challenge in the video domain. We propose a temporal-aware activation engineering framework for VideoLLMs, which adaptively identifies and manipulates hallucination-sensitive modules.
arXiv Detail & Related papers (2025-05-19T08:12:06Z)
- Mitigating Hallucinations via Inter-Layer Consistency Aggregation in Large Vision-Language Models [3.9464481148889354]
We propose a novel decoding mechanism, Decoding with Inter-layer Consistency via Layer Aggregation (DCLA). Our approach constructs a dynamic semantic reference by aggregating representations from previous layers, and corrects semantically deviated layers to enforce inter-layer consistency. Experiments on hallucination benchmarks such as MME and POPE demonstrate that DCLA effectively reduces hallucinations while enhancing the reliability and performance of LVLMs.
arXiv Detail & Related papers (2025-05-18T10:15:42Z)
- HalluLens: LLM Hallucination Benchmark [49.170128733508335]
Large language models (LLMs) often generate responses that deviate from user input or training data, a phenomenon known as "hallucination". This paper introduces a comprehensive hallucination benchmark, incorporating both new extrinsic and existing intrinsic evaluation tasks.
arXiv Detail & Related papers (2025-04-24T13:40:27Z)
- Efficient Contrastive Decoding with Probabilistic Hallucination Detection - Mitigating Hallucinations in Large Vision Language Models - [1.2499537119440245]
Efficient Contrastive Decoding (ECD) is a simple method that leverages probabilistic hallucination detection to shift the output distribution towards contextually accurate answers at inference time. Our experiments show that ECD effectively mitigates hallucinations, outperforming state-of-the-art methods with respect to performance on LVLM benchmarks and computation time.
arXiv Detail & Related papers (2025-04-16T14:50:25Z)
- Mitigating Hallucinations in Large Vision-Language Models with Internal Fact-based Contrastive Decoding [5.424048651554831]
Internal Fact-based Contrastive Decoding (IFCD) is designed to mitigate and suppress hallucinations during the inference process of Large Visual Language Models (LVLMs). IFCD calibrates the LVLMs' output and effectively removes the hallucinatory logits from the final predictions. Experimental results validate that IFCD significantly alleviates both object-level and attribute-level hallucinations, achieving average accuracy improvements of 9% on POPE and 8% on the MME object hallucination subset compared with direct decoding.
arXiv Detail & Related papers (2025-02-03T05:08:35Z)
- Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning [16.883679810267342]
This paper introduces a novel approach called Iterative Model-level Contrastive Learning (Iter-AHMCL) to address hallucination.
arXiv Detail & Related papers (2024-10-16T00:15:40Z)
- Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization [123.54980913741828]
Large Visual Language Models (LVLMs) have demonstrated exceptional abilities in understanding multimodal data. They invariably suffer from hallucinations, leading to a disconnect between the generated text and the corresponding images. Almost all current visual contrastive decoding methods attempt to mitigate these hallucinations by introducing visual uncertainty information. However, they struggle to precisely induce the hallucinatory tokens, which severely limits their effectiveness in mitigating hallucinations.
arXiv Detail & Related papers (2024-05-24T08:46:31Z)
- AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces a method for automatically constructing model-specific hallucination datasets based on existing fact-checking datasets called AutoHall.
We also propose a zero-resource and black-box hallucination detection method based on self-contradiction.
arXiv Detail & Related papers (2023-09-30T05:20:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.