SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models ability to detect hallucination
- URL: http://arxiv.org/abs/2404.04845v2
- Date: Tue, 9 Apr 2024 07:21:37 GMT
- Title: SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models ability to detect hallucination
- Authors: Pouya Fallah, Soroush Gooran, Mohammad Jafarinasab, Pouya Sadeghi, Reza Farnia, Amirreza Tarabkhah, Zainab Sadat Taghavi, Hossein Sameti
- Abstract summary: This study explores methods for detecting hallucinations in three SemEval-2024 Task 6 tasks: Machine Translation, Definition Modeling, and Paraphrase Generation.
We evaluate two methods: semantic similarity between the generated text and factual references, and an ensemble of language models that judge each other's outputs.
- Score: 1.4705596514165422
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models, particularly generative models, are susceptible to hallucinations, generating outputs that contradict factual knowledge or the source text. This study explores methods for detecting hallucinations in three SemEval-2024 Task 6 tasks: Machine Translation, Definition Modeling, and Paraphrase Generation. We evaluate two methods: semantic similarity between the generated text and factual references, and an ensemble of language models that judge each other's outputs. Our results show that semantic similarity achieves moderate accuracy and correlation scores in trial data, while the ensemble method offers insights into the complexities of hallucination detection but falls short of expectations. This work highlights the challenges of hallucination detection and underscores the need for further research in this critical area.
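The two methods named in the abstract are concrete enough to sketch. Below is a minimal version of the first, semantic similarity against a factual reference; the embedding model and decision threshold are illustrative assumptions, not choices reported in the paper.

```python
# Hedged sketch of the semantic-similarity detector: flag a generation as a
# hallucination when its embedding similarity to the factual reference falls
# below a threshold. Model name and threshold are assumptions for illustration.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def flag_hallucination(hypothesis: str, reference: str, threshold: float = 0.5) -> bool:
    """Return True when the hypothesis is semantically far from the reference."""
    embeddings = encoder.encode([hypothesis, reference], convert_to_tensor=True)
    similarity = util.cos_sim(embeddings[0], embeddings[1]).item()
    return similarity < threshold

print(flag_hallucination(
    "A quokka is a large flightless bird native to Iceland.",
    "A quokka is a small wallaby found in Western Australia.",
))  # likely True: the generated definition contradicts the reference
```

The second method has language models judge each other's outputs. One possible instantiation, with the judge pool, prompt wording, and majority-vote rule all assumed rather than taken from the paper:

```python
# Hedged sketch of the ensemble-of-judges method: each judge model answers a
# yes/no question about the output, and the votes are aggregated by majority.
from openai import OpenAI

client = OpenAI()
JUDGES = ["gpt-4o-mini", "gpt-4o"]  # hypothetical judge pool

def ensemble_flags_hallucination(source: str, hypothesis: str) -> bool:
    votes = []
    for judge in JUDGES:
        reply = client.chat.completions.create(
            model=judge,
            messages=[{
                "role": "user",
                "content": (
                    f"Source: {source}\nOutput: {hypothesis}\n"
                    "Does the output contain information not supported by the "
                    "source? Answer only 'yes' or 'no'."
                ),
            }],
        )
        votes.append(reply.choices[0].message.content.strip().lower().startswith("yes"))
    return sum(votes) > len(votes) / 2  # majority vote
```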
Related papers
- Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling [67.14942827452161]
Vision-Language Models (VLMs) excel at visual understanding but often suffer from visual hallucinations.
In this work, we introduce REVERSE, a unified framework that integrates hallucination-aware training with on-the-fly self-verification.
arXiv Detail & Related papers (2025-04-17T17:59:22Z)
- Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations [82.42811602081692]
This paper introduces a subsequence association framework to systematically trace and understand hallucinations.
The key insight is that hallucinations arise when dominant hallucinatory associations outweigh faithful ones.
We propose a tracing algorithm that identifies causal subsequences by analyzing hallucination probabilities across randomized input contexts.
arXiv Detail & Related papers (2025-04-17T06:34:45Z)
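A toy sketch of the tracing idea in the entry above: estimate how strongly a candidate input subsequence is associated with hallucination by comparing hallucination rates across randomized contexts with and without it. The sampling scheme and the `generate`/`is_hallucinated` callables are hypothetical stand-ins; the paper's actual algorithm differs in detail.

```python
# Toy subsequence-association tracing: does adding a candidate subsequence to
# randomized contexts raise the hallucination rate? The callables are
# hypothetical stand-ins for the model call and the span check.
import random
from typing import Callable, Sequence

def association_score(
    candidate: str,
    context_pool: Sequence[str],          # snippets to build random contexts from
    generate: Callable[[str], str],       # model inference
    is_hallucinated: Callable[[str], bool],
    trials: int = 100,
) -> float:
    hits_with, hits_without = 0, 0
    for _ in range(trials):
        context = " ".join(random.sample(context_pool, k=3))
        hits_with += is_hallucinated(generate(candidate + " " + context))
        hits_without += is_hallucinated(generate(context))
    # Positive score: the subsequence raises the hallucination probability.
    return (hits_with - hits_without) / trials
```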
- HausaNLP at SemEval-2025 Task 3: Towards a Fine-Grained Model-Aware Hallucination Detection [1.8230982862848586]
We aim to provide a nuanced, model-aware understanding of hallucination occurrences and severity in English.
We used natural language inference and fine-tuned a ModernBERT model using a synthetic dataset of 400 samples.
Results indicate a moderately positive correlation between the model's confidence scores and the actual presence of hallucinations.
arXiv Detail & Related papers (2025-03-25T13:40:22Z)
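The correlation reported in the HausaNLP entry above is cheap to compute given per-example confidence scores and binary hallucination labels; a sketch with placeholder data (the choice of Spearman here is an assumption, not necessarily the authors' metric):

```python
# Sketch: correlate model confidence with hallucination labels.
# The scores and labels are made-up placeholders.
from scipy.stats import spearmanr

confidences = [0.91, 0.40, 0.77, 0.65, 0.30, 0.88]  # model confidence per example
labels = [1, 0, 1, 1, 0, 1]                         # 1 = hallucination present

rho, p_value = spearmanr(confidences, labels)
print(f"Spearman rho={rho:.2f} (p={p_value:.3f})")
```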
- KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models [17.435794516702256]
Large language models (LLMs) have significantly advanced the development of natural language processing (NLP).
Model hallucinations remain a major challenge in natural language generation (NLG) tasks due to their complex causes.
This work introduces a new paradigm for mitigating specific hallucination issues in generative models, enhancing their robustness and reliability in real-world applications.
arXiv Detail & Related papers (2025-03-25T09:18:27Z)
- From Hallucinations to Facts: Enhancing Language Models with Curated Knowledge Graphs [20.438680406650967]
This paper addresses language model hallucination by integrating curated knowledge graph (KG) triples to anchor responses in empirical data.
We aim to generate responses that are both linguistically fluent and deeply rooted in factual accuracy and context relevance.
arXiv Detail & Related papers (2024-12-24T20:16:10Z)
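A minimal sketch of the grounding step from the knowledge-graph entry above: retrieved triples are serialized and prepended to the prompt so the model conditions on curated facts. The retrieval stub, triple format, and prompt template are assumptions.

```python
# Sketch: anchor a prompt in curated knowledge-graph triples.
# Retrieval is stubbed out; formats are assumptions for illustration.
Triple = tuple[str, str, str]  # (subject, relation, object)

def retrieve_triples(question: str) -> list[Triple]:
    # Placeholder for real KG retrieval (e.g., entity linking + 1-hop lookup).
    return [("Canberra", "capital_of", "Australia")]

def grounded_prompt(question: str) -> str:
    facts = "\n".join(f"{s} {r.replace('_', ' ')} {o}" for s, r, o in retrieve_triples(question))
    return f"Known facts:\n{facts}\n\nAnswer using only the facts above.\nQuestion: {question}"

print(grounded_prompt("What is the capital of Australia?"))
```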
- H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models [0.0]
We propose H-POPE, a coarse-to-fine-grained benchmark that assesses hallucinations in object existence and attributes.
Our evaluation shows that models are prone to hallucinations on object existence, and even more so on fine-grained attributes.
arXiv Detail & Related papers (2024-11-06T17:55:37Z)
- A Debate-Driven Experiment on LLM Hallucinations and Accuracy [7.821303946741665]
This study investigates the phenomenon of hallucination in large language models (LLMs).
Multiple instances of GPT-4o-Mini models engage in a debate-like interaction prompted with questions from the TruthfulQA dataset.
One model is deliberately instructed to generate plausible but false answers while the other models are asked to respond truthfully.
arXiv Detail & Related papers (2024-10-25T11:41:27Z)
- Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps [48.58310785625051]
Large language models (LLMs) can hallucinate details and respond with unsubstantiated answers.
This paper describes a simple approach for detecting such contextual hallucinations.
arXiv Detail & Related papers (2024-07-09T17:44:34Z)
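The attention-map approach in the Lookback Lens entry above can be sketched as a "lookback ratio": the share of attention mass that generated tokens place on the source context rather than on earlier generated tokens, used as features for a simple classifier. The array shapes, dummy data, and logistic-regression detector below are illustrative assumptions.

```python
# Sketch of an attention-based hallucination feature in the spirit of
# Lookback Lens: per-head share of attention mass on the source context.
import numpy as np
from sklearn.linear_model import LogisticRegression

def lookback_features(attn: np.ndarray, n_context: int) -> np.ndarray:
    """attn: [heads, gen_len, src_len] attention weights for generated tokens.
    Returns one lookback ratio per head, averaged over generated positions."""
    on_context = attn[:, :, :n_context].sum(axis=-1)       # [heads, gen_len]
    ratio = on_context / attn.sum(axis=-1).clip(min=1e-9)  # share on context
    return ratio.mean(axis=-1)                             # [heads]

# Dummy training data: 40 examples of per-token attention distributions.
rng = np.random.default_rng(0)
heads, gen_len, src_len, n_ctx = 8, 16, 48, 32
attn_maps = rng.dirichlet(np.ones(src_len), size=(40, heads, gen_len))
X = np.stack([lookback_features(a, n_ctx) for a in attn_maps])
y = rng.integers(0, 2, size=40)  # placeholder faithful/hallucinated labels
detector = LogisticRegression().fit(X, y)
```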
- Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning [15.156359255401812]
We propose a targeted instruction data generation framework named DFTG that is tailored to the hallucination specificity of different models.
The experimental results on hallucination benchmarks demonstrate that the targeted instruction data generated by our method are more effective in mitigating hallucinations compared to previous datasets.
arXiv Detail & Related papers (2024-04-16T07:14:32Z)
- The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models [24.11077502209129]
Large Language Models (LLMs) have transformed the Natural Language Processing (NLP) landscape with their remarkable ability to understand and generate human-like text.
However, these models are prone to "hallucinations": outputs that do not align with factual reality or the input context.
This paper introduces the Hallucinations Leaderboard, an open initiative to quantitatively measure and compare the tendency of each model to produce hallucinations.
arXiv Detail & Related papers (2024-04-08T23:16:22Z)
- Quantity Matters: Towards Assessing and Mitigating Number Hallucination in Large Vision-Language Models [57.42800112251644]
We focus on a specific type of hallucination, number hallucination, which refers to models incorrectly identifying the number of certain objects in pictures.
We devise a training approach aimed at improving consistency to reduce number hallucinations, which leads to an 8% enhancement in performance over direct finetuning methods.
arXiv Detail & Related papers (2024-03-03T02:31:11Z)
- AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces AutoHall, a method for automatically constructing model-specific hallucination datasets based on existing fact-checking datasets.
We also propose a zero-resource and black-box hallucination detection method based on self-contradiction.
arXiv Detail & Related papers (2023-09-30T05:20:02Z)
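The zero-resource self-contradiction check from the AutoHall entry above can be sketched as sampling the model several times and measuring how often the samples contradict one another; `sample_answer` and `contradicts` are hypothetical stand-ins (e.g., an LLM API call and an NLI model), and the paper's exact procedure may differ.

```python
# Sketch of black-box self-contradiction checking: sample k answers to the
# same prompt and report the fraction of mutually contradictory pairs.
from itertools import combinations
from typing import Callable

def self_contradiction_rate(
    prompt: str,
    sample_answer: Callable[[str], str],      # stochastic model call
    contradicts: Callable[[str, str], bool],  # e.g., an NLI contradiction check
    k: int = 5,
) -> float:
    answers = [sample_answer(prompt) for _ in range(k)]
    pairs = list(combinations(answers, 2))
    return sum(contradicts(a, b) for a, b in pairs) / len(pairs)

# A claim might be flagged when, say, more than half of the pairs disagree.
```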
- Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search [54.286450484332505]
We analyze the connection between hallucinations and training data, and find evidence that models hallucinate because they train on target summaries that are unsupported by the source.
We present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
arXiv Detail & Related papers (2022-03-16T07:13:52Z)
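The constrained decoding in the PINOCCHIO entry above can be sketched as beam search with a consistency filter: candidate continuations that fail a support check against the source are pruned before the beam is re-ranked. The scoring interface and the support predicate are toy assumptions, not the paper's exact consistency test.

```python
# Toy beam search with a consistency constraint: unsupported continuations
# are pruned before re-ranking. `step` and `supported_by_source` are stubs.
import heapq
from typing import Callable

def constrained_beam_search(
    step: Callable[[list[str]], list[tuple[str, float]]],  # (token, logprob)
    supported_by_source: Callable[[str], bool],
    beam_size: int = 4,
    max_len: int = 20,
) -> list[str]:
    beams = [(0.0, [])]  # (cumulative logprob, tokens)
    for _ in range(max_len):
        expanded = []
        for score, tokens in beams:
            for token, logprob in step(tokens):
                if not supported_by_source(token):
                    continue  # prune hallucination-prone continuations
                expanded.append((score + logprob, tokens + [token]))
        if not expanded:
            break
        beams = heapq.nlargest(beam_size, expanded, key=lambda b: b[0])
    return beams[0][1]
```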
- Improving Faithfulness in Abstractive Summarization with Contrast Candidate Generation and Selection [54.38512834521367]
We study contrast candidate generation and selection as a model-agnostic post-processing technique.
We learn a discriminative correction model by generating alternative candidate summaries.
This model is then used to select the best candidate as the final output summary.
arXiv Detail & Related papers (2021-04-19T05:39:24Z)
- On Hallucination and Predictive Uncertainty in Conditional Language Generation [76.18783678114325]
Higher predictive uncertainty corresponds to a higher chance of hallucination.
Epistemic uncertainty is more indicative of hallucination than aleatoric or total uncertainties.
The proposed beam search variant trades a small amount of standard-metric performance for less hallucination.
arXiv Detail & Related papers (2021-03-28T00:32:27Z)
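The epistemic/aleatoric split in the entry above is commonly operationalized with Monte-Carlo sampling (e.g., MC dropout): total uncertainty is the entropy of the averaged predictive distribution, aleatoric uncertainty is the average per-sample entropy, and epistemic uncertainty is their difference. A sketch of that standard decomposition with placeholder data, not necessarily the paper's exact estimator:

```python
# Standard MC decomposition: epistemic = total - aleatoric.
# `probs` stands in for T stochastic forward passes (e.g., MC dropout).
import numpy as np

def entropy(p: np.ndarray, axis: int = -1) -> np.ndarray:
    return -(p * np.log(p + 1e-12)).sum(axis=axis)

rng = np.random.default_rng(0)
T, vocab = 8, 100
probs = rng.dirichlet(np.ones(vocab), size=T)  # T sampled next-token distributions

total = entropy(probs.mean(axis=0))  # entropy of the mean distribution
aleatoric = entropy(probs).mean()    # mean of the per-sample entropies
epistemic = total - aleatoric        # mutual information (BALD)
print(f"total={total:.3f} aleatoric={aleatoric:.3f} epistemic={epistemic:.3f}")
```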