KEA Explain: Explanations of Hallucinations using Graph Kernel Analysis
- URL: http://arxiv.org/abs/2507.03847v1
- Date: Sat, 05 Jul 2025 00:55:15 GMT
- Title: KEA Explain: Explanations of Hallucinations using Graph Kernel Analysis
- Authors: Reilly Haskins, Ben Adams
- Abstract summary: Large Language Models (LLMs) frequently generate hallucinations. This research presents KEA (Kernel-Enriched AI) Explain: a neurosymbolic framework that detects and explains such hallucinations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) frequently generate hallucinations: statements that are syntactically plausible but lack factual grounding. This research presents KEA (Kernel-Enriched AI) Explain: a neurosymbolic framework that detects and explains such hallucinations by comparing knowledge graphs constructed from LLM outputs with ground truth data from Wikidata or contextual documents. Using graph kernels and semantic clustering, the method provides explanations for detected hallucinations, ensuring both robustness and interpretability. Our framework achieves competitive accuracy in detecting hallucinations across both open- and closed-domain tasks, and is able to generate contrastive explanations, enhancing transparency. This research advances the reliability of LLMs in high-stakes domains and provides a foundation for future work on precision improvements and multi-source knowledge integration.
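To make the graph-kernel comparison concrete, here is a minimal sketch of the core idea: build small knowledge graphs from (subject, relation, object) triples and score their similarity with a simple Weisfeiler-Lehman subtree kernel. This is an illustrative approximation, not the authors' implementation; the example triples, the bidirectional treatment of relations, and the 0.8 decision threshold are assumptions for demonstration.
```python
from collections import Counter
from math import sqrt

def wl_label_counts(triples, iterations=2):
    """Weisfeiler-Lehman label refinement over a set of (subject, relation, object) triples."""
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    labels = {n: n for n in nodes}                    # initial label = entity name
    neighbours = {n: [] for n in nodes}
    for s, r, o in triples:
        neighbours[s].append((r, o))
        neighbours[o].append((r + "^-1", s))          # assumption: treat relations as bidirectional
    counts = Counter(labels.values())
    for _ in range(iterations):
        labels = {
            n: labels[n] + "|" + ",".join(sorted(f"{r}:{labels[m]}" for r, m in neighbours[n]))
            for n in nodes
        }
        counts.update(labels.values())
    return counts

def wl_kernel(triples_a, triples_b, iterations=2):
    """Normalised WL subtree kernel between two knowledge graphs (1.0 = structurally identical)."""
    ca = wl_label_counts(triples_a, iterations)
    cb = wl_label_counts(triples_b, iterations)
    dot = sum(ca[label] * cb[label] for label in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

# Toy example: the LLM output claims a wrong birthplace; the reference graph stands in for Wikidata.
llm_graph = {("Marie Curie", "born_in", "Paris"), ("Marie Curie", "field_of_work", "physics")}
ref_graph = {("Marie Curie", "born_in", "Warsaw"), ("Marie Curie", "field_of_work", "physics")}

similarity = wl_kernel(llm_graph, ref_graph)
if similarity < 0.8:                                  # threshold chosen for illustration only
    print(f"possible hallucination: kernel similarity = {similarity:.2f}")
```
In the full framework, a detected mismatch would additionally be passed through semantic clustering to produce a contrastive explanation (here, "born_in Paris" versus "born_in Warsaw"); the kernel score above only illustrates the detection step.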
Related papers
- KGHaluBench: A Knowledge Graph-Based Hallucination Benchmark for Evaluating the Breadth and Depth of LLM Knowledge [1.845601051662407]
Large Language Models (LLMs) possess a remarkable capacity to generate persuasive and intelligible language. Existing benchmarks are constrained by static and narrow questions, leading to limited coverage and misleading evaluations. We present KGHaluBench, a Knowledge Graph-based hallucination benchmark that assesses LLMs across the breadth and depth of their knowledge.
arXiv Detail & Related papers (2026-02-23T09:41:46Z)
- Detecting Hallucinations in Graph Retrieval-Augmented Generation via Attention Patterns and Semantic Alignment [36.45654492179688]
Graph-based Retrieval-Augmented Generation (GraphRAG) enhances Large Language Models (LLMs) with retrieved, graph-structured knowledge. However, LLMs struggle to interpret the relational and topological information in such inputs, resulting in hallucinations that are inconsistent with the retrieved knowledge. We propose two lightweight interpretability metrics: Path Reliance Degree (PRD), which measures over-reliance on shortest-path triples, and Semantic Alignment Score (SAS), which assesses how well the model's internal representations align with the retrieved knowledge.
arXiv Detail & Related papers (2025-12-09T21:52:50Z)
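The PRD/SAS metrics above operate on the model's internal attention patterns and representations, which cannot be reproduced from this summary. As a loose, surface-level stand-in for the alignment idea, one can compare an embedding of the generated answer with an embedding of the retrieved triples; the encoder checkpoint, triples, and interpretation below are assumptions, not the paper's method.
```python
# Hypothetical surface-level proxy for a semantic-alignment check in a GraphRAG setting.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")      # assumed off-the-shelf sentence encoder

answer = "The Eiffel Tower was completed in 1889 and designed by Gustave Eiffel's company."
retrieved_triples = [
    "Eiffel Tower | inception | 1889",
    "Eiffel Tower | creator | Compagnie des Etablissements Eiffel",
]

answer_vec = encoder.encode(answer, convert_to_tensor=True)
graph_vec = encoder.encode(" ; ".join(retrieved_triples), convert_to_tensor=True)
alignment = util.cos_sim(answer_vec, graph_vec).item()

print(f"alignment proxy = {alignment:.2f}")            # low scores suggest the answer drifts from the graph
```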
- Large Language Models Hallucination: A Comprehensive Survey [3.8100688074986095]
Large language models (LLMs) have transformed natural language processing, achieving remarkable performance across diverse tasks. Their impressive fluency often comes at the cost of producing false or fabricated information, a phenomenon known as hallucination. This survey provides a comprehensive review of research on hallucination in LLMs, with a focus on causes, detection, and mitigation.
arXiv Detail & Related papers (2025-10-05T20:26:38Z)
- SHALE: A Scalable Benchmark for Fine-grained Hallucination Evaluation in LVLMs [52.03164192840023]
Large Vision-Language Models (LVLMs) still suffer from hallucinations, i.e., generating content inconsistent with input or established world knowledge. We propose an automated data construction pipeline that produces scalable, controllable, and diverse evaluation data. We construct SHALE, a benchmark designed to assess both faithfulness and factuality hallucinations.
arXiv Detail & Related papers (2025-08-13T07:58:01Z)
- Theoretical Foundations and Mitigation of Hallucination in Large Language Models [0.0]
Hallucination in Large Language Models (LLMs) refers to the generation of content that is not faithful to the input or the real-world facts. This paper provides a rigorous treatment of hallucination in LLMs, including formal definitions and theoretical analyses.
arXiv Detail & Related papers (2025-07-20T15:22:34Z)
- Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models [0.0]
Hallucinations in large language models (LLMs) present a growing challenge across real-world applications. We propose a prompt-based framework to systematically trigger and quantify hallucination.
arXiv Detail & Related papers (2025-05-01T14:33:47Z)
- Towards Long Context Hallucination Detection [49.195854802543714]
Large Language Models (LLMs) have demonstrated remarkable performance across various tasks. However, they are prone to contextual hallucination, generating information that is either unsubstantiated or contradictory to the given context. We propose a novel architecture that enables pre-trained encoder models, such as BERT, to process long contexts and effectively detect contextual hallucinations.
arXiv Detail & Related papers (2025-04-28T03:47:05Z)
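The paper above proposes its own architecture for long inputs; as background only, the snippet below shows the standard sliding-window preprocessing that lets a 512-token encoder such as BERT cover a long context, with each window later scored by a hallucination classifier and the window scores aggregated. The checkpoint, stride, and example text are illustrative assumptions.
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")   # assumed encoder checkpoint

claim = "The merger was completed in March 2021."
long_context = "filler sentence about the merger. " * 2000       # stand-in for a long document

encoded = tokenizer(
    claim,
    long_context,
    max_length=512,
    truncation="only_second",          # keep the claim intact, window over the context
    stride=128,                        # overlap so evidence at window edges is not lost
    return_overflowing_tokens=True,
    padding="max_length",
    return_tensors="pt",
)
print(encoded["input_ids"].shape)      # (num_windows, 512): one encoder pass per window
```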
- HalluLens: LLM Hallucination Benchmark [49.170128733508335]
Large language models (LLMs) often generate responses that deviate from user input or training data, a phenomenon known as "hallucination". This paper introduces a comprehensive hallucination benchmark, incorporating both new extrinsic and existing intrinsic evaluation tasks.
arXiv Detail & Related papers (2025-04-24T13:40:27Z)
- Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations [82.42811602081692]
This paper introduces a subsequence association framework to systematically trace and understand hallucinations. The key insight is that hallucinations arise when dominant hallucinatory associations outweigh faithful ones. We propose a tracing algorithm that identifies causal subsequences by analyzing hallucination probabilities across randomized input contexts.
arXiv Detail & Related papers (2025-04-17T06:34:45Z)
- HalluciNot: Hallucination Detection Through Context and Common Knowledge Verification [40.69033997154463]
This paper introduces a comprehensive system for detecting hallucinations in large language model (LLM) outputs in enterprise settings. We present a novel taxonomy of LLM responses specific to hallucination in enterprise applications, categorizing them into context-based, common knowledge, enterprise-specific, and innocuous statements. Our hallucination detection model HDM-2 validates LLM responses with respect to both context and generally known facts (common knowledge).
arXiv Detail & Related papers (2025-04-09T17:39:41Z)
- Hallucination Detection: A Probabilistic Framework Using Embeddings Distance Analysis [2.089191490381739]
We introduce a mathematically sound methodology to reason about hallucination, and leverage it to build a tool to detect hallucinations. To the best of our knowledge, we are the first to show that hallucinated content has structural differences with respect to correct content. We leverage these structural differences to develop a tool to detect hallucinated responses, achieving an accuracy of 66% for a specific configuration of system parameters.
arXiv Detail & Related papers (2025-02-10T09:44:13Z)
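The probabilistic framework above is not reproducible from this summary, but a related embedding-distance heuristic is easy to sketch: sample several answers to the same prompt, embed them, and measure how far they sit from one another, on the intuition that hallucinated content tends to be less self-consistent. The encoder, sample answers, and the use of mean pairwise cosine distance are illustrative assumptions.
```python
import itertools
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")      # assumed sentence encoder

# In practice these would be several sampled generations for the same prompt.
sampled_answers = [
    "The Great Wall of China is visible from low Earth orbit under ideal conditions.",
    "The Great Wall of China cannot be seen from space with the naked eye.",
    "Astronauts have photographed the Great Wall of China from the Moon.",
]

embeddings = encoder.encode(sampled_answers, convert_to_tensor=True)
distances = [
    1.0 - util.cos_sim(embeddings[i], embeddings[j]).item()
    for i, j in itertools.combinations(range(len(sampled_answers)), 2)
]
dispersion = sum(distances) / len(distances)
print(f"mean pairwise distance = {dispersion:.2f}")    # higher dispersion = less self-consistent answers
```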
- GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework [1.9286785775296298]
We present GraphEval: a hallucination evaluation framework based on representing information in Knowledge Graph structures.
Using our approach in conjunction with state-of-the-art natural language inference (NLI) models leads to an improvement in balanced accuracy on various hallucination benchmarks.
arXiv Detail & Related papers (2024-07-15T15:11:16Z)
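In the same spirit as GraphEval's pairing of knowledge-graph triples with NLI models (a sketch under assumptions, not the framework's code), each triple extracted from an LLM output can be verbalised as a hypothesis and checked against the source context with an off-the-shelf NLI model; the checkpoint, context, and triples below are made up for illustration.
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "roberta-large-mnli"                       # assumed off-the-shelf NLI model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

context = "Ada Lovelace wrote the first published algorithm intended for Babbage's Analytical Engine."
# Triples extracted from the LLM's answer, verbalised as hypotheses.
triples = [
    ("Ada Lovelace", "wrote an algorithm for", "the Analytical Engine"),
    ("Ada Lovelace", "invented", "the telephone"),
]

for subj, rel, obj in triples:
    hypothesis = f"{subj} {rel} {obj}."
    inputs = tokenizer(context, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    label = model.config.id2label[int(probs.argmax())]  # CONTRADICTION / NEUTRAL / ENTAILMENT
    print(f"{hypothesis:<55} -> {label}")
```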
- Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models [68.91592125175787]
Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs).
We present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinations.
arXiv Detail & Related papers (2024-02-16T11:55:40Z)
- Knowledge Verification to Nip Hallucination in the Bud [69.79051730580014]
We demonstrate the feasibility of mitigating hallucinations by verifying and minimizing the inconsistency between external knowledge present in the alignment data and the intrinsic knowledge embedded within foundation LLMs.
We propose a novel approach called Knowledge Consistent Alignment (KCA), which employs a well-aligned LLM to automatically formulate assessments based on external knowledge.
We demonstrate the superior efficacy of KCA in reducing hallucinations across six benchmarks, utilizing foundation LLMs of varying backbones and scales.
arXiv Detail & Related papers (2024-01-19T15:39:49Z)
- FactCHD: Benchmarking Fact-Conflicting Hallucination Detection [64.4610684475899]
FactCHD is a benchmark designed for the detection of fact-conflicting hallucinations from LLMs.
FactCHD features a diverse dataset that spans various factuality patterns, including vanilla, multi-hop, comparison, and set operation.
We introduce Truth-Triangulator, which synthesizes reflective considerations from a tool-enhanced ChatGPT and a LoRA-tuned Llama2.
arXiv Detail & Related papers (2023-10-18T16:27:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.