JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation
- URL: http://arxiv.org/abs/2510.19310v1
- Date: Wed, 22 Oct 2025 07:15:37 GMT
- Title: JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation
- Authors: Fan Xu, Huixuan Zhang, Zhenliang Zhang, Jiahao Wang, Xiaojun Wan
- Abstract summary: Current large language models (LLMs) often suffer from hallucination issues, i.e., generating content that appears factual but is actually unreliable. In this work, we introduce JointCQ, a joint claim-and-query generation framework designed to construct an effective and efficient claim-query generator. Our framework leverages elaborately designed evaluation criteria to filter synthesized training data, and finetunes a language model for joint claim extraction and query generation.
- Score: 40.64428172310572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current large language models (LLMs) often suffer from hallucination issues, i.e., generating content that appears factual but is actually unreliable. A typical hallucination detection pipeline involves response decomposition (i.e., claim extraction), query generation, evidence collection (i.e., search or retrieval), and claim verification. However, existing methods exhibit limitations in the first two stages, such as context loss during claim extraction and low specificity in query generation, resulting in degraded performance across the hallucination detection pipeline. In this work, we introduce JointCQ (https://github.com/pku0xff/JointCQ), a joint claim-and-query generation framework designed to construct an effective and efficient claim-query generator. Our framework leverages elaborately designed evaluation criteria to filter synthesized training data, and finetunes a language model for joint claim extraction and query generation, providing reliable and informative inputs for downstream search and verification. Experimental results demonstrate that our method outperforms previous methods on multiple open-domain QA hallucination detection benchmarks, advancing the goal of more trustworthy and transparent language model systems.
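The four-stage pipeline named in the abstract (claim extraction, query generation, evidence collection, claim verification) can be illustrated with a minimal sketch in which the first two stages are merged into one joint generation step. This is not the authors' implementation: `ClaimQuery`, `Verdict`, `detect_hallucinations`, and the `toy_*` stubs are hypothetical names introduced here, and the stubs use naive string matching where a real system would use a finetuned language model, a search engine, and a verifier model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ClaimQuery:
    claim: str  # factual statement extracted from the response
    query: str  # search query meant to retrieve evidence for that claim


@dataclass
class Verdict:
    claim: str
    evidence: List[str]
    supported: bool


def detect_hallucinations(
    question: str,
    response: str,
    generate: Callable[[str, str], List[ClaimQuery]],
    search: Callable[[str], List[str]],
    verify: Callable[[str, List[str]], bool],
) -> List[Verdict]:
    """Run joint claim/query generation, then evidence collection and verification."""
    verdicts = []
    for cq in generate(question, response):
        evidence = search(cq.query)
        verdicts.append(Verdict(cq.claim, evidence, verify(cq.claim, evidence)))
    return verdicts


# --- Toy stand-ins (assumptions for illustration only) ---

TOY_CORPUS = ["Paris is the capital of France"]


def toy_generate(question: str, response: str) -> List[ClaimQuery]:
    # One claim per sentence; the query pairs the question with the claim
    # so that context from the question is not lost.
    sentences = [s.strip() for s in response.split(".") if s.strip()]
    return [ClaimQuery(claim=s, query=f"{question} {s}") for s in sentences]


def toy_search(query: str) -> List[str]:
    # Return corpus entries sharing at least one lowercase word with the query.
    words = set(query.lower().split())
    return [doc for doc in TOY_CORPUS if words & set(doc.lower().split())]


def toy_verify(claim: str, evidence: List[str]) -> bool:
    # A claim counts as supported only if it appears verbatim in some evidence.
    return any(claim.lower() in doc.lower() for doc in evidence)


results = detect_hallucinations(
    "What is the capital of France?",
    "Paris is the capital of France. Paris has ten million residents.",
    toy_generate,
    toy_search,
    toy_verify,
)
for r in results:
    print(r.claim, "->", "supported" if r.supported else "unsupported")
```

Passing the three stages as callables keeps the control flow fixed while each stub can be swapped for a real component independently.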
Related papers
- VeriCite: Towards Reliable Citations in Retrieval-Augmented Generation via Rigorous Verification [107.75781898355562]
We introduce a novel framework, called VeriCite, designed to rigorously validate supporting evidence and enhance answer attribution. We conduct experiments across five open-source LLMs and four datasets, demonstrating that VeriCite can significantly improve citation quality while maintaining the correctness of the answers.
arXiv Detail & Related papers (2025-10-13T13:38:54Z)
- Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation [108.13261761812517]
We introduce FRANQ (Faithfulness-based Retrieval Augmented UNcertainty Quantification), a novel method for hallucination detection in RAG outputs. We present a new long-form Question Answering (QA) dataset annotated for both factuality and faithfulness.
arXiv Detail & Related papers (2025-05-27T11:56:59Z)
- ORION Grounded in Context: Retrieval-Based Method for Hallucination Detection [2.2298542726382276]
"Grounded in Context" is a framework for hallucination detection. Inspired by RAG architecture, our method integrates retrieval and NLI models to predict factual consistency. Our framework identifies unsupported claims with an F1 score of 0.83 in RAGTruth's response-level classification task.
arXiv Detail & Related papers (2025-04-22T10:28:23Z)
- Don't Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning [19.30729301157088]
We propose a retrieval-based framework that identifies and addresses false premises before generation. Experiments show that this approach effectively reduces hallucinations, improves factual accuracy, and does not require access to model logits or large-scale fine-tuning.
arXiv Detail & Related papers (2025-04-08T21:14:48Z)
- HalluCounter: Reference-free LLM Hallucination Detection in the Wild! [6.5037356041929675]
HalluCounter is a reference-free hallucination detection method that utilizes both response-response and query-response consistency and alignment patterns. Our method outperforms state-of-the-art approaches by a significant margin, achieving over 90% average confidence in hallucination detection across datasets.
arXiv Detail & Related papers (2025-03-06T16:59:18Z)
- RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation [21.764973680014368]
RetroLLM is a unified framework that integrates retrieval and generation into a single, cohesive process. To mitigate false pruning in the process of constrained evidence generation, we introduce hierarchical FM-Index constraints. Experiments on five open-domain QA datasets demonstrate RetroLLM's superior performance across both in-domain and out-of-domain tasks.
arXiv Detail & Related papers (2024-12-16T16:03:25Z)
- Learning to Filter Context for Retrieval-Augmented Generation [75.18946584853316]
Generation models are required to generate outputs given partially or entirely irrelevant passages.
FILCO identifies useful context based on lexical and information-theoretic approaches.
It trains context filtering models that can filter retrieved contexts at test time.
arXiv Detail & Related papers (2023-11-14T18:41:54Z)
- AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces AutoHall, a method for automatically constructing model-specific hallucination datasets based on existing fact-checking datasets.
We also propose a zero-resource and black-box hallucination detection method based on self-contradiction.
arXiv Detail & Related papers (2023-09-30T05:20:02Z)
- Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input).
We also introduce a method for learning to detect hallucinations using pretrained language models fine-tuned on synthetic data.
arXiv Detail & Related papers (2020-11-05T00:18:53Z)