DnDScore: Decontextualization and Decomposition for Factuality Verification in Long-Form Text Generation
- URL: http://arxiv.org/abs/2412.13175v1
- Date: Tue, 17 Dec 2024 18:54:01 GMT
- Title: DnDScore: Decontextualization and Decomposition for Factuality Verification in Long-Form Text Generation
- Authors: Miriam Wanner, Benjamin Van Durme, Mark Dredze
- Abstract summary: While decomposition and decontextualization have been explored independently, their interactions in a complete system have not been investigated.
We conduct an evaluation of different decomposition, decontextualization, and verification strategies and find that the choice of strategy matters in the resulting factuality scores.
We introduce DnDScore, a decontextualization-aware verification method that validates subclaims alongside their supporting contextual information.
- Score: 48.134780006638984
- Abstract: The decompose-then-verify strategy for verification of Large Language Model (LLM) generations decomposes claims that are then independently verified. Decontextualization augments text (claims) to ensure it can be verified outside of the original context, enabling reliable verification. While decomposition and decontextualization have been explored independently, their interactions in a complete system have not been investigated. Their conflicting purposes can create tensions: decomposition isolates atomic facts while decontextualization inserts relevant information. Furthermore, a decontextualized subclaim presents a challenge to the verification step: what part of the augmented text should be verified, as it now contains multiple atomic facts? We conduct an evaluation of different decomposition, decontextualization, and verification strategies and find that the choice of strategy matters in the resulting factuality scores. Additionally, we introduce DnDScore, a decontextualization-aware verification method which validates subclaims in the context of contextual information.
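The decompose-then-verify strategy described in the abstract can be sketched as a small pipeline. This is a minimal illustration, not the paper's method: every step here is a toy rule-based stand-in (string splitting, pronoun substitution, exact-match lookup) for what a real system would implement with LLM calls, and all function names are assumptions.

```python
# Illustrative sketch of a decompose-then-verify pipeline with a
# decontextualization step. Each function is a toy stand-in; a real
# system would use an LLM for decomposition, decontextualization,
# and verification.

def decompose(claim: str) -> list[str]:
    # Toy decomposition: split a conjoined claim into atomic subclaims.
    return [part.strip() for part in claim.split(" and ")]

def decontextualize(subclaim: str, context: str) -> str:
    # Toy decontextualization: resolve the pronoun "She" using the
    # first entity mentioned in the surrounding context.
    entity = context.split()[0]
    return subclaim.replace("She", entity)

def verify(subclaim: str, evidence: set[str]) -> bool:
    # Toy verifier: a subclaim is supported if it appears verbatim
    # in the evidence set.
    return subclaim in evidence

def factuality_score(claim: str, context: str, evidence: set[str]) -> float:
    # Fraction of decontextualized subclaims supported by the evidence.
    subclaims = [decontextualize(s, context) for s in decompose(claim)]
    supported = sum(verify(s, evidence) for s in subclaims)
    return supported / len(subclaims)

context = "Curie won two Nobel Prizes."
claim = "She was born in Warsaw and She discovered radium"
evidence = {"Curie was born in Warsaw", "Curie discovered radium"}
print(factuality_score(claim, context, evidence))  # → 1.0
```

Note how the tension the abstract describes shows up even here: after decontextualization, each subclaim carries context-derived content ("Curie"), so the verifier is implicitly checking more than the original atomic fact.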
Related papers
- Towards Effective Extraction and Evaluation of Factual Claims [1.8262547855491458]
A common strategy for fact-checking long-form content generated by Large Language Models (LLMs) is extracting simple claims that can be verified independently.
We propose a framework for evaluating claim extraction in the context of fact-checking along with automated, scalable, and replicable methods for applying this framework.
We also introduce Claimify, an LLM-based claim extraction method, and demonstrate that it outperforms existing methods under our evaluation framework.
arXiv Detail & Related papers (2025-02-15T16:58:05Z) - FactLens: Benchmarking Fine-Grained Fact Verification [6.814173254027381]
We advocate for a shift toward fine-grained verification, where complex claims are broken down into smaller sub-claims for individual verification.
We introduce FactLens, a benchmark for evaluating fine-grained fact verification, with metrics and automated evaluators of sub-claim quality.
Our results show alignment between automated FactLens evaluators and human judgments, and we discuss the impact of sub-claim characteristics on the overall verification performance.
arXiv Detail & Related papers (2024-11-08T21:26:57Z) - FarFetched: Entity-centric Reasoning and Claim Validation for the Greek Language based on Textually Represented Environments [0.3874856507026475]
We address the need for automated claim validation based on the aggregated evidence derived from multiple online news sources.
We introduce an entity-centric reasoning framework in which latent connections between events, actions, or statements are revealed.
Our approach aims to fill the gap in automated claim validation for less-resourced languages.
arXiv Detail & Related papers (2024-07-13T13:30:20Z) - FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document [6.726343173201447]
In this work, we propose FIZZ, a highly effective and interpretable factual inconsistency detection metric (Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document).
We align atomic facts from the summary with the source document through adaptive expansion.
Experimental results demonstrate that our proposed factual consistency checking system significantly outperforms existing systems.
arXiv Detail & Related papers (2024-04-17T09:01:02Z) - Enhancing Argument Structure Extraction with Efficient Leverage of Contextual Information [79.06082391992545]
We propose an Efficient Context-aware model (ECASE) that fully exploits contextual information.
We introduce a sequence-attention module and distance-weighted similarity loss to aggregate contextual information and argumentative information.
Our experiments on five datasets from various domains demonstrate that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-08T08:47:10Z) - Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress on textual entailment models to address this problem for abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
arXiv Detail & Related papers (2023-05-31T21:04:04Z) - Give Me More Details: Improving Fact-Checking with Latent Retrieval [58.706972228039604]
Evidence plays a crucial role in automated fact-checking.
Existing fact-checking systems either assume the evidence sentences are given or use the search snippets returned by the search engine.
We propose to incorporate full text from source documents as evidence and introduce two enriched datasets.
arXiv Detail & Related papers (2023-05-25T15:01:19Z) - WiCE: Real-World Entailment for Claims in Wikipedia [63.234352061821625]
We propose WiCE, a new fine-grained textual entailment dataset built on natural claim and evidence pairs extracted from Wikipedia.
In addition to standard claim-level entailment, WiCE provides entailment judgments over sub-sentence units of the claim.
We show that real claims in our dataset involve challenging verification and retrieval problems that existing models fail to address.
arXiv Detail & Related papers (2023-03-02T17:45:32Z) - Generating Fact Checking Summaries for Web Claims [8.980876474818153]
We present a neural attention-based approach that learns to establish the correctness of textual claims based on evidence in the form of text documents.
We show the efficacy of our approach on datasets concerning political, healthcare, and environmental issues.
arXiv Detail & Related papers (2020-10-16T18:10:47Z) - Evaluating Factuality in Generation with Dependency-level Entailment [57.5316011554622]
We propose a new formulation of entailment that decomposes it at the level of dependency arcs.
We show that our dependency arc entailment model trained on this data can identify factual inconsistencies in paraphrasing and summarization better than sentence-level methods.
arXiv Detail & Related papers (2020-10-12T06:43:10Z)
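The dependency-level entailment formulation in the last entry decomposes a sentence into dependency arcs and checks each arc rather than the whole sentence. A toy illustration, with hand-written arcs and set membership standing in for both the parser and the trained entailment model the paper actually uses:

```python
# Toy illustration of dependency-arc-level factuality checking.
# Each sentence is represented by hand-written (head, relation,
# dependent) arcs; a real system would obtain arcs from a dependency
# parser and score entailment with a trained model, not set membership.

Arc = tuple[str, str, str]

def unsupported_arcs(source: set[Arc], generated: set[Arc]) -> set[Arc]:
    # An arc in the generation with no counterpart in the source is a
    # candidate factual inconsistency.
    return generated - source

source_arcs = {
    ("won", "nsubj", "Curie"),
    ("won", "obj", "prize"),
}
generated_arcs = {
    ("won", "nsubj", "Curie"),
    ("won", "obj", "medal"),   # hallucinated object
}
print(unsupported_arcs(source_arcs, generated_arcs))
# → {('won', 'obj', 'medal')}
```

The arc-level granularity is what lets such a method localize the inconsistency to a single head-dependent pair instead of rejecting the whole sentence.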
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.