Give Me More Details: Improving Fact-Checking with Latent Retrieval
- URL: http://arxiv.org/abs/2305.16128v2
- Date: Sat, 27 Jan 2024 16:43:52 GMT
- Title: Give Me More Details: Improving Fact-Checking with Latent Retrieval
- Authors: Xuming Hu, Junzhe Chen, Zhijiang Guo, Philip S. Yu
- Abstract summary: Evidence plays a crucial role in automated fact-checking.
Existing fact-checking systems either assume the evidence sentences are given or use the search snippets returned by the search engine.
We propose to incorporate full text from source documents as evidence and introduce two enriched datasets.
- Score: 58.706972228039604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Evidence plays a crucial role in automated fact-checking. When verifying
real-world claims, existing fact-checking systems either assume the evidence
sentences are given or use the search snippets returned by the search engine.
Such methods ignore the challenges of collecting evidence and may not provide
sufficient information to verify real-world claims. Aiming at building a better
fact-checking system, we propose to incorporate full text from source documents
as evidence and introduce two enriched datasets. The first one is a
multilingual dataset, while the second one is monolingual (English). We further
develop a latent variable model to jointly extract evidence sentences from
documents and perform claim verification. Experiments indicate that including
source documents can provide sufficient contextual clues even when gold
evidence sentences are not annotated. The proposed system is able to achieve
significant improvements upon best-reported models under different settings.
Related papers
- RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict [34.2739191920746]
High-quality evidence plays a vital role in enhancing fact-checking systems.
We propose a method based on a Large Language Model to automatically retrieve and summarize evidence from the Web.
We construct RU22Fact, a novel explainable fact-checking dataset of 16K samples on the 2022 Russia-Ukraine conflict.
arXiv Detail & Related papers (2024-03-25T11:56:29Z)
- EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification [22.785622371421876]
We present a pioneering dataset for multi-hop explainable fact verification.
The dataset contains over 60,000 claims involving 2-hop and 3-hop reasoning, each created by summarizing and modifying information from hyperlinked Wikipedia documents.
We demonstrate a novel baseline system on our EX-FEVER dataset, showcasing document retrieval, explanation generation, and claim verification.
arXiv Detail & Related papers (2023-10-15T06:46:15Z)
- Complex Claim Verification with Evidence Retrieved in the Wild [73.19998942259073]
We present the first fully automated pipeline to check real-world claims by retrieving raw evidence from the web.
Our pipeline includes five components: claim decomposition, raw document retrieval, fine-grained evidence retrieval, claim-focused summarization, and veracity judgment.
arXiv Detail & Related papers (2023-05-19T17:49:19Z)
- Read it Twice: Towards Faithfully Interpretable Fact Verification by Revisiting Evidence [59.81749318292707]
We propose a fact verification model named ReRead to retrieve evidence and verify claims.
The proposed system is able to achieve significant improvements upon best-reported models under different settings.
arXiv Detail & Related papers (2023-05-02T03:23:14Z)
- GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidence in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
- Synthetic Disinformation Attacks on Automated Fact Verification Systems [53.011635547834025]
We explore the sensitivity of automated fact-checkers to synthetic adversarial evidence in two simulated settings.
We show that these systems suffer significant performance drops against these attacks.
We discuss the growing threat of modern NLG systems as generators of disinformation.
arXiv Detail & Related papers (2022-02-18T19:01:01Z)
- DialFact: A Benchmark for Fact-Checking in Dialogue [56.63709206232572]
We construct DialFact, a benchmark dataset of 22,245 annotated conversational claims, paired with pieces of evidence from Wikipedia.
We find that existing fact-checking models trained on non-dialogue data like FEVER fail to perform well on our task.
We propose a simple yet data-efficient solution to effectively improve fact-checking performance in dialogue.
arXiv Detail & Related papers (2021-10-15T17:34:35Z)
- Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification [5.836068916903788]
Hierarchical Evidence Set Modeling (HESM) is a framework that extracts evidence sets and classifies a claim as supported, refuted, or not enough info.
Our experimental results show that HESM outperforms 7 state-of-the-art methods for fact extraction and claim verification.
arXiv Detail & Related papers (2020-10-10T22:27:17Z)