FaVIQ: FAct Verification from Information-seeking Questions
- URL: http://arxiv.org/abs/2107.02153v1
- Date: Mon, 5 Jul 2021 17:31:44 GMT
- Title: FaVIQ: FAct Verification from Information-seeking Questions
- Authors: Jungsoo Park, Sewon Min, Jaewoo Kang, Luke Zettlemoyer, Hannaneh Hajishirzi
- Abstract summary: We construct a large-scale fact verification dataset called FaVIQ using information-seeking questions posed by real users.
Our claims are verified to be natural, contain little lexical bias, and require a complete understanding of the evidence for verification.
- Score: 77.7067957445298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite significant interest in developing general purpose fact checking
models, it is challenging to construct a large-scale fact verification dataset
with realistic claims that would occur in the real world. Existing claims are
either authored by crowdworkers, thereby introducing subtle biases that are
difficult to control for, or manually verified by professional fact checkers,
causing them to be expensive and limited in scale. In this paper, we construct
a challenging, realistic, and large-scale fact verification dataset called
FaVIQ, using information-seeking questions posed by real users who do not know
the answer. The ambiguity in information-seeking questions enables the
automatic construction of true and false claims that reflect confusions arising
among users (e.g., the year a movie was filmed vs. the year it was released). Our
claims are verified to be natural, contain little lexical bias, and require a
complete understanding of the evidence for verification. Our experiments show
that the state-of-the-art models are far from solving our new task. Moreover,
training on our data helps in professional fact-checking, outperforming models
trained on the most widely used dataset FEVER or in-domain data by up to 17%
absolute. Altogether, our data will serve as a challenging benchmark for
natural language understanding and support future progress in professional fact
checking.
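The claim-construction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline; it only assumes that an ambiguous question has several interpretations, each with its own answer, and that pairing an interpretation with its own answer yields a true claim while pairing it with another interpretation's answer yields a false one. The `Interpretation` class, the `build_claims` function, the claim templates, and the years are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Interpretation:
    """One reading of an ambiguous question (hypothetical structure)."""
    statement: str  # claim template containing an "{answer}" placeholder
    answer: str     # answer that is correct under this reading


def build_claims(interpretations):
    """Return (claim, label) pairs built from the readings of one question.

    True claims pair a reading with its own answer. False claims pair a
    reading with the answer of a different reading, mimicking a user who
    confuses the two readings.
    """
    claims = []
    for interp in interpretations:
        claims.append((interp.statement.format(answer=interp.answer), True))
    for a, b in permutations(interpretations, 2):
        if a.answer != b.answer:  # skip swaps that would still be true
            claims.append((a.statement.format(answer=b.answer), False))
    return claims


# Made-up example data: a placeholder movie with invented years.
interpretations = [
    Interpretation("The movie was filmed in {answer}.", "2009"),
    Interpretation("The movie was released in {answer}.", "2010"),
]

claims = build_claims(interpretations)
for claim, label in claims:
    print(label, claim)
```

With two readings, the sketch produces two true claims and two false claims; the false ones swap the filming and release years, which is the kind of confusion the abstract mentions.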
Related papers
- How We Refute Claims: Automatic Fact-Checking through Flaw Identification and Explanation [4.376598435975689]
This paper explores the novel task of flaw-oriented fact-checking, including aspect generation and flaw identification.
We also introduce RefuteClaim, a new framework designed specifically for this task.
Given the absence of an existing dataset, we present FlawCheck, a dataset created by extracting and transforming insights from expert reviews into relevant aspects and identified flaws.
arXiv Detail & Related papers (2024-01-27T06:06:16Z)
- EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification [22.785622371421876]
We present a pioneering dataset for multi-hop explainable fact verification.
The dataset contains over 60,000 claims involving 2-hop and 3-hop reasoning, each created by summarizing and modifying information from hyperlinked Wikipedia documents.
We demonstrate a novel baseline system on our EX-FEVER dataset, showcasing document retrieval, explanation generation, and claim verification.
arXiv Detail & Related papers (2023-10-15T06:46:15Z)
- FactLLaMA: Optimizing Instruction-Following Language Models with External Knowledge for Automated Fact-Checking [10.046323978189847]
We propose combining the power of instruction-following language models with external evidence retrieval to enhance fact-checking performance.
Our approach involves leveraging search engines to retrieve relevant evidence for a given input claim.
Then, we instruction-tune an open-source language model, LLaMA, using this evidence, enabling it to predict the veracity of the input claim more accurately.
arXiv Detail & Related papers (2023-09-01T04:14:39Z)
- Mitigating Temporal Misalignment by Discarding Outdated Facts [58.620269228776294]
Large language models are often used under temporal misalignment, tasked with answering questions about the present.
We propose fact duration prediction: the task of predicting how long a given fact will remain true.
Our data and code are released publicly at https://github.com/mikejqzhang/mitigating_misalignment.
arXiv Detail & Related papers (2023-05-24T07:30:08Z)
- WiCE: Real-World Entailment for Claims in Wikipedia [63.234352061821625]
We propose WiCE, a new fine-grained textual entailment dataset built on natural claim and evidence pairs extracted from Wikipedia.
In addition to standard claim-level entailment, WiCE provides entailment judgments over sub-sentence units of the claim.
We show that real claims in our dataset involve challenging verification and retrieval problems that existing models fail to address.
arXiv Detail & Related papers (2023-03-02T17:45:32Z)
- Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation [67.69725605939315]
Misinformation emerges in times of uncertainty when credible information is limited.
This is challenging for NLP-based fact-checking as it relies on counter-evidence, which may not yet be available.
arXiv Detail & Related papers (2022-10-25T09:40:48Z)
- Generating Literal and Implied Subquestions to Fact-check Complex Claims [64.81832149826035]
We focus on decomposing a complex claim into a comprehensive set of yes-no subquestions whose answers influence the veracity of the claim.
We present ClaimDecomp, a dataset of decompositions for over 1000 claims.
We show that these subquestions can help identify relevant evidence to fact-check the full claim and derive the veracity through their answers.
arXiv Detail & Related papers (2022-05-14T00:40:57Z)
- Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.