HoVer: A Dataset for Many-Hop Fact Extraction And Claim Verification
- URL: http://arxiv.org/abs/2011.03088v2
- Date: Mon, 16 Nov 2020 01:57:39 GMT
- Title: HoVer: A Dataset for Many-Hop Fact Extraction And Claim Verification
- Authors: Yichen Jiang, Shikha Bordia, Zheng Zhong, Charles Dognin, Maneesh
Singh, Mohit Bansal
- Abstract summary: HoVer is a dataset for many-hop evidence extraction and fact verification.
It challenges models to extract facts from several Wikipedia articles that are relevant to a claim.
Most of the 3/4-hop claims are written in multiple sentences, which adds to the complexity of understanding long-range dependency relations.
- Score: 74.66819506353086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce HoVer (HOppy VERification), a dataset for many-hop evidence
extraction and fact verification. It challenges models to extract facts from
several Wikipedia articles that are relevant to a claim and classify whether
the claim is Supported or Not-Supported by the facts. In HoVer, the claims
require evidence to be extracted from as many as four English Wikipedia
articles and embody reasoning graphs of diverse shapes. Moreover, most of the
3/4-hop claims are written in multiple sentences, which adds to the complexity
of understanding long-range dependency relations such as coreference. We show
that the performance of an existing state-of-the-art semantic-matching model
degrades significantly on our dataset as the number of reasoning hops
increases, hence demonstrating the necessity of many-hop reasoning to achieve
strong results. We hope that the introduction of this challenging dataset and
the accompanying evaluation task will encourage research in many-hop fact
retrieval and information verification. We make the HoVer dataset publicly
available at https://hover-nlp.github.io
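To make the evidence-extraction-plus-verification task concrete, the sketch below inspects a local copy of a HoVer split and runs a trivial majority-class baseline. It is a minimal illustration, not the official loader: the file path, the field names (claim, label, supporting_facts, num_hops), and the label strings are assumptions based on the task description above and may differ from the released format on https://hover-nlp.github.io.

```python
import json
from collections import Counter

# Hypothetical local copy of a HoVer split; the path and the field names
# below (claim, label, supporting_facts, num_hops) are assumptions for
# illustration, not the guaranteed release format.
with open("hover_dev.json", encoding="utf-8") as f:
    examples = json.load(f)

# Each example pairs a claim with the Wikipedia sentences that must be
# retrieved (from up to four articles) and a Supported / Not-Supported label.
hop_counts = Counter(ex["num_hops"] for ex in examples)
label_counts = Counter(ex["label"] for ex in examples)

print("claims per hop count:", dict(hop_counts))
print("label distribution:", dict(label_counts))

# A majority-class baseline for the verification sub-task only,
# useful as a sanity check before evaluating a real retrieval + verification model.
majority_label, _ = label_counts.most_common(1)[0]
accuracy = sum(ex["label"] == majority_label for ex in examples) / len(examples)
print(f"majority-class accuracy: {accuracy:.3f}")
```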
Related papers
- Contrastive Learning to Improve Retrieval for Real-world Fact Checking [84.57583869042791]
We present Contrastive Fact-Checking Reranker (CFR), an improved retriever for fact-checking complex claims.
We leverage the AVeriTeC dataset, which annotates subquestions for claims with human-written answers from evidence documents.
We find a 6% improvement in veracity classification accuracy on the dataset.
arXiv Detail & Related papers (2024-10-07T00:09:50Z)
- EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification [22.785622371421876]
We present a pioneering dataset for multi-hop explainable fact verification.
It contains over 60,000 claims involving 2-hop and 3-hop reasoning, each created by summarizing and modifying information from hyperlinked Wikipedia documents.
We demonstrate a novel baseline system on our EX-FEVER dataset, showcasing document retrieval, explanation generation, and claim verification.
arXiv Detail & Related papers (2023-10-15T06:46:15Z)
- Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering [71.49131159045811]
Multi-hop reasoning requires aggregating multiple documents to answer a complex question.
Existing methods usually decompose the multi-hop question into simpler single-hop questions.
We propose an interpretable stepwise reasoning framework to incorporate both single-hop supporting sentence identification and single-hop question generation.
arXiv Detail & Related papers (2022-08-22T13:24:25Z)
- Generating Literal and Implied Subquestions to Fact-check Complex Claims [64.81832149826035]
We focus on decomposing a complex claim into a comprehensive set of yes-no subquestions whose answers influence the veracity of the claim.
We present ClaimDecomp, a dataset of decompositions for over 1000 claims.
We show that these subquestions can help identify relevant evidence to fact-check the full claim and derive the veracity through their answers.
arXiv Detail & Related papers (2022-05-14T00:40:57Z)
- DialFact: A Benchmark for Fact-Checking in Dialogue [56.63709206232572]
We construct DialFact, a benchmark dataset of 22,245 annotated conversational claims, paired with pieces of evidence from Wikipedia.
We find that existing fact-checking models trained on non-dialogue data like FEVER fail to perform well on our task.
We propose a simple yet data-efficient solution to effectively improve fact-checking performance in dialogue.
arXiv Detail & Related papers (2021-10-15T17:34:35Z)
- Few Shot Learning for Information Verification [0.0]
In this research, we aim to verify facts based on evidence selected from a list of articles taken from Wikipedia.
arXiv Detail & Related papers (2021-02-22T12:56:12Z)
- Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps [31.472490306390977]
A multi-hop question answering dataset aims to test reasoning and inference skills by requiring a model to read multiple paragraphs to answer a given question.
Previous studies revealed that many examples in existing multi-hop datasets do not require multi-hop reasoning to answer a question.
We present a new multi-hop QA dataset, called 2WikiMultiHopQA, which uses structured and unstructured data.
arXiv Detail & Related papers (2020-11-02T15:42:40Z)
- Multi-Hop Fact Checking of Political Claims [43.25708842000248]
We study more complex claim verification of naturally occurring claims with multiple hops over interconnected evidence chunks.
We construct a small annotated dataset, PolitiHop, of evidence sentences for claim verification.
We find that the task is complex and achieve the best performance with an architecture that specifically models reasoning over evidence pieces.
arXiv Detail & Related papers (2020-09-10T13:54:15Z)