A Review on Fact Extraction and Verification
- URL: http://arxiv.org/abs/2010.03001v5
- Date: Fri, 19 Nov 2021 14:42:58 GMT
- Title: A Review on Fact Extraction and Verification
- Authors: Giannis Bekoulis, Christina Papagiannopoulou, Nikos Deligiannis
- Abstract summary: We study the fact checking problem, which aims to identify the veracity of a given claim.
We focus on the task of Fact Extraction and VERification (FEVER) and its accompanying dataset.
This task is essential and can be the building block of applications such as fake news detection and medical claim verification.
- Score: 19.373340472113703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the fact checking problem, which aims to identify the veracity of a
given claim. Specifically, we focus on the task of Fact Extraction and
VERification (FEVER) and its accompanying dataset. The task consists of the
subtasks of retrieving the relevant documents (and sentences) from Wikipedia
and validating whether the information in the documents supports or refutes a
given claim. This task is essential and can be the building block of
applications such as fake news detection and medical claim verification. In
this paper, we aim at a better understanding of the challenges of the task by
presenting the literature in a structured and comprehensive way. We describe
the proposed methods by analyzing the technical perspectives of the different
approaches and discussing the performance results on the FEVER dataset, which
is the most well-studied and formally structured dataset on the fact extraction
and verification task. We also conduct the largest experimental study to date
on identifying beneficial loss functions for the sentence retrieval component.
Our analysis indicates that sampling negative sentences is important for
improving the performance and decreasing the computational complexity. Finally,
we describe open issues and future challenges, and we motivate future research
in the task.
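To make the sentence retrieval point concrete, here is a minimal sketch of training a retrieval scorer with sampled negative sentences: the loss is computed on the gold evidence sentence plus a few randomly sampled non-evidence sentences instead of on every candidate. The `SentenceScorer` model, the hashing featurizer, and the pointwise binary cross-entropy loss are illustrative assumptions for this sketch, not the survey's actual experimental setup, which builds on pretrained encoders and compares several loss functions.

```python
# Minimal sketch (assumed setup, not the survey's code): sentence retrieval
# for FEVER-style fact checking, trained with sampled negative sentences.
import random
import torch
import torch.nn as nn

class SentenceScorer(nn.Module):
    """Scores how relevant a candidate sentence is to a claim (toy model)."""
    def __init__(self, vocab_size: int, dim: int = 64):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, dim)   # mean-pools token ids
        self.out = nn.Linear(2 * dim, 1)              # [claim ; sentence] -> relevance logit

    def forward(self, claim_ids, sent_ids):
        c = self.emb(claim_ids)
        s = self.emb(sent_ids)
        return self.out(torch.cat([c, s], dim=-1)).squeeze(-1)

def sample_negatives(num_candidates, gold_idx, k):
    """Pick k non-evidence sentences instead of scoring every one of them."""
    pool = [i for i in range(num_candidates) if i not in gold_idx]
    return random.sample(pool, min(k, len(pool)))

VOCAB = 1000
def encode(text):
    # Toy featurizer: hash tokens into a fixed vocabulary (illustrative only).
    return torch.tensor([[hash(tok) % VOCAB for tok in text.lower().split()]])

model = SentenceScorer(VOCAB)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()          # pointwise loss; ranking losses are another option

claim = "Rome is the capital of Italy."
candidates = [
    "Rome is the capital and largest city of Italy.",            # gold evidence
    "Rome was founded in 753 BC according to legend.",
    "Italy borders France, Switzerland, Austria and Slovenia.",
    "The Colosseum is an amphitheatre in the centre of Rome.",
]
gold = {0}

for step in range(10):
    negatives = sample_negatives(len(candidates), gold, k=2)     # downsampled negatives
    batch = sorted(gold) + negatives
    labels = torch.tensor([1.0 if i in gold else 0.0 for i in batch])
    scores = torch.cat([model(encode(claim), encode(candidates[i])) for i in batch])
    loss = loss_fn(scores, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The same loop also illustrates the computational argument: with the number of sampled negatives fixed, the per-claim training cost no longer grows with the number of retrieved candidate sentences.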
Related papers
- Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
- Augmenting the Veracity and Explanations of Complex Fact Checking via Iterative Self-Revision with LLMs [10.449165630417522]
We construct two complex fact-checking datasets for Chinese scenarios: CHEF-EG and TrendFact.
These datasets involve complex facts in areas such as health, politics, and society.
We propose a unified framework called FactISR to perform mutual feedback between veracity and explanations.
arXiv Detail & Related papers (2024-10-19T15:25:19Z)
- How We Refute Claims: Automatic Fact-Checking through Flaw Identification and Explanation [4.376598435975689]
This paper explores the novel task of flaw-oriented fact-checking, including aspect generation and flaw identification.
We also introduce RefuteClaim, a new framework designed specifically for this task.
Given the absence of an existing dataset, we present FlawCheck, a dataset created by extracting and transforming insights from expert reviews into relevant aspects and identified flaws.
arXiv Detail & Related papers (2024-01-27T06:06:16Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can succeed on classification tasks with little or even non-overlapping annotation.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Enhancing Argument Structure Extraction with Efficient Leverage of Contextual Information [79.06082391992545]
We propose an Efficient Context-aware model (ECASE) that fully exploits contextual information.
We introduce a sequence-attention module and distance-weighted similarity loss to aggregate contextual information and argumentative information.
Our experiments on five datasets from various domains demonstrate that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-08T08:47:10Z)
- Benchmarking the Generation of Fact Checking Explanations [19.363672064425504]
We focus on the generation of justifications (textual explanation of why a claim is classified as either true or false) and benchmark it with novel datasets and advanced baselines.
Results show that, for justification production, summarization benefits from the claim information.
Although cross-dataset experiments suffer from performance degradation, a single model trained on a combination of the two datasets retains style information efficiently.
arXiv Detail & Related papers (2023-08-29T10:40:46Z)
- Towards Understanding Omission in Dialogue Summarization [45.932368303107104]
Previous work indicated that omission is a major factor affecting summarization quality.
We propose the OLDS dataset, which provides high-quality Omission Labels for Dialogue Summarization.
arXiv Detail & Related papers (2022-11-14T06:56:59Z)
- DialFact: A Benchmark for Fact-Checking in Dialogue [56.63709206232572]
We construct DialFact, a benchmark dataset of 22,245 annotated conversational claims, paired with pieces of evidence from Wikipedia.
We find that existing fact-checking models trained on non-dialogue data like FEVER fail to perform well on our task.
We propose a simple yet data-efficient solution to effectively improve fact-checking performance in dialogue.
arXiv Detail & Related papers (2021-10-15T17:34:35Z)
- Abstract, Rationale, Stance: A Joint Model for Scientific Claim Verification [18.330265729989843]
We propose an approach, named ARSJoint, that jointly learns the modules for the three tasks using a machine reading comprehension framework.
The experimental results on the benchmark dataset SciFact show that our approach outperforms the existing works.
arXiv Detail & Related papers (2021-09-13T10:07:26Z)
- Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
- Generating Fact Checking Explanations [52.879658637466605]
A crucial piece of the puzzle that is still missing is how to automate the most elaborate part of the process: producing a justification for a claim's verdict.
This paper provides the first study of how these explanations can be generated automatically based on available claim context.
Our results indicate that optimising both objectives at the same time, rather than training them separately, improves the performance of a fact checking system.
arXiv Detail & Related papers (2020-04-13T05:23:25Z)