FarFetched: Entity-centric Reasoning and Claim Validation for the Greek Language based on Textually Represented Environments
- URL: http://arxiv.org/abs/2407.09888v1
- Date: Sat, 13 Jul 2024 13:30:20 GMT
- Title: FarFetched: Entity-centric Reasoning and Claim Validation for the Greek Language based on Textually Represented Environments
- Authors: Dimitris Papadopoulos, Katerina Metropoulou, Nikolaos Matsatsinis, Nikolaos Papadakis,
- Abstract summary: We address the need for automated claim validation based on the aggregated evidence derived from multiple online news sources.
We introduce an entity-centric reasoning framework in which latent connections between events, actions, or statements are revealed.
Our approach tries to fill the gap in automated claim validation for less-resourced languages.
- Score: 0.3874856507026475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Our collective attention span is shortened by the flood of online information. With \textit{FarFetched}, we address the need for automated claim validation based on the aggregated evidence derived from multiple online news sources. We introduce an entity-centric reasoning framework in which latent connections between events, actions, or statements are revealed via entity mentions and represented in a graph database. Using entity linking and semantic similarity, we offer a way for collecting and combining information from diverse sources in order to generate evidence relevant to the user's claim. Then, we leverage textual entailment recognition to quantitatively determine whether this assertion is credible, based on the created evidence. Our approach tries to fill the gap in automated claim validation for less-resourced languages and is showcased on the Greek language, complemented by the training of relevant semantic textual similarity (STS) and natural language inference (NLI) models that are evaluated on translated versions of common benchmarks.
Related papers
- Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We deem that retrieval-augmented language models have the inherent capabilities of supplying response according to both contextual and parametric knowledge.
Inspired by aligning language models with human preference, we take the first step towards aligning retrieval-augmented language models to a status where it responds relying merely on the external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z) - Context versus Prior Knowledge in Language Models [49.17879668110546]
Language models often need to integrate prior knowledge learned during pretraining and new information presented in context.
We propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity.
arXiv Detail & Related papers (2024-04-06T13:46:53Z) - Unsupervised Pretraining for Fact Verification by Language Model
Distillation [4.504050940874427]
We propose SFAVEL (Self-supervised Fact Verification via Language Model Distillation), a novel unsupervised pretraining framework.
It distils self-supervised features into high-quality claim-fact alignments without the need for annotations.
This is enabled by a novel contrastive loss function that encourages features to attain high-quality claim and evidence alignments.
arXiv Detail & Related papers (2023-09-28T15:53:44Z) - Natural Language Decompositions of Implicit Content Enable Better Text
Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account.
We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed.
Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z) - Contextual information integration for stance detection via
cross-attention [59.662413798388485]
Stance detection deals with identifying an author's stance towards a target.
Most existing stance detection models are limited because they do not consider relevant contextual information.
We propose an approach to integrate contextual information as text.
arXiv Detail & Related papers (2022-11-03T15:04:29Z) - Improve Discourse Dependency Parsing with Contextualized Representations [28.916249926065273]
We propose to take advantage of transformers to encode contextualized representations of units of different levels.
Motivated by the observation of writing patterns commonly shared across articles, we propose a novel method that treats discourse relation identification as a sequence labelling task.
arXiv Detail & Related papers (2022-05-04T14:35:38Z) - Graph-based Retrieval for Claim Verification over Cross-Document
Evidence [0.6853165736531939]
We conjecture that a graph-based approach can be beneficial to identify fragmented evidence.
We tested this hypothesis by building, over the whole corpus, a large graph that interconnects text portions by means of mentioned entities.
Our experiments show that leveraging on a graph structure is beneficial in identifying a reasonably small portion of passages related to a claim.
arXiv Detail & Related papers (2021-09-13T14:54:26Z) - Claim Matching Beyond English to Scale Global Fact-Checking [5.836354423653351]
We construct a novel dataset of WhatsApp tipline and public group messages alongside fact-checked claims.
Our dataset contains content in high-resource (English, Hindi) and lower-resource (Bengali, Malayalam, Tamil) languages.
We train our own embedding model using knowledge distillation and a high-quality "teacher" model in order to address the imbalance in embedding quality between the low- and high-resource languages.
arXiv Detail & Related papers (2021-06-01T23:28:05Z) - Evidence-Aware Inferential Text Generation with Vector Quantised
Variational AutoEncoder [104.25716317141321]
We propose an approach that automatically finds evidence for an event from a large text corpus, and leverages the evidence to guide the generation of inferential texts.
Our approach provides state-of-the-art performance on both Event2Mind and ATOMIC datasets.
arXiv Detail & Related papers (2020-06-15T02:59:52Z) - Commonsense Evidence Generation and Injection in Reading Comprehension [57.31927095547153]
We propose a Commonsense Evidence Generation and Injection framework in reading comprehension, named CEGI.
The framework injects two kinds of auxiliary commonsense evidence into comprehensive reading to equip the machine with the ability of rational thinking.
Experiments on the CosmosQA dataset demonstrate that the proposed CEGI model outperforms the current state-of-the-art approaches.
arXiv Detail & Related papers (2020-05-11T16:31:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.