Scalable Fact-checking with Human-in-the-Loop
- URL: http://arxiv.org/abs/2109.10992v1
- Date: Wed, 22 Sep 2021 19:19:59 GMT
- Title: Scalable Fact-checking with Human-in-the-Loop
- Authors: Jing Yang, Didier Vega-Oliveros, Tais Seibt and Anderson Rocha
- Abstract summary: To accelerate fact-checking, we group similar messages and summarize them into aggregated claims.
The results show the potential to speed up the fact-checking process by organizing and selecting representative claims from a massive volume of disorganized, redundant messages.
- Score: 17.1138216746642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Researchers have been investigating automated solutions for fact-checking on
a variety of fronts. However, current approaches often overlook the fact that
the amount of information released every day is escalating, and much of it
overlaps. Intending to accelerate fact-checking, we bridge this gap by
grouping similar messages and summarizing them into aggregated claims.
Specifically, we first clean a set of social media posts (e.g., tweets) and
build a graph of all posts based on their semantics; then, we apply two
clustering methods to group the messages for further claim summarization. We
evaluate the summaries both quantitatively with ROUGE scores and qualitatively
with human evaluation. We also generate a graph of summaries to verify that
there is no significant overlap among them. Our approach reduced 28,818 original
messages to 700 summary claims, showing the potential to speed up the
fact-checking process by organizing and selecting representative claims from
massive disorganized and redundant messages.
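
The clean, graph, cluster, and summarize steps described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' method: token-level Jaccard overlap stands in for the semantic similarity used in the paper, connected components of the thresholded graph stand in for the two clustering methods (which the abstract does not name), and picking the most central message of each group stands in for claim summarization. The function names and the `threshold` parameter are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Clean a post: lowercase, drop URLs and @mentions, return a set of tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return set(re.findall(r"[a-z0-9']+", text))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a stand-in for semantic similarity."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_messages(messages, threshold=0.5):
    """Link posts whose similarity meets the threshold, then return the
    connected components of that graph (computed with union-find)."""
    tokens = [tokenize(m) for m in messages]
    parent = list(range(len(messages)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(messages)):
        for j in range(i + 1, len(messages)):
            if jaccard(tokens[i], tokens[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(messages)):
        groups[find(i)].append(i)
    return list(groups.values()), tokens


def representative(group, tokens):
    """Index of the member most similar, in total, to the rest of its group."""
    return max(
        group,
        key=lambda i: sum(jaccard(tokens[i], tokens[j]) for j in group if j != i),
    )


def summarize(messages, threshold=0.5):
    """Return one representative message per cluster."""
    groups, tokens = cluster_messages(messages, threshold)
    return [messages[representative(g, tokens)] for g in groups]


if __name__ == "__main__":
    posts = [
        "Vaccines cause autism says new study http://x.co/1",
        "@bob new study says vaccines cause autism",
        "The earth is flat and NASA hides it",
        "NASA hides that the earth is flat",
    ]
    for claim in summarize(posts):
        print(claim)
```

The pairwise comparison is quadratic in the number of posts, so a collection on the scale reported in the abstract (28,818 messages) would likely need an approximate nearest-neighbor index or blocking step before building the graph.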
Related papers
- Incremental Extractive Opinion Summarization Using Cover Trees [81.59625423421355]
In online marketplaces user reviews accumulate over time, and opinion summaries need to be updated periodically.
In this work, we study the task of extractive opinion summarization in an incremental setting.
We present an efficient algorithm for accurately computing the CentroidRank summaries in an incremental setting.
arXiv Detail & Related papers (2024-01-16T02:00:17Z)
- From Chaos to Clarity: Claim Normalization to Empower Fact-Checking [57.024192702939736]
Claim Normalization (aka ClaimNorm) aims to decompose complex and noisy social media posts into more straightforward and understandable forms.
We propose CACN, a pioneering approach that leverages chain-of-thought and claim check-worthiness estimation.
Our experiments demonstrate that CACN outperforms several baselines across various evaluation measures.
arXiv Detail & Related papers (2023-10-22T16:07:06Z)
- USB: A Unified Summarization Benchmark Across Tasks and Domains [68.82726887802856]
We introduce a Wikipedia-derived benchmark, complemented by a rich set of crowd-sourced annotations, that supports 8 interrelated tasks.
We compare various methods on this benchmark and discover that on multiple tasks, moderately-sized fine-tuned models consistently outperform much larger few-shot prompted language models.
arXiv Detail & Related papers (2023-05-23T17:39:54Z)
- Harnessing Abstractive Summarization for Fact-Checked Claim Detection [8.49182897482236]
Social media platforms have become new battlegrounds for anti-social elements, with misinformation being the weapon of choice.
We believe that the solution lies in partial automation of the fact-checking life cycle, saving human time for tasks which require high cognition.
We propose a new workflow for efficiently detecting previously fact-checked claims that uses abstractive summarization to generate crisp queries.
arXiv Detail & Related papers (2022-09-10T07:32:36Z)
- Improved Topic modeling in Twitter through Community Pooling [0.0]
Twitter posts are short and often less coherent than other text documents.
We propose a new pooling scheme for topic modeling in Twitter, which groups tweets whose authors belong to the same community.
Results show that our community pooling method outperformed other methods on the majority of metrics in two heterogeneous datasets.
arXiv Detail & Related papers (2021-12-20T17:05:32Z)
- AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization [73.91543616777064]
Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions.
One goal of answer summarization is to produce a summary that reflects the range of answer perspectives.
This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists.
arXiv Detail & Related papers (2021-11-11T21:48:02Z)
- Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as, or better than, traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
- Word Embedding-based Text Processing for Comprehensive Summarization and Distinct Information Extraction [1.552282932199974]
We propose two automated text processing frameworks specifically designed to analyze online reviews.
The first framework summarizes the review dataset by extracting essential sentences.
The second framework is based on a question-answering neural network model trained to extract answers to multiple different questions.
arXiv Detail & Related papers (2020-04-21T02:43:31Z)
- Generating Fact Checking Explanations [52.879658637466605]
A crucial piece of the puzzle that is still missing is to understand how to automate the most elaborate part of the process.
This paper provides the first study of how these explanations can be generated automatically based on available claim context.
Our results indicate that optimising both objectives at the same time, rather than training them separately, improves the performance of a fact checking system.
arXiv Detail & Related papers (2020-04-13T05:23:25Z)