Related papers: Claim Detection for Automated Fact-checking: A Survey on Monolingual, Multilingual and Cross-Lingual Research

URL: http://arxiv.org/abs/2401.11969v3
Date: Mon, 18 Mar 2024 16:49:59 GMT
Title: Claim Detection for Automated Fact-checking: A Survey on Monolingual, Multilingual and Cross-Lingual Research
Authors: Rrubaa Panchendrarajan, Arkaitz Zubiaga
Abstract summary: We present state-of-the-art multilingual claim detection research categorized into three key factors of the problem, verifiability, priority, and similarity. We present a detailed overview of the existing multilingual datasets along with the challenges and suggest possible future advancements.
Score: 7.242609314791262
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automated fact-checking has drawn considerable attention over the past few decades due to the increase in the diffusion of misinformation on online platforms. This is often carried out as a sequence of tasks comprising (i) the detection of sentences circulating in online platforms which constitute claims needing verification, followed by (ii) the verification process of those claims. This survey focuses on the former, by discussing existing efforts towards detecting claims needing fact-checking, with a particular focus on multilingual data and methods. This is a challenging and fertile direction where existing methods are yet far from matching human performance due to the profoundly challenging nature of the issue. Especially, the dissemination of information across multiple social platforms, articulated in multiple languages and modalities demands more generalized solutions for combating misinformation. Focusing on multilingual misinformation, we present a comprehensive survey of existing multilingual claim detection research. We present state-of-the-art multilingual claim detection research categorized into three key factors of the problem, verifiability, priority, and similarity. Further, we present a detailed overview of the existing multilingual datasets along with the challenges and suggest possible future advancements.

Related papers

Entity-aware Cross-lingual Claim Detection for Automated Fact-checking [7.242609314791262]
We introduce EX-Claim, an entity-aware cross-lingual claim detection model that generalizes well to handle claims written in any language. Our proposed model significantly outperforms the baselines, across 27 languages, and achieves the highest rate of knowledge transfer, even with limited training data.
arXiv Detail & Related papers (2025-03-19T14:00:55Z)
Exploring the Potential Role of Generative AI in the TRAPD Procedure for Survey Translation [0.0]
This paper explores and assesses in what ways generative AI can assist in translating survey instruments. We implement a zero-shot prompt experiment using ChatGPT to explore generative AI's ability to identify features of questions that might be difficult to translate to a linguistic audience.
arXiv Detail & Related papers (2024-11-18T20:53:58Z)
Monolingual and Multilingual Misinformation Detection for Low-Resource Languages: A Comprehensive Survey [2.5459710368096586]
This survey provides a comprehensive overview of the current research on low-resource language misinformation detection. We review the existing datasets, methodologies, and tools used in these domains, identifying key challenges related to: data resources, model development, cultural and linguistic context, real-world applications, and research efforts. Our findings underscore the need for robust, inclusive systems capable of addressing misinformation across diverse linguistic and cultural contexts.
arXiv Detail & Related papers (2024-10-24T03:02:03Z)
A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers [51.8203871494146]
The rapid development of Large Language Models (LLMs) demonstrates remarkable multilingual capabilities in natural language processing. Despite the breakthroughs of LLMs, the investigation into the multilingual scenario remains insufficient. This survey aims to help the research community address multilingual problems and provide a comprehensive understanding of the core concepts, key techniques, and latest developments in multilingual natural language processing based on LLMs.
arXiv Detail & Related papers (2024-05-17T17:47:39Z)
Asking Multimodal Clarifying Questions in Mixed-Initiative Conversational Search [89.1772985740272]
In mixed-initiative conversational search systems, clarifying questions are used to help users who struggle to express their intentions in a single query. We hypothesize that in scenarios where multimodal information is pertinent, the clarification process can be improved by using non-textual information. We collect a dataset named Melon that contains over 4k multimodal clarifying questions, enriched with over 14k images. Several analyses are conducted to understand the importance of multimodal contents during the query clarification phase.
arXiv Detail & Related papers (2024-02-12T16:04:01Z)
Lost in Translation -- Multilingual Misinformation and its Evolution [52.07628580627591]
This paper investigates the prevalence and dynamics of multilingual misinformation through an analysis of over 250,000 unique fact-checks spanning 95 languages. We find that while the majority of misinformation claims are only fact-checked once, 11.7%, corresponding to more than 21,000 claims, are checked multiple times. Using fact-checks as a proxy for the spread of misinformation, we find 33% of repeated claims cross linguistic boundaries.
arXiv Detail & Related papers (2023-10-27T12:21:55Z)
Breaking Language Barriers with MMTweets: Advancing Cross-Lingual Debunked Narrative Retrieval for Fact-Checking [5.880794128275313]
Cross-lingual debunked narrative retrieval is an understudied problem. This study introduces cross-lingual debunked narrative retrieval and addresses this research gap by: (i) creating Multilingual Misinformation Tweets (MMTweets). MMTweets features cross-lingual pairs, images, human annotations, and fine-grained labels, making it a comprehensive resource compared to its counterparts. We find that MMTweets presents challenges for cross-lingual debunked narrative retrieval, highlighting areas for improvement in retrieval models.
arXiv Detail & Related papers (2023-08-10T16:33:17Z)
MINION: a Large-Scale and Diverse Dataset for Multilingual Event Detection [65.46122357928041]
Event Detection (ED) is the task of identifying and classifying trigger words of event mentions in text. Main questions include how well existing ED models perform on different languages, how challenging ED is in other languages, and how well ED knowledge and annotation can be transferred across languages. We introduce a new large-scale multilingual dataset for ED (called MINION) that consistently annotates events for 8 different languages.
arXiv Detail & Related papers (2022-11-11T02:09:51Z)
CONCRETE: Improving Cross-lingual Fact-checking with Cross-lingual Retrieval [73.48591773882052]
Most fact-checking approaches focus on English only due to the data scarcity issue in other languages. We present the first fact-checking framework augmented with crosslingual retrieval. We train the retriever with our proposed Crosslingual Inverse Cloze Task (XICT).
arXiv Detail & Related papers (2022-09-05T17:36:14Z)
Delving Deeper into Cross-lingual Visual Question Answering [115.16614806717341]
We show that simple modifications to the standard training setup can substantially reduce the transfer gap to monolingual English performance. We analyze cross-lingual VQA across different question types of varying complexity for different multilingual multimodal Transformers.
arXiv Detail & Related papers (2022-02-15T18:22:18Z)
Matching Tweets With Applicable Fact-Checks Across Languages [27.762055254009017]
We focus on automatically finding existing fact-checks for claims made in social media posts (tweets). We conduct both classification and retrieval experiments, in monolingual (English only), multilingual (Spanish, Portuguese), and cross-lingual (Hindi-English) settings. We present promising results for "match" classification (93% average accuracy) in four language pairs.
arXiv Detail & Related papers (2022-02-14T23:33:02Z)
Crossing the Conversational Chasm: A Primer on Multilingual Task-Oriented Dialogue Systems [51.328224222640614]
Current state-of-the-art ToD models based on large pretrained neural language models are data hungry. Data acquisition for ToD use cases is expensive and tedious.
arXiv Detail & Related papers (2021-04-17T15:19:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.