Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections
- URL: http://arxiv.org/abs/2404.00141v1
- Date: Fri, 29 Mar 2024 20:29:12 GMT
- Title: Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections
- Authors: Ahmad Diab, Rr. Nefriana, Yu-Ru Lin
- Abstract summary: This work establishes a general scheme for classifying discussions related to conspiracy theories.
We leverage human-labeled ground truth to train a BERT-based model for classifying online CTs.
We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums.
- Score: 4.594855794205588
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Online discussions frequently involve conspiracy theories, which can contribute to the proliferation of belief in them. However, not all discussions surrounding conspiracy theories promote them, as some are intended to debunk them. Existing research has relied on simple proxies or focused on a constrained set of signals to identify conspiracy theories, which limits our understanding of conspiratorial discussions across different topics and online communities. This work establishes a general scheme for classifying discussions related to conspiracy theories based on authors' perspectives on the conspiracy belief, which can be expressed explicitly through narrative elements, such as the agent, action, or objective, or implicitly through references to known theories, such as chemtrails or the New World Order. We leverage human-labeled ground truth to train a BERT-based model for classifying online CTs, which we then compared to the Generative Pre-trained Transformer machine (GPT) for detecting online conspiratorial content. Despite GPT's known strengths in its expressiveness and contextual understanding, our study revealed significant flaws in its logical reasoning, while also demonstrating comparable strengths from our classifiers. We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. This research sheds light on the potential applications of large language models in tasks demanding nuanced contextual comprehension.
Related papers
- Unveiling Online Conspiracy Theorists: a Text-Based Approach and Characterization [42.242551342068374]
We conducted a comprehensive analysis of two distinct X datasets: one comprising users with conspiracy theorizing patterns and another made of users lacking such tendencies.
Our findings reveal marked differences in the lexicon and language adopted by conspiracy theorists with respect to other users.
We developed a machine learning classifier capable of identifying users who propagate conspiracy theories based on a rich set of 871 features.
arXiv Detail & Related papers (2024-05-21T08:07:38Z)
- ACTI at EVALITA 2023: Overview of the Conspiracy Theory Identification Task [7.36947519345126]
The ACTI challenge, based exclusively on comments published on conspiratorial Telegram channels, is divided into two subtasks.
Fifteen teams participated in the task, making 81 submissions in total.
We show that the best-performing approaches relied on large language models.
arXiv Detail & Related papers (2023-07-12T20:33:30Z)
- Pathways through Conspiracy: The Evolution of Conspiracy Radicalization through Engagement in Online Conspiracy Discussions [9.410583483182657]
This paper provides the empirical modeling of various radicalization phases in online conspiracy theory discussion participants.
By studying 36K users through their 169M contributions, we uncover four distinct pathways of conspiracy engagement.
Specific sub-populations of users, namely those on steady high and increasing conspiracy engagement pathways, progress successively through various radicalization stages.
arXiv Detail & Related papers (2022-04-22T14:31:53Z)
- Where the Earth is flat and 9/11 is an inside job: A comparative algorithm audit of conspiratorial information in web search results [62.997667081978825]
We examine the distribution of conspiratorial information in search results across five search engines: Google, Bing, DuckDuckGo, Yahoo and Yandex.
We find that all search engines except Google consistently displayed conspiracy-promoting results and returned links to conspiracy-dedicated websites in their top results.
Most conspiracy-promoting results came from social media and conspiracy-dedicated websites while conspiracy-debunking information was shared by scientific websites and, to a lesser extent, legacy media.
arXiv Detail & Related papers (2021-12-02T14:29:21Z)
- Fact-driven Logical Reasoning for Machine Reading Comprehension [82.58857437343974]
We are motivated to cover both commonsense and temporary knowledge clues hierarchically.
Specifically, we propose a general formalism of knowledge units by extracting backbone constituents of the sentence.
We then construct a supergraph on top of the fact units, allowing for the benefit of sentence-level (relations among fact groups) and entity-level interactions.
arXiv Detail & Related papers (2021-05-21T13:11:13Z)
- The Truth is Out There: Investigating Conspiracy Theories in Text Generation [66.01545519772527]
We investigate the propensity for language models to generate conspiracy theory text.
Our study focuses on testing these models for the elicitation of conspiracy theories.
We introduce a new dataset consisting of conspiracy theory topics, machine-generated conspiracy theories, and human-written conspiracy theories.
arXiv Detail & Related papers (2021-01-02T05:47:39Z)
- Paragraph-level Commonsense Transformers with Recurrent Memory [77.4133779538797]
We train a discourse-aware model that incorporates paragraph-level information to generate coherent commonsense inferences from narratives.
Our results show that PARA-COMET outperforms the sentence-level baselines, particularly in generating inferences that are both coherent and novel.
arXiv Detail & Related papers (2020-10-04T05:24:12Z)
- An automated pipeline for the discovery of conspiracy and conspiracy theory narrative frameworks: Bridgegate, Pizzagate and storytelling on the web [0.0]
We present an automated pipeline for the discovery and description of the generative narrative frameworks of conspiracy theories on social media.
We base this work on two separate repositories of posts and news articles describing the well-known conspiracy theory Pizzagate from 2016.
We show how the Pizzagate framework relies on the conspiracy theorists' interpretation of "hidden knowledge" to link otherwise unlinked domains of human interaction.
arXiv Detail & Related papers (2020-08-23T05:14:38Z)
- Misinformation Has High Perplexity [55.47422012881148]
We propose to leverage the perplexity to debunk false claims in an unsupervised manner.
First, we extract reliable evidence from scientific and news sources according to sentence similarity to the claims.
Second, we prime a language model with the extracted evidence and finally evaluate the correctness of given claims based on the perplexity scores at debunking time.
arXiv Detail & Related papers (2020-06-08T15:13:44Z)