LAMBRETTA: Learning to Rank for Twitter Soft Moderation
- URL: http://arxiv.org/abs/2212.05926v1
- Date: Mon, 12 Dec 2022 14:41:46 GMT
- Title: LAMBRETTA: Learning to Rank for Twitter Soft Moderation
- Authors: Pujan Paudel, Jeremy Blackburn, Emiliano De Cristofaro, Savvas
Zannettou, and Gianluca Stringhini
- Abstract summary: LAMBRETTA is a system that automatically identifies tweets that are candidates for soft moderation.
We run LAMBRETTA on Twitter data to moderate false claims related to the 2020 US Election.
It flags over 20 times more tweets than Twitter, with only 3.93% false positives and 18.81% false negatives.
- Score: 11.319938541673578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To curb the problem of false information, social media platforms like Twitter
started adding warning labels to content discussing debunked narratives, with
the goal of providing more context to their audiences. Unfortunately, these
labels are not applied uniformly and leave large amounts of false content
unmoderated. This paper presents LAMBRETTA, a system that automatically
identifies tweets that are candidates for soft moderation using Learning To
Rank (LTR). We run LAMBRETTA on Twitter data to moderate false claims related
to the 2020 US Election and find that it flags over 20 times more tweets than
Twitter, with only 3.93% false positives and 18.81% false negatives,
outperforming alternative state-of-the-art methods based on keyword extraction
and semantic search. Overall, LAMBRETTA assists human moderators in identifying
and flagging false information on social media.
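The paper's core idea is ranking tweets by their relevance to a known debunked claim. A minimal pairwise learning-to-rank sketch in that spirit, in pure Python: a perceptron-style ranker learns weights so that tweets relevant to a claim score above irrelevant ones. The feature names and training scheme here are illustrative assumptions, not LAMBRETTA's actual model, which builds on search-engine-style LTR.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_pairwise(pairs, n_features, epochs=50, lr=0.1):
    """Perceptron-style pairwise ranking: for each (relevant, irrelevant)
    feature-vector pair, nudge the weights until the relevant tweet
    scores higher than the irrelevant one."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for pos, neg in pairs:
            if dot(w, pos) - dot(w, neg) <= 0:  # pair is mis-ordered
                w = [wi + lr * (p - n) for wi, p, n in zip(w, pos, neg)]
    return w

# Hypothetical per-tweet features: [claim keyword overlap, embedding similarity]
pairs = [
    ([0.9, 0.8], [0.1, 0.3]),  # (tweet matching a debunked claim, unrelated tweet)
    ([0.7, 0.9], [0.2, 0.1]),
]
w = train_pairwise(pairs, n_features=2)
score = lambda x: dot(w, x)  # higher score = stronger soft-moderation candidate
```

In a real system the learned scores would rank candidate tweets for human moderators, rather than hard-classify them.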
Related papers
- Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks [60.14025705964573]
SheepDog is a style-robust fake news detector that prioritizes content over style in determining news veracity.
SheepDog achieves this resilience through (1) LLM-empowered news reframings that inject style diversity into the training process by customizing articles to match different styles; (2) a style-agnostic training scheme that ensures consistent veracity predictions across style-diverse reframings; and (3) content-focused attributions that distill content-centric guidelines from LLMs for debunking fake news.
arXiv Detail & Related papers (2023-10-16T21:05:12Z)
- Russo-Ukrainian War: Prediction and explanation of Twitter suspension [47.61306219245444]
This study focuses on Twitter's account-suspension mechanism, analyzing the shared content and user-account features that may lead to suspension.
Using the Twitter API, we obtained a dataset of 107.7M tweets originating from 9.8 million users.
Our results reveal scam campaigns taking advantage of trending topics regarding the Russia-Ukrainian conflict for Bitcoin fraud, spam, and advertisement campaigns.
arXiv Detail & Related papers (2023-06-06T08:41:02Z)
- Retweet-BERT: Political Leaning Detection Using Language Features and Information Diffusion on Social Networks [30.143148646797265]
We introduce Retweet-BERT, a simple and scalable model to estimate the political leanings of Twitter users.
Our assumptions stem from network patterns and linguistic homophily among people who share similar ideologies.
arXiv Detail & Related papers (2022-07-18T02:18:20Z)
- Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily, and we identify two abuse patterns.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z)
- Identification of Twitter Bots based on an Explainable ML Framework: the US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted, using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
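SHAP attributions approximate Shapley values, which quantify each feature's contribution to a prediction. As a sketch of the underlying idea (the paper uses the SHAP library with XGBoost; this toy computes exact Shapley values by brute-force coalition enumeration, feasible only for a handful of features):

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley attributions for model f at instance x, relative to a
    baseline instance. This is the quantity SHAP estimates efficiently."""
    n = len(x)
    def v(S):
        # Evaluate f with features in coalition S taken from x, the rest from baseline.
        return f([x[i] if i in S else baseline[i] for i in range(n)])
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for size in range(n):
            for S in combinations(others, size):
                S = set(S)
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                contrib += weight * (v(S | {i}) - v(S))
        phi.append(contrib)
    return phi

# Toy "bot score" model over two hypothetical account features
model = lambda z: 2.0 * z[0] + 3.0 * z[1]
phi = shapley_values(model, x=[1.0, 1.0], baseline=[0.0, 0.0])  # → [2.0, 3.0]
```

For a linear model each attribution reduces to the weight times the feature's deviation from baseline, which makes the toy easy to verify by hand.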
arXiv Detail & Related papers (2021-12-08T14:12:24Z)
- BERT based classification system for detecting rumours on Twitter [3.2872586139884623]
We propose a novel approach to identify rumours on Twitter that departs from the usual feature extraction techniques.
We use BERT sentence embeddings to represent each tweet as a vector capturing its contextual meaning.
Our BERT based models improved the accuracy by approximately 10% as compared to previous methods.
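Once tweets are represented as embedding vectors, even a simple similarity-based classifier can operate on them. A minimal sketch using cosine similarity and nearest-centroid assignment; the vectors here are hand-made stand-ins for real BERT embeddings, and nearest-centroid is an illustrative choice, not necessarily the paper's classifier:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def classify(embedding, centroids):
    # Assign the tweet to the class whose mean embedding is most similar.
    return max(centroids, key=lambda label: cosine(embedding, centroids[label]))

# Toy pre-computed "sentence embeddings" (stand-ins for BERT vectors)
centroids = {"rumour": [0.9, 0.1, 0.2], "non-rumour": [0.1, 0.8, 0.6]}
label = classify([0.8, 0.2, 0.1], centroids)  # → "rumour"
```

In practice the embeddings would come from a pretrained model and feed a learned classifier rather than fixed centroids.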
arXiv Detail & Related papers (2021-09-07T10:15:54Z)
- News consumption and social media regulations policy [70.31753171707005]
We analyze two social media platforms that enforce opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab leads users to engage with both types of content, with a slight preference for questionable content that may reflect a dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z) - "I Won the Election!": An Empirical Analysis of Soft Moderation
Interventions on Twitter [0.9391375268580806]
We study the users who share tweets with warning labels on Twitter and their political leaning.
We find that 72% of the tweets with warning labels are shared by Republicans, while only 11% are shared by Democrats.
arXiv Detail & Related papers (2021-01-18T17:39:58Z)
- Predicting Misinformation and Engagement in COVID-19 Twitter Discourse in the First Months of the Outbreak [1.2059055685264957]
We examine nearly 505K COVID-19-related tweets from the initial months of the pandemic to understand misinformation as a function of bot-behavior and engagement.
We found that real users tweet both facts and misinformation, while bots tweet proportionally more misinformation.
arXiv Detail & Related papers (2020-12-03T18:47:34Z)
- Russian trolls speaking Russian: Regional Twitter operations and MH17 [68.8204255655161]
In 2018, Twitter released data on accounts identified as Russian trolls.
We analyze the Russian-language operations of these trolls.
We find that trolls' information campaign on the MH17 crash was the largest in terms of tweet count.
arXiv Detail & Related papers (2020-05-13T19:48:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.