Misleading the Covid-19 vaccination discourse on Twitter: An exploratory
study of infodemic around the pandemic
- URL: http://arxiv.org/abs/2108.10735v1
- Date: Mon, 16 Aug 2021 17:02:18 GMT
- Title: Misleading the Covid-19 vaccination discourse on Twitter: An exploratory
study of infodemic around the pandemic
- Authors: Shakshi Sharma, Rajesh Sharma, and Anwitaman Datta
- Abstract summary: We collect a moderate-sized representative corpus of tweets (200,000 approx.) pertaining to Covid-19 vaccination over a period of seven months (September 2020 - March 2021).
Following a Transfer Learning approach, we utilize the pre-trained Transformer-based XLNet model to classify tweets as Misleading or Non-Misleading.
We build on this to study and contrast the characteristics of tweets in the corpus that are misleading in nature against non-misleading ones.
Several ML models are employed for prediction, with up to 90% accuracy, and the importance of each feature is explained using the SHAP Explainable AI (XAI) tool.
- Score: 0.45593531937154413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we collect a moderate-sized representative corpus of tweets
(200,000 approx.) pertaining to Covid-19 vaccination, spanning a period of
seven months (September 2020 - March 2021). Following a Transfer Learning
approach, we utilize the pre-trained Transformer-based XLNet model to classify
tweets as Misleading or Non-Misleading and validate against a random subset of
results manually. We build on this to study and contrast the characteristics of
tweets in the corpus that are misleading in nature against non-misleading ones.
This exploratory analysis enables us to design features (such as sentiments,
hashtags, nouns, pronouns, etc.) that can, in turn, be exploited for classifying
tweets as (Non-)Misleading using various ML models in an explainable manner.
Specifically, several ML models are employed for prediction, with up to 90%
accuracy, and the importance of each feature is explained using the SHAP
Explainable AI (XAI) tool. While the thrust of this work is principally
exploratory analysis in order to obtain insights on the online discourse on
Covid-19 vaccination, we conclude the paper by outlining how these insights
provide the foundations for a more actionable approach to mitigate
misinformation. The curated dataset and code are made available (GitHub
repository) so that the research community at large can reproduce, compare
against, or build upon this work.
Related papers
- Understanding writing style in social media with a supervised
contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 x 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z)
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z)
- ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective
Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- 5q032e@SMM4H'22: Transformer-based classification of premise in tweets
related to COVID-19 [2.3931689873603603]
We propose a predictive model based on transformer architecture to classify the presence of premise in Twitter texts.
Our experiments on a Twitter dataset showed that RoBERTa is superior to the other transformer models in the case of the premise prediction task.
arXiv Detail & Related papers (2022-09-08T14:46:28Z)
- Machine Learning-based Automatic Annotation and Detection of COVID-19
Fake News [8.020736472947581]
COVID-19 impacted every part of the world, although the misinformation about the outbreak traveled faster than the virus.
Existing work neglects the presence of bots that act as a catalyst in the spread.
We propose an automated approach for labeling data using verified fact-checked statements on a Twitter dataset.
arXiv Detail & Related papers (2022-09-07T13:55:59Z)
- Rumor Detection with Self-supervised Learning on Texts and Social Graph [101.94546286960642]
We propose contrastive self-supervised learning on heterogeneous information sources, so as to reveal their relations and characterize rumors better.
We term this framework Self-supervised Rumor Detection (SRD).
Extensive experiments on three real-world datasets validate the effectiveness of SRD for automatic rumor detection on social media.
arXiv Detail & Related papers (2022-04-19T12:10:03Z)
- COVID-19 Tweets Analysis through Transformer Language Models [0.0]
In this study, we perform an in-depth, fine-grained sentiment analysis of tweets about COVID-19.
A trained transformer model is able to correctly predict, with high accuracy, the tone of a tweet.
We then leverage this model for predicting tones for 200,000 tweets on COVID-19.
arXiv Detail & Related papers (2021-02-27T12:06:33Z)
- Team Alex at CLEF CheckThat! 2020: Identifying Check-Worthy Tweets With
Transformer Models [28.25006244616817]
We propose a model for detecting check-worthy tweets about COVID-19, which combines deep contextualized text representations with modeling the social context of the tweet.
Our official submission to the English version of CLEF-2020 CheckThat! Task 1, system Team_Alex, was ranked second with a MAP score of 0.8034.
arXiv Detail & Related papers (2020-09-07T08:03:21Z)
- BANANA at WNUT-2020 Task 2: Identifying COVID-19 Information on Twitter
by Combining Deep Learning and Transfer Learning Models [0.0]
This paper describes our prediction system for WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets.
The dataset for this task contains 10,000 English tweets labeled by humans.
The experimental results indicate that our systems achieve an F1 score of 88.81% for the INFORMATIVE label on the test set.
arXiv Detail & Related papers (2020-09-06T08:24:55Z)
- Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment [90.12602012910465]
We train on Italy's early COVID-19 outbreak through Twitter and transfer to several other countries.
Our experiments show strong results with up to 0.85 Spearman correlation in cross-country predictions.
arXiv Detail & Related papers (2020-06-05T02:04:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.