A Large-Scale Comparative Study of Accurate COVID-19 Information versus
Misinformation
- URL: http://arxiv.org/abs/2304.04811v2
- Date: Sun, 7 May 2023 13:55:02 GMT
- Title: A Large-Scale Comparative Study of Accurate COVID-19 Information versus
Misinformation
- Authors: Yida Mu, Ye Jiang, Freddy Heppell, Iknoor Singh, Carolina Scarton,
Kalina Bontcheva, Xingyi Song
- Abstract summary: The COVID-19 pandemic led to an infodemic where an overwhelming amount of COVID-19 related content was being disseminated at high velocity through social media.
This motivated us to carry out a comparative study of the characteristics of COVID-19 misinformation versus those of accurate COVID-19 information through a large-scale computational analysis of over 242 million tweets.
An added contribution of this study is the creation of a COVID-19 misinformation classification dataset.
- Score: 4.926199465135915
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The COVID-19 pandemic led to an infodemic where an overwhelming amount of
COVID-19 related content was being disseminated at high velocity through social
media. This made it challenging for citizens to differentiate between accurate
and inaccurate information about COVID-19. This motivated us to carry out a
comparative study of the characteristics of COVID-19 misinformation versus
those of accurate COVID-19 information through a large-scale computational
analysis of over 242 million tweets. The study makes comparisons alongside four
key aspects: 1) the distribution of topics, 2) the live status of tweets, 3)
language analysis and 4) the spreading power over time. An added contribution
of this study is the creation of a COVID-19 misinformation classification
dataset. Finally, we demonstrate that this new dataset helps improve
misinformation classification by more than 9% based on average F1 measure.
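As a rough illustration of the metric mentioned at the end of the abstract, the sketch below computes an averaged F1 score in plain Python. It assumes "average F1" means the unweighted (macro) mean of per-class F1 scores, which the abstract does not specify. The label names and the toy predictions are hypothetical and are not taken from the paper's dataset.

```python
from typing import Hashable, Sequence


def f1_for_label(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], label: Hashable) -> float:
    """F1 score for a single class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 (assumed reading of "average F1")."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical toy labels, for illustration only.
gold = ["misinfo", "accurate", "misinfo", "accurate"]
pred = ["misinfo", "misinfo", "misinfo", "accurate"]
print(round(macro_f1(gold, pred), 4))
```

Under this reading, a gain of "more than 9%" would be the difference between two such averaged scores, one for a classifier trained without the new dataset and one trained with it.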
Related papers
- AMIR: Automated MisInformation Rebuttal -- A COVID-19 Vaccination Datasets based Recommendation System [0.05461938536945722]
This work explored how existing information obtained from social media can be harnessed to facilitate automated rebuttal of misinformation at scale.
It leverages two publicly available datasets, FaCov (fact-checked articles) and misleading (social media Twitter) data on COVID-19 Vaccination.
arXiv Detail & Related papers (2023-10-29T13:07:33Z)
- Two-Stage Classifier for COVID-19 Misinformation Detection Using BERT: a Study on Indonesian Tweets [0.15229257192293202]
Research on COVID-19 misinformation detection in Indonesia is still scarce.
In this study, we propose a two-stage classifier model using the IndoBERT pre-trained language model for the tweet misinformation detection task.
The experimental results show that the combination of the BERT sequence classifier for relevance prediction and Bi-LSTM for misinformation detection outperformed other machine learning models with an accuracy of 87.02%.
arXiv Detail & Related papers (2022-06-30T15:33:20Z)
- "COVID-19 was a FIFA conspiracy #curropt": An Investigation into the Viral Spread of COVID-19 Misinformation [60.268682953952506]
We estimate the extent to which misinformation has influenced the course of the COVID-19 pandemic using natural language processing models.
We provide a strategy to combat social media posts that are likely to cause widespread harm.
arXiv Detail & Related papers (2022-06-12T19:41:01Z)
- Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection [6.1204874238049705]
A drastic rise in potentially life-threatening misinformation has been a by-product of the COVID-19 pandemic.
We evaluate fifteen Transformer-based models on five COVID-19 misinformation datasets.
We show tokenizers and models tailored to COVID-19 data do not provide a significant advantage over general-purpose ones.
arXiv Detail & Related papers (2021-11-15T15:01:55Z)
- Categorising Fine-to-Coarse Grained Misinformation: An Empirical Study of COVID-19 Infodemic [6.137022734902771]
We introduce a fine-grained annotated misinformation tweets dataset including social behaviours annotation.
The dataset not only allows analysis of social behaviours but is also suitable for both evidence-based and non-evidence-based misinformation classification tasks.
arXiv Detail & Related papers (2021-06-22T12:17:53Z)
- Understanding the temporal evolution of COVID-19 research through machine learning and natural language processing [66.63200823918429]
The outbreak of the novel coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has been continuously affecting human lives and communities around the world.
We used multiple data sources, i.e., PubMed and ArXiv, and built several machine learning models to characterize the landscape of current COVID-19 research.
Our findings confirm the types of research available in PubMed and ArXiv differ significantly, with the former exhibiting greater diversity in terms of COVID-19 related issues.
arXiv Detail & Related papers (2020-07-22T18:02:39Z)
- CO-Search: COVID-19 Information Retrieval with Semantic Search, Question Answering, and Abstractive Summarization [53.67205506042232]
CO-Search is a retriever-ranker semantic search engine designed to handle complex queries over the COVID-19 literature.
To account for the domain-specific and relatively limited dataset, we generate a bipartite graph of document paragraphs and citations.
We evaluate our system on the data of the TREC-COVID information retrieval challenge.
arXiv Detail & Related papers (2020-06-17T01:32:48Z)
- Misinformation Has High Perplexity [55.47422012881148]
We propose to leverage the perplexity to debunk false claims in an unsupervised manner.
First, we extract reliable evidence from scientific and news sources according to sentence similarity to the claims.
Second, we prime a language model with the extracted evidence and finally evaluate the correctness of given claims based on the perplexity scores at debunking time.
arXiv Detail & Related papers (2020-06-08T15:13:44Z)
- Classification Aware Neural Topic Model and its Application on a New COVID-19 Disinformation Corpus [2.492887522265771]
The explosion of disinformation following the COVID-19 pandemic has overloaded fact-checkers and media worldwide.
To help tackle this, we developed computational methods to categorise COVID-19 disinformation.
arXiv Detail & Related papers (2020-06-05T10:32:18Z)
- COVID-DA: Deep Domain Adaptation from Typical Pneumonia to COVID-19 [92.4955073477381]
The outbreak of novel coronavirus disease 2019 (COVID-19) has already infected millions of people and is still rapidly spreading all over the globe.
Deep learning has recently been used as an effective computer-aided means to improve diagnostic efficiency.
We propose a new deep domain adaptation method for COVID-19 diagnosis, namely COVID-DA.
arXiv Detail & Related papers (2020-04-30T03:13:40Z)
- Rapidly Bootstrapping a Question Answering Dataset for COVID-19 [88.86456834766288]
We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19.
This is the first publicly available resource of its type, and it is intended as a stopgap measure for guiding research until more substantial evaluation resources become available.
arXiv Detail & Related papers (2020-04-23T17:35:11Z)
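The "Misinformation Has High Perplexity" entry in the list above relies on perplexity as its scoring signal. A minimal sketch of that quantity follows, assuming per-token probabilities have already been obtained from some language model. The probabilities shown are hypothetical, and the evidence retrieval and language-model priming steps described in that entry are not reproduced here.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Perplexity of a token sequence: exp of the mean negative log-probability."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("each probability must lie in (0, 1]")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities for two claims (illustration only).
well_supported_claim = [0.6, 0.5, 0.7, 0.4]
poorly_supported_claim = [0.1, 0.05, 0.2, 0.1]

# A claim the model finds less predictable receives the higher perplexity.
print(perplexity(well_supported_claim) < perplexity(poorly_supported_claim))
```

In this simplified reading, a claim that the model finds less predictable receives a higher perplexity, which is the intuition the entry's title refers to.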
This list is automatically generated from the titles and abstracts of the papers on this site.