Vax-Culture: A Dataset for Studying Vaccine Discourse on Twitter
- URL: http://arxiv.org/abs/2304.06858v3
- Date: Sun, 11 Jun 2023 22:11:10 GMT
- Title: Vax-Culture: A Dataset for Studying Vaccine Discourse on Twitter
- Authors: Mohammad Reza Zarei, Michael Christensen, Sarah Everts and Majid Komeili
- Abstract summary: Vaccine hesitancy continues to be a main challenge for public health officials during the COVID-19 pandemic.
We present Vax-Culture, a novel Twitter COVID-19 dataset consisting of 6373 vaccine-related tweets.
We hope this can lead to effective and targeted public health communication strategies for reaching individuals with anti-vaccine beliefs.
- Score: 3.768191396638854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vaccine hesitancy continues to be a main challenge for public health
officials during the COVID-19 pandemic. As this hesitancy undermines vaccine
campaigns, many researchers have sought to identify its root causes, finding
that the increasing volume of anti-vaccine misinformation on social media
platforms is a key element of this problem. We explored Twitter as a source of
misleading content with the goal of extracting overlapping cultural and
political beliefs that motivate the spread of vaccine misinformation. To do
this, we have collected a data set of vaccine-related Tweets and annotated them
with the help of a team of annotators with a background in communications and
journalism. Ultimately we hope this can lead to effective and targeted public
health communication strategies for reaching individuals with anti-vaccine
beliefs. Moreover, this information helps with developing Machine Learning
models to automatically detect vaccine misinformation posts and combat their
negative impacts. In this paper, we present Vax-Culture, a novel Twitter
COVID-19 dataset consisting of 6373 vaccine-related tweets accompanied by an
extensive set of human-provided annotations including vaccine-hesitancy stance,
indication of any misinformation in tweets, the entities criticized and
supported in each tweet and the communicated message of each tweet. Moreover,
we define five baseline tasks, including four classification tasks and one sequence
generation task, and report the results of a set of recent transformer-based
models for them. The dataset and code are publicly available at
https://github.com/mrzarei5/Vax-Culture.
Related papers
- SPEED++: A Multilingual Event Extraction Framework for Epidemic Prediction and Preparedness [73.73883111570458]
We introduce the first multilingual Event Extraction framework for extracting epidemic event information for a wide range of diseases and languages.
Annotating data in every language is infeasible; thus we develop zero-shot cross-lingual cross-disease models.
Our framework can provide epidemic warnings for COVID-19 in its earliest stages in Dec 2019 from Chinese Weibo posts without any training in Chinese.
arXiv Detail & Related papers (2024-10-24T03:03:54Z)
- Doctors vs. Nurses: Understanding the Great Divide in Vaccine Hesitancy among Healthcare Workers [64.1526243118151]
We find that doctors are overall more positive toward the COVID-19 vaccines.
Doctors are more concerned with the effectiveness of the vaccines over newer variants.
Nurses pay more attention to the potential side effects on children.
arXiv Detail & Related papers (2022-09-11T14:22:16Z)
- CoVaxNet: An Online-Offline Data Repository for COVID-19 Vaccine Hesitancy Research [39.82073461647643]
A substantial proportion of the population is still hesitant to be vaccinated against the COVID-19 virus.
Existing datasets fail to cover all these aspects, making it difficult to form a complete picture when drawing inferences about the problem of vaccine hesitancy.
In this paper, we construct a multi-source, multi-modal, and multi-feature online-offline data repository CoVaxNet.
arXiv Detail & Related papers (2022-06-30T05:58:35Z)
- "COVID-19 was a FIFA conspiracy #curropt": An Investigation into the Viral Spread of COVID-19 Misinformation [60.268682953952506]
We estimate the extent to which misinformation has influenced the course of the COVID-19 pandemic using natural language processing models.
We provide a strategy to combat social media posts that are likely to cause widespread harm.
arXiv Detail & Related papers (2022-06-12T19:41:01Z)
- CAVES: A Dataset to facilitate Explainable Classification and Summarization of Concerns towards COVID Vaccines [18.617543658780367]
We have curated CAVES, the first large-scale dataset containing about 10k COVID-19 anti-vaccine tweets labelled into various specific anti-vaccine concerns.
This is also the first multi-label classification dataset that provides explanations for each of the labels.
arXiv Detail & Related papers (2022-04-28T19:26:54Z)
- ArCovidVac: Analyzing Arabic Tweets About COVID-19 Vaccination [7.594204373985492]
We release the first and largest manually annotated Arabic tweet dataset, ArCovidVac, for the COVID-19 vaccination campaign.
The dataset is enriched with different layers of annotation, including (i) informativeness (more vs. less importance of the tweets); (ii) fine-grained tweet content types (e.g., advice, rumors, restriction, authenticate news/information); and (iii) stance towards vaccination.
arXiv Detail & Related papers (2022-01-17T16:19:21Z)
- A Python Package to Detect Anti-Vaccine Users on Twitter [1.1602089225841632]
Vaccine hesitancy has recently been driven by the anti-vaccine narratives shared online.
We introduce a Python package capable of analyzing Twitter profiles to assess how likely that profile is to spread anti-vaccine sentiment.
We leverage the data on such users to understand the moral and emotional characteristics of anti-vaccine spreaders.
arXiv Detail & Related papers (2021-10-21T17:59:25Z)
- Cross-lingual COVID-19 Fake News Detection [54.125563009333995]
We make the first attempt to detect COVID-19 misinformation in a low-resource language (Chinese) using only the fact-checked news in a high-resource language (English).
We propose a deep learning framework named CrossFake to jointly encode the cross-lingual news body texts and capture the news content.
Empirical results on our dataset demonstrate the effectiveness of CrossFake under the cross-lingual setting.
arXiv Detail & Related papers (2021-10-13T04:44:02Z)
- Classifying vaccine sentiment tweets by modelling domain-specific representation and commonsense knowledge into context-aware attentive GRU [9.8215089151757]
Vaccine hesitancy and refusal can create clusters of low vaccine coverage and reduce the effectiveness of vaccination programs.
Social media provides an opportunity to estimate emerging risks to vaccine acceptance by including geographical location and detailing vaccine-related concerns.
Methods for classifying social media posts, such as vaccine-related tweets, use language models (LMs) trained on general domain text.
We present a novel end-to-end framework consisting of interconnected components that use domain-specific LM trained on vaccine-related tweets and models commonsense knowledge into a bidirectional gated recurrent network (CK-BiGRU) with context-aware attention.
arXiv Detail & Related papers (2021-06-17T15:16:08Z)
- COVID-19 Vaccine Hesitancy on Social Media: Building a Public Twitter Dataset of Anti-vaccine Content, Vaccine Misinformation and Conspiracies [10.505633521103018]
False claims about COVID-19 vaccines can undermine public trust in ongoing vaccination campaigns.
We present a dataset of Twitter posts that exhibit a strong anti-vaccine stance.
arXiv Detail & Related papers (2021-05-11T15:43:41Z)
- Falling into the Echo Chamber: the Italian Vaccination Debate on Twitter [65.7192861893042]
We examine the extent to which the vaccination debate on Twitter is conducive to potential outreach to the vaccination hesitant.
We discover that the vaccination skeptics, as well as the advocates, reside in their own distinct "echo chambers".
At the center of these echo chambers we find the ardent supporters, for whom we build highly accurate network- and content-based classifiers.
arXiv Detail & Related papers (2020-03-26T13:55:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.