CAVES: A Dataset to facilitate Explainable Classification and
Summarization of Concerns towards COVID Vaccines
- URL: http://arxiv.org/abs/2204.13746v2
- Date: Fri, 11 Nov 2022 14:16:46 GMT
- Authors: Soham Poddar, Azlaan Mustafa Samad, Rajdeep Mukherjee, Niloy Ganguly,
Saptarshi Ghosh
- Abstract summary: We have curated CAVES, the first large-scale dataset containing about 10k COVID-19 anti-vaccine tweets labelled into various specific anti-vaccine concerns.
This is also the first multi-label classification dataset that provides explanations for each of the labels.
- Score: 18.617543658780367
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convincing people to get vaccinated against COVID-19 is a key societal
challenge in the present times. As a first step towards this goal, many prior
works have relied on social media analysis to understand the specific concerns
that people have towards these vaccines, such as potential side-effects,
ineffectiveness, political factors, and so on. Though there are datasets that
broadly classify social media posts into Anti-vax and Pro-Vax labels, there is
no dataset (to our knowledge) that labels social media posts according to the
specific anti-vaccine concerns mentioned in the posts. In this paper, we have
curated CAVES, the first large-scale dataset containing about 10k COVID-19
anti-vaccine tweets labelled into various specific anti-vaccine concerns in a
multi-label setting. This is also the first multi-label classification dataset
that provides explanations for each of the labels. Additionally, the dataset
provides class-wise summaries of all the tweets. We also perform
preliminary experiments on the dataset and show that this is a very challenging
dataset for multi-label explainable classification and tweet summarization, as
is evident from the moderate scores achieved by some state-of-the-art models. Our
dataset and code are available at: https://github.com/sohampoddar26/caves-data
Related papers
- Analyzing COVID-19 Vaccination Sentiments in Nigerian Cyberspace:
Insights from a Manually Annotated Twitter Dataset [2.820717448579396]
We explore the use of transformer-based language models to study people's acceptance of vaccines in Nigeria.
We developed a novel dataset by crawling multi-lingual tweets using relevant hashtags and keywords.
Our analysis and visualizations revealed that most tweets expressed neutral sentiments about COVID-19 vaccines, with some individuals expressing positive views.
arXiv Detail & Related papers (2024-01-23T22:49:19Z)
- Into the LAIONs Den: Investigating Hate in Multimodal Datasets [67.21783778038645]
This paper investigates the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B.
We found that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively.
We also found that filtering dataset contents based on Not Safe For Work (NSFW) values calculated based on images alone does not exclude all the harmful content in alt-text.
arXiv Detail & Related papers (2023-11-06T19:00:05Z)
- Weakly-supervised positional contrastive learning: application to
cirrhosis classification [45.63061034568991]
Large medical imaging datasets can be cheaply annotated with low-confidence, weak labels.
Access to high-confidence labels, such as histology-based diagnoses, is rare and costly.
We propose an efficient weakly-supervised positional (WSP) contrastive learning strategy.
arXiv Detail & Related papers (2023-07-10T15:02:13Z)
- Vax-Culture: A Dataset for Studying Vaccine Discourse on Twitter [3.768191396638854]
Vaccine hesitancy continues to be a main challenge for public health officials during the COVID-19 pandemic.
We present Vax-Culture, a novel Twitter COVID-19 dataset consisting of 6373 vaccine-related tweets.
We hope this can lead to effective and targeted public health communication strategies for reaching individuals with anti-vaccine beliefs.
arXiv Detail & Related papers (2023-04-13T23:04:30Z)
- Doctors vs. Nurses: Understanding the Great Divide in Vaccine Hesitancy
among Healthcare Workers [64.1526243118151]
We find that doctors are overall more positive toward the COVID-19 vaccines.
Doctors are more concerned with the effectiveness of the vaccines over newer variants.
Nurses pay more attention to the potential side effects on children.
arXiv Detail & Related papers (2022-09-11T14:22:16Z)
- CoVaxNet: An Online-Offline Data Repository for COVID-19 Vaccine
Hesitancy Research [39.82073461647643]
A substantial proportion of the population is still hesitant to be vaccinated against the COVID-19 virus.
Existing datasets fail to cover all these aspects, making it difficult to form a complete picture of the problem of vaccine hesitancy.
In this paper, we construct a multi-source, multi-modal, and multi-feature online-offline data repository CoVaxNet.
arXiv Detail & Related papers (2022-06-30T05:58:35Z)
- Disentangled Learning of Stance and Aspect Topics for Vaccine Attitude
Detection in Social Media [40.61499595293957]
We propose a novel semi-supervised approach for vaccine attitude detection, called VADet.
VADet is able to learn disentangled stance and aspect topics, and outperforms existing aspect-based sentiment analysis models on both stance detection and tweet clustering.
arXiv Detail & Related papers (2022-05-06T15:24:33Z)
- ArCovidVac: Analyzing Arabic Tweets About COVID-19 Vaccination [7.594204373985492]
We release the first and largest manually annotated Arabic tweet dataset, ArCovidVac, for the COVID-19 vaccination campaign.
The dataset is enriched with different layers of annotation, including (i) informativeness (more vs. less importance of the tweets); (ii) fine-grained tweet content types (e.g., advice, rumors, restriction, authenticate news/information); and (iii) stance towards vaccination.
arXiv Detail & Related papers (2022-01-17T16:19:21Z)
- Classifying vaccine sentiment tweets by modelling domain-specific
representation and commonsense knowledge into context-aware attentive GRU [9.8215089151757]
Vaccine hesitancy and refusal can create clusters of low vaccine coverage and reduce the effectiveness of vaccination programs.
Social media provides an opportunity to estimate emerging risks to vaccine acceptance by including geographical location and detailing vaccine-related concerns.
Methods for classifying social media posts, such as vaccine-related tweets, use language models (LMs) trained on general domain text.
We present a novel end-to-end framework consisting of interconnected components that use a domain-specific LM trained on vaccine-related tweets and model commonsense knowledge into a bidirectional gated recurrent network (CK-BiGRU) with context-aware attention.
arXiv Detail & Related papers (2021-06-17T15:16:08Z)
- COVID-19 Vaccine Hesitancy on Social Media: Building a Public Twitter
Dataset of Anti-vaccine Content, Vaccine Misinformation and Conspiracies [10.505633521103018]
False claims about COVID-19 vaccines can undermine public trust in ongoing vaccination campaigns.
We present a dataset of Twitter posts that exhibit a strong anti-vaccine stance.
arXiv Detail & Related papers (2021-05-11T15:43:41Z)
- Falling into the Echo Chamber: the Italian Vaccination Debate on Twitter [65.7192861893042]
We examine the extent to which the vaccination debate on Twitter is conducive to potential outreach to the vaccination hesitant.
We discover that the vaccination skeptics, as well as the advocates, reside in their own distinct "echo chambers".
At the center of these echo chambers we find the ardent supporters, for whom we build highly accurate network- and content-based classifiers.
arXiv Detail & Related papers (2020-03-26T13:55:50Z)