Leveraging Large Language Models and Weak Supervision for Social Media
data annotation: an evaluation using COVID-19 self-reported vaccination
tweets
- URL: http://arxiv.org/abs/2309.06503v1
- Date: Tue, 12 Sep 2023 18:18:23 GMT
- Title: Leveraging Large Language Models and Weak Supervision for Social Media
data annotation: an evaluation using COVID-19 self-reported vaccination
tweets
- Authors: Ramya Tekumalla and Juan M. Banda
- Abstract summary: Social media platforms have become a popular medium for discussions on vaccine-related topics.
In this study, we evaluate the usage of Large Language Models, in this case GPT-4, and weak supervision, to identify COVID-19 vaccine-related tweets.
- Score: 1.9988653168573556
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The COVID-19 pandemic has presented significant challenges to the healthcare
industry and society as a whole. With the rapid development of COVID-19
vaccines, social media platforms have become a popular medium for discussions
on vaccine-related topics. Identifying vaccine-related tweets and analyzing
them can provide valuable insights for public health researchers and
policymakers. However, manual annotation of a large number of tweets is
time-consuming and expensive. In this study, we evaluate the usage of Large
Language Models, in this case GPT-4 (March 23 version), and weak supervision,
to identify COVID-19 vaccine-related tweets, with the purpose of comparing
performance against human annotators. We leveraged a manually curated
gold-standard dataset and used GPT-4 to provide labels without any additional
fine-tuning or instructing, in a single-shot mode (no additional prompting).
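As a minimal sketch of the comparison the abstract describes, model-assigned labels can be scored against a gold-standard set with precision, recall and F1. The function name `binary_scores`, the 1/0 label encoding, and the toy label lists below are assumptions made for illustration; they are not taken from the paper, and the numbers are not the paper's results.

```python
from typing import Dict, Sequence


def binary_scores(gold: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Precision, recall and F1 for the positive class (label 1)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up toy labels (1 = self-reported vaccination tweet, 0 = other).
# These are illustrative only and do not come from the paper's dataset.
gold_labels = [1, 0, 1, 1, 0, 0]
model_labels = [1, 0, 0, 1, 1, 0]

scores = binary_scores(gold_labels, model_labels)
print(scores)
```

The same function could be applied to labels from a weak-supervision pipeline, so that both labeling approaches are scored against the same gold standard.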
Related papers
- CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models [92.04812189642418]
We introduce CARES and aim to evaluate the Trustworthiness of Med-LVLMs across the medical domain.
We assess the trustworthiness of Med-LVLMs across five dimensions, including trustfulness, fairness, safety, privacy, and robustness.
arXiv Detail & Related papers (2024-06-10T04:07:09Z)
- Analyzing COVID-19 Vaccination Sentiments in Nigerian Cyberspace: Insights from a Manually Annotated Twitter Dataset [2.820717448579396]
We explore the use of transformer-based language models to study people's acceptance of vaccines in Nigeria.
We developed a novel dataset by crawling multi-lingual tweets using relevant hashtags and keywords.
Our analysis and visualizations revealed that most tweets expressed neutral sentiments about COVID-19 vaccines, with some individuals expressing positive views.
arXiv Detail & Related papers (2024-01-23T22:49:19Z)
- Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification [60.49594822215981]
This paper presents a classification model for detecting COVID-19 vaccination related search queries.
We propose a novel approach of considering dense features as memory tokens that the model can attend to.
We show that this new modeling approach enables a significant improvement to the Vaccine Search Insights (VSI) task.
arXiv Detail & Related papers (2022-12-16T13:57:41Z)
- Doctors vs. Nurses: Understanding the Great Divide in Vaccine Hesitancy among Healthcare Workers [64.1526243118151]
We find that doctors are overall more positive toward the COVID-19 vaccines.
Doctors are more concerned with the effectiveness of the vaccines over newer variants.
Nurses pay more attention to the potential side effects on children.
arXiv Detail & Related papers (2022-09-11T14:22:16Z)
- Vaccine Discourse on Twitter During the COVID-19 Pandemic [0.7161783472741748]
This study investigates posts related to COVID-19 vaccines on Twitter and focuses on those which have a negative stance toward vaccines.
A dataset of 16,713,238 English tweets related to COVID-19 vaccines was collected.
We show that the negativity with respect to COVID-19 vaccines has decreased over time along with the vaccine roll-outs.
arXiv Detail & Related papers (2022-07-23T13:50:51Z)
- Deep Learning Reveals Patterns of Diverse and Changing Sentiments Towards COVID-19 Vaccines Based on 11 Million Tweets [3.319350419970857]
11,211,672 COVID-19 vaccine-related tweets corresponding to 2,203,681 users over two years were analyzed.
We finetuned a deep learning classifier using a state-of-the-art model, XLNet, to detect each tweet's sentiment automatically.
Users from various demographic groups demonstrated distinct patterns in sentiments towards COVID-19 vaccines.
arXiv Detail & Related papers (2022-07-05T13:53:16Z)
- A Multilingual Dataset of COVID-19 Vaccination Attitudes on Twitter [4.696697601424039]
We describe the collection and release of a dataset of tweets related to COVID-19 vaccines.
This dataset consists of the IDs of 2,198,090 tweets collected from Western Europe, 17,934 of which are annotated with the originators' vaccination stances.
arXiv Detail & Related papers (2022-06-27T13:44:48Z)
- "COVID-19 was a FIFA conspiracy #curropt": An Investigation into the Viral Spread of COVID-19 Misinformation [60.268682953952506]
We estimate the extent to which misinformation has influenced the course of the COVID-19 pandemic using natural language processing models.
We provide a strategy to combat social media posts that are likely to cause widespread harm.
arXiv Detail & Related papers (2022-06-12T19:41:01Z)
- American Twitter Users Revealed Social Determinants-related Oral Health Disparities amid the COVID-19 Pandemic [72.44305630014534]
We collected oral health-related tweets during the COVID-19 pandemic from 9,104 Twitter users across 26 states.
Women and younger adults (19-29) are more likely to talk about oral health problems.
People from counties at a higher risk of COVID-19 talk more about tooth decay/gum bleeding and chipped tooth/tooth break.
arXiv Detail & Related papers (2021-09-16T01:10:06Z)
- Automatic Detection of COVID-19 Vaccine Misinformation with Graph Link Prediction [2.0625936401496237]
Vaccine hesitancy fueled by social media misinformation about COVID-19 vaccines became a major hurdle.
We present CoVaxLies, a new dataset of tweets judged relevant to several misinformation targets about COVID-19 vaccines.
Our method organizes CoVaxLies in a Misinformation Knowledge Graph as it casts misinformation detection as a graph link prediction problem.
arXiv Detail & Related papers (2021-08-04T23:27:10Z)
- Effectiveness and Compliance to Social Distancing During COVID-19 [72.94965109944707]
We use a detailed set of mobility data to evaluate the impact that stay-at-home orders had on the spread of COVID-19 in the US.
We show that there is a unidirectional Granger causality, from the median percentage of time spent daily at home to the daily number of COVID-19-related deaths with a lag of 2 weeks.
arXiv Detail & Related papers (2020-06-23T03:36:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.