Addressing machine learning concept drift reveals declining vaccine
sentiment during the COVID-19 pandemic
- URL: http://arxiv.org/abs/2012.02197v2
- Date: Mon, 7 Dec 2020 11:28:31 GMT
- Title: Addressing machine learning concept drift reveals declining vaccine
sentiment during the COVID-19 pandemic
- Authors: Martin Müller, Marcel Salathé
- Abstract summary: We show that machine learning algorithms trained on annotated data in the past may underperform when applied to contemporary data.
We show that while vaccine sentiment has declined considerably during the COVID-19 pandemic in 2020, algorithms trained on pre-pandemic data would have largely missed this decline due to concept drift.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Social media analysis has become a common approach to assess public opinion
on various topics, including those about health, in near real-time. The growing
volume of social media posts has led to an increased usage of modern machine
learning methods in natural language processing. While the rapid dynamics of
social media can capture underlying trends quickly, it also poses a technical
problem: algorithms trained on annotated data in the past may underperform when
applied to contemporary data. This phenomenon, known as concept drift, can be
particularly problematic when rapid shifts occur either in the topic of
interest itself, or in the way the topic is discussed. Here, we explore the
effect of machine learning concept drift by focussing on vaccine sentiments
expressed on Twitter, a topic of central importance especially during the
COVID-19 pandemic. We show that while vaccine sentiment has declined
considerably during the COVID-19 pandemic in 2020, algorithms trained on
pre-pandemic data would have largely missed this decline due to concept drift.
Our results suggest that social media analysis systems must address concept
drift in a continuous fashion in order to avoid the risk of systematic
misclassification of data, which is particularly likely during a crisis when
the underlying data can change suddenly and rapidly.
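The drift effect described in the abstract can be illustrated with a toy sketch. This is not the authors' method or data: the word-count classifier, the example texts, the labels, and all names (`train`, `predict`, `accuracy`, `stale_model`, `fresh_model`) are invented for illustration. It only shows, under those assumptions, how a model trained on older annotated text can fail on text that uses newer vocabulary, and how retraining on recent annotations can recover performance.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each word co-occurs with each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        for word in text.lower().split():
            counts[word][label] += 1
    return counts


def predict(counts, text, default="neutral"):
    """Sum per-label word counts; fall back to `default` if no word is known."""
    votes = Counter()
    for word in text.lower().split():
        votes.update(counts.get(word, {}))
    return votes.most_common(1)[0][0] if votes else default


def accuracy(counts, examples):
    """Fraction of examples whose predicted label matches the annotation."""
    hits = sum(predict(counts, text) == label for text, label in examples)
    return hits / len(examples)


# Invented "older" annotations (hypothetical, not from the paper).
old_train = [
    ("vaccines protect children", "positive"),
    ("grateful for my flu shot", "positive"),
    ("vaccines cause harm", "negative"),
    ("the shot made me sick", "negative"),
]

# Invented "recent" annotations using vocabulary absent from old_train.
new_train = [
    ("rushed trial is unsafe", "negative"),
    ("mandate is overreach", "negative"),
    ("trial results are promising", "positive"),
    ("booster keeps family safe", "positive"),
]

# Invented recent texts used only for evaluation.
new_test = [
    ("rushed mandate", "negative"),
    ("unsafe overreach", "negative"),
    ("promising booster", "positive"),
    ("results are promising", "positive"),
]

stale_model = train(old_train)              # trained once, never updated
fresh_model = train(old_train + new_train)  # retrained with recent annotations

stale_acc = accuracy(stale_model, new_test)
fresh_acc = accuracy(fresh_model, new_test)
print(f"stale model on recent data: {stale_acc:.2f}")
print(f"retrained model on recent data: {fresh_acc:.2f}")
```

In this toy setup the stale model recognizes none of the recent vocabulary and falls back to the default label, while the retrained model does not. Real drift is usually more gradual, so a practical system would track accuracy on freshly annotated samples over time and retrain when it drops.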
Related papers
- Revealing COVID-19's Social Dynamics: Diachronic Semantic Analysis of Vaccine and Symptom Discourse on Twitter [12.75089285888253]
This paper proposes an unsupervised dynamic word embedding method to capture longitudinal semantic shifts in social media data without predefined anchor words.
Evaluated on a large COVID-19 Twitter dataset, the method reveals semantic evolution patterns of vaccine- and symptom-related entities across different pandemic stages.
arXiv Detail & Related papers (2024-10-10T20:15:28Z)
- Event Detection from Social Media for Epidemic Prediction [76.90779562626541]
We develop a framework to extract and analyze epidemic-related events from social media posts.
Experimentation reveals how ED models trained on COVID-based SPEED can effectively detect epidemic events for three unseen epidemics.
We show that reporting sharp increases in the extracted events by our framework can provide warnings 4-9 weeks earlier than the WHO epidemic declaration for Monkeypox.
arXiv Detail & Related papers (2024-04-02T06:31:17Z)
- When Infodemic Meets Epidemic: a Systematic Literature Review [3.3454373538792543]
Social media offer significant amounts of data that can be leveraged for bio-surveillance.
This systematic literature review provides a methodical overview of the integration of social media in different epidemic-related contexts.
arXiv Detail & Related papers (2022-10-03T21:04:30Z)
- Data-Centric Epidemic Forecasting: A Survey [56.99209141838794]
This survey delves into various data-driven methodological and practical advancements.
We enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting.
We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems.
arXiv Detail & Related papers (2022-07-19T16:15:11Z)
- Adherence to Misinformation on Social Media Through Socio-Cognitive and Group-Based Processes [79.79659145328856]
We argue that when misinformation proliferates, this happens because the social media environment enables adherence to misinformation.
We make the case that polarization and misinformation adherence are closely tied.
arXiv Detail & Related papers (2022-06-30T12:34:24Z)
- Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z)
- #StayHome or #Marathon? Social Media Enhanced Pandemic Surveillance on Spatial-temporal Dynamic Graphs [23.67939019353524]
COVID-19 has caused lasting damage to almost every domain in public health, society, and economy.
Existing studies rely on the aggregation of traditional statistical models and epidemic spread theory.
We propose a novel framework, Social Media enhAnced pandemic knowledge based on the extracted events and relationships.
arXiv Detail & Related papers (2021-08-08T15:46:05Z)
- Pulse of the Pandemic: Iterative Topic Filtering for Clinical Information Extraction from Social Media [1.5938324336156293]
The rapid evolution of the COVID-19 pandemic has underscored the need to quickly disseminate the latest clinical knowledge during a public-health emergency.
We present an unsupervised, iterative approach to mine clinically relevant information from social media data.
This approach identifies granular topics and tweets with high clinical relevance from a set of about 52 million COVID-19-related tweets.
arXiv Detail & Related papers (2021-02-13T01:01:04Z)
- Capturing social media expressions during the COVID-19 pandemic in Argentina and forecasting mental health and emotions [0.802904964931021]
We forecast mental health conditions and emotions of a given population during the COVID-19 pandemic in Argentina based on language expressions used in social media.
Mental health conditions and emotions are captured via markers, which link social media contents with lexicons.
arXiv Detail & Related papers (2021-01-12T15:15:31Z)
- Epidemic mitigation by statistical inference from contact tracing data [61.04165571425021]
We develop Bayesian inference methods to estimate the risk that an individual is infected.
We propose to use probabilistic risk estimation in order to optimize testing and quarantining strategies for the control of an epidemic.
Our approaches translate into fully distributed algorithms that only require communication between individuals who have recently been in contact.
arXiv Detail & Related papers (2020-09-20T12:24:45Z)
- COVI White Paper [67.04578448931741]
Contact tracing is an essential tool to change the course of the Covid-19 pandemic.
We present an overview of the rationale, design, ethical considerations and privacy strategy of COVI, a Covid-19 public peer-to-peer contact tracing and risk awareness mobile application developed in Canada.
arXiv Detail & Related papers (2020-05-18T07:40:49Z)