A Platform for Investigating Public Health Content with Efficient Concern Classification
- URL: http://arxiv.org/abs/2506.01308v1
- Date: Mon, 02 Jun 2025 04:36:13 GMT
- Title: A Platform for Investigating Public Health Content with Efficient Concern Classification
- Authors: Christopher Li, Rickard Stureborg, Bhuwan Dhingra, Jun Yang
- Abstract summary: We present ConcernScope, a platform that uses a teacher-student framework for knowledge transfer between large language models and light-weight classifiers. ConcernScope is built on top of a taxonomy of public health concerns and allows uploading massive files directly, automatically scraping specific URLs, and direct text editing. We demonstrate several applications of this platform: guided data exploration to find useful examples of common concerns found in online community datasets, identification of trends in concerns through an example time series analysis of 186,000 samples, and finding trends in topic frequency before and after significant events.
- Score: 9.523478337036588
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A recent rise in online content expressing concerns with public health initiatives has contributed to already stalled uptake of preemptive measures globally. Future public health efforts must attempt to understand such content, what concerns it may raise among readers, and how to effectively respond to it. To this end, we present ConcernScope, a platform that uses a teacher-student framework for knowledge transfer between large language models and light-weight classifiers to quickly and effectively identify the health concerns raised in a text corpus. The platform allows uploading massive files directly, automatically scraping specific URLs, and direct text editing. ConcernScope is built on top of a taxonomy of public health concerns. Intended for public health officials, we demonstrate several applications of this platform: guided data exploration to find useful examples of common concerns found in online community datasets, identification of trends in concerns through an example time series analysis of 186,000 samples, and finding trends in topic frequency before and after significant events.
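The teacher-student idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from ConcernScope: `teacher_label` is a keyword-rule stand-in for an expensive large-language-model call, the labels `safety`, `autonomy`, and `other` are hypothetical and not the paper's taxonomy, and the student is a small naive Bayes classifier chosen only because it needs no external libraries.

```python
from collections import Counter, defaultdict
import math


def teacher_label(text: str) -> str:
    """Hypothetical stand-in for an LLM teacher (keyword rules, not a real model)."""
    lowered = text.lower()
    if "side effect" in lowered or "unsafe" in lowered:
        return "safety"
    if "mandate" in lowered or "forced" in lowered:
        return "autonomy"
    return "other"


class NaiveBayesStudent:
    """Light-weight multinomial naive Bayes student trained on teacher labels."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter()            # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # Log prior plus log likelihood with add-one (Laplace) smoothing.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Step 1: the (slow) teacher labels an unlabeled corpus once.
unlabeled = [
    "the vaccine is unsafe for kids",
    "worried about side effect reports",
    "the mandate is government overreach",
    "nobody should be forced to comply",
]
teacher_labels = [teacher_label(t) for t in unlabeled]

# Step 2: the (fast) student learns from the teacher's labels.
student = NaiveBayesStudent().fit(unlabeled, teacher_labels)
```

After training, `student.predict(...)` can classify new text without calling the teacher again, which is the efficiency argument behind this kind of knowledge transfer.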
Related papers
- Systematic Classification of Studies Investigating Social Media Conversations about Long COVID Using a Novel Zero-Shot Transformer Framework [0.0]
Long COVID continues to challenge public health by affecting a considerable number of individuals who have recovered from acute SARS-CoV-2 infection. Social media has emerged as a vital resource for those seeking real-time information, peer support, and validating their health concerns related to Long COVID.
arXiv Detail & Related papers (2025-03-14T20:13:08Z) - SMP Challenge: An Overview and Analysis of Social Media Prediction Challenge [63.311045291016555]
Social Media Popularity Prediction (SMPP) is a crucial task that involves automatically predicting future popularity values of online posts.
This paper summarizes the challenging task, data, and research progress.
arXiv Detail & Related papers (2024-05-17T02:36:14Z) - Theme-driven Keyphrase Extraction to Analyze Social Media Discourse [3.2365983191405103]
This paper introduces a theme-driven keyphrase extraction framework tailored for social media.
We develop a novel data collection and curation framework for theme-driven keyphrase extraction.
We create MOUD-Keyphrase, the first dataset of its kind comprising human-annotated keyphrases from a Reddit community.
arXiv Detail & Related papers (2023-01-27T03:00:46Z) - Semantic Similarity Models for Depression Severity Estimation [53.72188878602294]
This paper presents an efficient semantic pipeline to study depression severity in individuals based on their social media writings.
We use test user sentences for producing semantic rankings over an index of representative training sentences corresponding to depressive symptoms and severity levels.
We evaluate our methods on two Reddit-based benchmarks, achieving 30% improvement over state of the art in terms of measuring depression severity.
arXiv Detail & Related papers (2022-11-14T18:47:26Z) - Forecasting User Interests Through Topic Tag Predictions in Online Health Communities [16.088586964818703]
This paper proposes an innovative approach to suggesting reliable information to participants in online communities.
We pose the problem of predicting topic tags that describe the future information needs of users based on their profiles.
The result is a variant of the collaborative information filtering or recommendation system tailored to the needs of users of online health communities.
arXiv Detail & Related papers (2022-11-05T00:09:45Z) - On Curating Responsible and Representative Healthcare Video Recommendations for Patient Education and Health Literacy: An Augmented Intelligence Approach [5.545277272908999]
One in three U.S. adults use the Internet to diagnose or learn about a health concern.
Health literacy divides can be exacerbated by algorithmic recommendations.
arXiv Detail & Related papers (2022-07-13T01:54:59Z) - Should we tweet this? Generative response modeling for predicting reception of public health messaging on Twitter [0.8399688944263843]
We collect two datasets of public health messages and their responses from Twitter relating to COVID-19 and Vaccines.
We introduce a predictive method which can be used to explore the potential reception of such messages.
Specifically, we harness a generative model (GPT-2) to directly predict probable future responses and demonstrate how it can be used to optimize expected reception of important health guidance.
arXiv Detail & Related papers (2022-04-09T01:56:46Z) - BEV-Net: Assessing Social Distancing Compliance by Joint People Localization and Geometric Reasoning [77.08836528980248]
Social distancing, an essential public health measure, has gained significant attention since the outbreak of the COVID-19 pandemic.
In this work, the problem of visual social distancing compliance assessment in busy public areas with wide field-of-view cameras is considered.
A dataset of crowd scenes with people annotations under a bird's eye view (BEV) and ground truth for metric distances is introduced.
A multi-branch network, BEV-Net, is proposed to localize individuals in world coordinates and identify high-risk regions where social distancing is violated.
arXiv Detail & Related papers (2021-10-10T23:56:37Z) - Health Status Prediction with Local-Global Heterogeneous Behavior Graph [69.99431339130105]
Estimation of health status can be achieved with various kinds of data streams continuously collected from wearable sensors.
We propose to model the behavior-related multi-source data streams with a local-global graph.
We conduct experiments on the StudentLife dataset, and extensive results demonstrate the effectiveness of our proposed model.
arXiv Detail & Related papers (2021-03-23T11:10:04Z) - Pulse of the Pandemic: Iterative Topic Filtering for Clinical Information Extraction from Social Media [1.5938324336156293]
The rapid evolution of the COVID-19 pandemic has underscored the need to quickly disseminate the latest clinical knowledge during a public-health emergency.
We present an unsupervised, iterative approach to mine clinically relevant information from social media data.
This approach identifies granular topics and tweets with high clinical relevance from a set of about 52 million COVID-19-related tweets.
arXiv Detail & Related papers (2021-02-13T01:01:04Z) - Assessing the Severity of Health States based on Social Media Posts [62.52087340582502]
We propose a multiview learning framework that models both the textual content and contextual information to assess the severity of the user's health state.
The diverse NLU views demonstrate their effectiveness on both tasks, as well as on individual diseases, in assessing a user's health.
arXiv Detail & Related papers (2020-09-21T03:45:14Z) - Leveraging Multi-Source Weak Social Supervision for Early Detection of Fake News [67.53424807783414]
Social media has greatly enabled people to participate in online activities at an unprecedented rate.
This unrestricted access also exacerbates the spread of misinformation and fake news online, which might cause confusion and chaos unless it is detected early and mitigated.
We jointly leverage the limited amount of clean data along with weak signals from social engagements to train deep neural networks in a meta-learning framework to estimate the quality of different weak instances.
Experiments on real-world datasets demonstrate that the proposed framework outperforms state-of-the-art baselines for early detection of fake news without using any user engagements at prediction time.
arXiv Detail & Related papers (2020-04-03T18:26:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.