UQ at #SMM4H 2023: ALEX for Public Health Analysis with Social Media
- URL: http://arxiv.org/abs/2309.04213v2
- Date: Tue, 12 Sep 2023 07:19:22 GMT
- Authors: Yan Jiang, Ruihong Qiu, Yi Zhang, Zi Huang
- Abstract summary: Current techniques for public health analysis involve popular models such as BERT and large language models (LLMs).
In this paper, a novel ALEX framework is proposed to improve the performance of public health analysis on social media.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As social media becomes increasingly popular, more and more activities
related to public health emerge. Current techniques for public health analysis
involve popular models such as BERT and large language models (LLMs). However,
training in-domain LLMs for public health is especially expensive. Furthermore,
in-domain datasets collected from social media are generally imbalanced. To
tackle these challenges, the data imbalance issue can be overcome by data
augmentation and balanced training, and the ability of LLMs can be effectively
utilized by prompting the model properly. In this paper, a novel ALEX framework
is proposed to improve the performance of public health analysis on social
media by adopting an LLM explanation mechanism. Results show that our ALEX
model achieved the best performance among all submissions in both Task 2 and
Task 4, with a high score in Task 1, in Social Media Mining for Health 2023
(SMM4H) [1]. Our code has been released at
https://github.com/YanJiangJerry/ALEX.
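The abstract mentions overcoming data imbalance via balanced training. As a minimal illustrative sketch (not the actual ALEX implementation, whose details are in the linked repository), one common balanced-training ingredient is inverse-frequency class weighting, where rare classes receive proportionally larger loss weights:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights inversely proportional to class frequency.

    A standard way to counter class imbalance: rare classes get larger
    weights so the training loss treats them on par with frequent ones.
    Uses the heuristic weight_c = total / (n_classes * count_c).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    return {c: total / (n_classes * n) for c, n in counts.items()}

# Hypothetical skewed label distribution for a health-tweet task:
# 90 "other" posts vs. 10 "adverse_event" posts.
labels = ["other"] * 90 + ["adverse_event"] * 10
weights = inverse_frequency_weights(labels)
# The minority class receives a 9x larger weight than the majority class.
```

These weights would typically be passed to a weighted cross-entropy loss during fine-tuning; the label names above are purely illustrative.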
Related papers
- Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation [51.44040615856536]
  This paper analyzes large language models' ability to simulate social media engagement through action-guided response generation.
  We benchmark GPT-4o-mini, O1-mini, and DeepSeek-R1 on social media engagement simulation regarding a major societal event.
  arXiv Detail & Related papers (2025-02-17T17:43:08Z)
- Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation [55.21013307734612]
  AoPS-Instruct is a dataset of more than 600,000 high-quality QA pairs.
  LiveAoPSBench is an evolving evaluation set with timestamps, derived from the latest forum data.
  Our work presents a scalable approach to creating and maintaining large-scale, high-quality datasets for advanced math reasoning.
  arXiv Detail & Related papers (2025-01-24T06:39:38Z)
- Question Answering on Patient Medical Records with Private Fine-Tuned LLMs [1.8524621910043437]
  Large Language Models (LLMs) enable semantic question answering (QA) over medical data.
  Ensuring privacy and compliance requires edge and private deployments of LLMs.
  We evaluate privately hosted, fine-tuned LLMs against benchmark models such as GPT-4 and GPT-4o.
  arXiv Detail & Related papers (2025-01-23T14:13:56Z)
- SS-GEN: A Social Story Generation Framework with Large Language Models [87.11067593512716]
  Children with Autism Spectrum Disorder (ASD) often misunderstand social situations and struggle to participate in daily routines.
  Social Stories are traditionally crafted by psychology experts under strict constraints to address these challenges.
  We propose SS-GEN, a framework to generate Social Stories in real-time with broad coverage.
  arXiv Detail & Related papers (2024-06-22T00:14:48Z)
- Retrieval Augmented Thought Process for Private Data Handling in Healthcare [53.89406286212502]
  We introduce the Retrieval-Augmented Thought Process (RATP), which formulates the thought generation of Large Language Models (LLMs).
  On a private dataset of electronic medical records, RATP achieves 35% additional accuracy compared to in-context retrieval-augmented generation for the question-answering task.
  arXiv Detail & Related papers (2024-02-12T17:17:50Z)
- Explorers at #SMM4H 2023: Enhancing BERT for Health Applications through Knowledge and Model Fusion [3.386401892906348]
  Social media has become a valuable data resource for studying human health.
  This paper outlines the methods in our participation in the #SMM4H 2023 Shared Tasks.
  arXiv Detail & Related papers (2023-12-17T08:52:05Z)
- Countering Misinformation via Emotional Response Generation [15.383062216223971]
  The proliferation of misinformation on social media platforms (SMPs) poses a significant danger to public health, social cohesion, and democracy.
  Previous research has shown how social correction can be an effective way to curb misinformation.
  We present VerMouth, the first large-scale dataset comprising roughly 12 thousand claim-response pairs.
  arXiv Detail & Related papers (2023-11-17T15:37:18Z)
- Balanced and Explainable Social Media Analysis for Public Health with Large Language Models [13.977401672173533]
  Current techniques for public health analysis involve popular models such as BERT and large language models (LLMs).
  To tackle these challenges, the data imbalance issue can be overcome by sophisticated data augmentation methods for social media datasets.
  In this paper, a novel ALEX framework is proposed for social media analysis on public health.
  arXiv Detail & Related papers (2023-09-12T04:15:34Z)
- ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
  We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
  To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
  Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
  arXiv Detail & Related papers (2023-05-23T16:40:07Z)
- Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model [9.070482285386387]
  We present PHS-BERT, a transformer-based language model to identify tasks related to public health surveillance on social media.
  Compared with existing PLMs that are mainly evaluated on limited tasks, PHS-BERT achieved state-of-the-art performance on all 25 tested datasets.
  arXiv Detail & Related papers (2022-04-09T18:01:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.