Balanced and Explainable Social Media Analysis for Public Health with Large Language Models
- URL: http://arxiv.org/abs/2309.05951v1
- Date: Tue, 12 Sep 2023 04:15:34 GMT
- Title: Balanced and Explainable Social Media Analysis for Public Health with Large Language Models
- Authors: Yan Jiang, Ruihong Qiu, Yi Zhang, Peng-Fei Zhang
- Abstract summary: Current techniques for public health analysis involve popular models such as BERT and large language models (LLMs).
To tackle the challenges these models face, the data imbalance issue can be overcome by sophisticated data augmentation methods for social media datasets.
In this paper, a novel ALEX framework is proposed for social media analysis on public health.
- Score: 13.977401672173533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As social media becomes increasingly popular, more and more public health
activities emerge on it, which is valuable for pandemic monitoring and government
decision-making. Current techniques for public health analysis involve popular
models such as BERT and large language models (LLMs). Although recent progress
in LLMs has shown a strong ability to comprehend knowledge when fine-tuned
on specific domain datasets, the cost of training an in-domain LLM for every
specific public health task is prohibitively high. Furthermore, in-domain
datasets collected from social media are generally highly imbalanced, which
hinders the efficiency of LLM tuning. To tackle these challenges, the data
imbalance issue can be overcome by sophisticated data augmentation methods for
social media datasets, and the ability of LLMs can be effectively utilised by
prompting the model properly. In light of the above discussion, in this paper,
a novel ALEX framework is proposed for social media analysis on public health.
Specifically, an augmentation pipeline is developed to resolve the data
imbalance issue, and an LLM explanation mechanism is proposed by prompting an
LLM with the predicted results from BERT models. Extensive experiments
conducted on three tasks of the Social Media Mining for Health 2023 (SMM4H)
competition, with first place achieved in two of them, demonstrate the superior
performance of the proposed ALEX method. Our code has been released at
https://github.com/YanJiangJerry/ALEX.
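The abstract names two mechanisms: an augmentation pipeline for the imbalance problem, and an explanation step that prompts an LLM with a BERT model's predictions. The Python sketch below is a minimal illustration of both ideas, with assumptions made explicit: naive random oversampling stands in for the paper's actual augmentation pipeline, and the prompt wording and function names are hypothetical rather than taken from the ALEX code (see the repository above for the real implementation).

```python
import random
from collections import Counter

# Toy stand-in for the paper's augmentation pipeline: naive oversampling of
# minority classes. The real ALEX pipeline is more sophisticated; this only
# illustrates the balancing goal.
def oversample(examples):
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append(text)
    target = max(len(texts) for texts in by_label.values())
    balanced = []
    for label, texts in by_label.items():
        extra = [random.choice(texts) for _ in range(target - len(texts))]
        balanced.extend((t, label) for t in texts + extra)
    random.shuffle(balanced)
    return balanced

# Hypothetical prompt template for the LLM explanation mechanism: the BERT
# classifier's prediction is embedded in the prompt so the LLM judges and
# explains it instead of classifying from scratch.
def build_explanation_prompt(post, predicted_label, confidence):
    return (
        "A BERT classifier labelled the following social media post for a "
        "public health task.\n"
        f"Post: {post}\n"
        f"Predicted label: {predicted_label} (confidence {confidence:.2f})\n"
        "Is this label plausible? Answer yes or no, then explain briefly."
    )

if __name__ == "__main__":
    data = [
        ("got my covid booster today", "health"),
        ("my cough still won't go away", "health"),
        ("great weather this weekend", "other"),
    ]
    print(Counter(label for _, label in oversample(data)))  # balanced counts
    print(build_explanation_prompt("got my covid booster today", "health", 0.91))
```

The design point worth noting is that the LLM is not asked to classify from scratch: it receives the classifier's label and confidence and only judges and explains them, which is how the framework can combine BERT's task accuracy with the LLM's explanatory ability.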
Related papers
- Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation [51.44040615856536]
This paper analyzes large language models' ability to simulate social media engagement through action-guided response generation.
We benchmark GPT-4o-mini, O1-mini, and DeepSeek-R1 on social media engagement simulation regarding a major societal event.
arXiv Detail & Related papers (2025-02-17T17:43:08Z)
- Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation [55.21013307734612]
AoPS-Instruct is a dataset of more than 600,000 high-quality QA pairs.
LiveAoPSBench is an evolving evaluation set with timestamps, derived from the latest forum data.
Our work presents a scalable approach to creating and maintaining large-scale, high-quality datasets for advanced math reasoning.
arXiv Detail & Related papers (2025-01-24T06:39:38Z)
- Evaluating the Performance of Large Language Models in Scientific Claim Detection and Classification [0.0]
This study evaluates the efficacy of Large Language Models (LLMs) as innovative solutions for mitigating misinformation on platforms like Twitter.
LLMs offer a pre-trained, adaptable approach that bypasses the extensive training and overfitting issues associated with traditional machine learning models.
We present a comparative analysis of LLMs' performance using a specialized dataset and propose a framework for their application in public health communication.
arXiv Detail & Related papers (2024-12-21T05:02:26Z)
- A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely; yet they have demonstrated biases that perpetuate societal inequalities.
Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning.
We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z)
- Social Debiasing for Fair Multi-modal LLMs [55.8071045346024]
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.
However, these models often inherit severe social biases from their training datasets, leading to unfair predictions based on attributes like race and gender.
This paper addresses the issue of social biases in MLLMs by i) introducing a comprehensive Counterfactual dataset with Multiple Social Concepts (CMSC) and ii) proposing an Anti-Stereotype Debiasing strategy (ASD).
arXiv Detail & Related papers (2024-08-13T02:08:32Z)
- ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs [65.9625653425636]
Large Language Models (LLMs) exhibit harmful social biases.
This work introduces a novel approach utilizing ChatGPT to generate synthetic training data.
arXiv Detail & Related papers (2024-02-19T01:28:48Z)
- Retrieval Augmented Thought Process for Private Data Handling in Healthcare [53.89406286212502]
We introduce the Retrieval-Augmented Thought Process (RATP) to structure the thought generation of Large Language Models (LLMs).
On a private dataset of electronic medical records, RATP achieves 35% higher accuracy than in-context retrieval-augmented generation on the question-answering task.
arXiv Detail & Related papers (2024-02-12T17:17:50Z)
- Countering Misinformation via Emotional Response Generation [15.383062216223971]
The proliferation of misinformation on social media platforms (SMPs) poses a significant danger to public health, social cohesion and democracy.
Previous research has shown how social correction can be an effective way to curb misinformation.
We present VerMouth, the first large-scale dataset comprising roughly 12,000 claim-response pairs.
arXiv Detail & Related papers (2023-11-17T15:37:18Z)
- Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in the Fight Against Misinformation [11.323961700172175]
FACT-GPT is a framework designed to automate the claim matching phase of fact-checking using Large Language Models.
This framework identifies new social media content that either supports or contradicts claims previously debunked by fact-checkers.
We evaluated FACT-GPT on an extensive dataset of social media content related to public health; a minimal sketch of this claim-matching setup follows after this list.
arXiv Detail & Related papers (2023-10-13T16:21:07Z)
- A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics [32.10937977924507]
The utilization of large language models (LLMs) in the Healthcare domain has generated both excitement and concern.
This survey outlines the capabilities of the currently developed LLMs for Healthcare and explicates their development process.
arXiv Detail & Related papers (2023-10-09T13:15:23Z)
- UQ at #SMM4H 2023: ALEX for Public Health Analysis with Social Media [33.081637097464146]
Current techniques for public health analysis involve popular models such as BERT and large language models (LLMs).
In this paper, a novel ALEX framework is proposed to improve the performance of public health analysis on social media.
arXiv Detail & Related papers (2023-09-08T08:54:55Z)
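As referenced above, the FACT-GPT entry describes identifying posts that support or contradict previously debunked claims. Below is a minimal sketch of such a claim-matching call, assuming a hypothetical three-way label set, prompt wording, and a caller-supplied query_llm callable; none of these are taken from the FACT-GPT paper.

```python
# Hypothetical claim-matching labels; FACT-GPT's actual label set and prompt
# wording may differ.
LABELS = ("SUPPORTS", "CONTRADICTS", "UNRELATED")

def claim_matching_prompt(post, debunked_claim):
    """Build a prompt asking whether a new post matches a debunked claim."""
    return (
        f"Debunked claim: {debunked_claim}\n"
        f"Social media post: {post}\n"
        "Does the post support or contradict the claim? "
        f"Answer with exactly one of: {', '.join(LABELS)}."
    )

def match_claim(post, debunked_claim, query_llm):
    """query_llm: any callable that sends a prompt string to an LLM and
    returns its text response (e.g. a thin wrapper around an API client)."""
    answer = query_llm(claim_matching_prompt(post, debunked_claim))
    answer = answer.strip().upper()
    # Fall back to UNRELATED when the model's output cannot be parsed.
    return answer if answer in LABELS else "UNRELATED"
```

Constraining the model to a fixed label vocabulary and falling back on unparsable output keeps a free-form LLM usable inside a downstream fact-checking pipeline.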
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the listed information and is not responsible for any consequences arising from its use.