Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media
- URL: http://arxiv.org/abs/2309.03564v3
- Date: Sun, 9 Jun 2024 12:49:52 GMT
- Title: Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media
- Authors: Hongzhi Qi, Qing Zhao, Jianqiang Li, Changwei Song, Wei Zhai, Dan Luo, Shuo Liu, Yi Jing Yu, Fan Wang, Huijing Zou, Bing Xiang Yang, Guanghui Fu
- Abstract summary: We introduce two novel datasets from Chinese social media: SOS-HL-1K for suicidal risk classification and SocialCD-3K for cognitive distortion detection.
We propose a comprehensive evaluation using two supervised learning methods and eight large language models (LLMs) on the proposed datasets.
- Score: 23.49883142003182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: On social media, users often express their personal feelings, which may exhibit cognitive distortions or even suicidal tendencies on certain specific topics. Early recognition of these signs is critical for effective psychological intervention. In this paper, we introduce two novel datasets from Chinese social media: SOS-HL-1K for suicidal risk classification and SocialCD-3K for cognitive distortion detection. The SOS-HL-1K dataset contains 1,249 posts, and SocialCD-3K is a multi-label classification dataset containing 3,407 posts. We propose a comprehensive evaluation using two supervised learning methods and eight large language models (LLMs) on the proposed datasets. From the prompt engineering perspective, we experimented with two types of prompt strategies, comprising four zero-shot and five few-shot strategies. We also evaluated the performance of the LLMs after fine-tuning on the proposed tasks. The experimental results show that there is still a substantial gap between LLMs relying only on prompt engineering and supervised learning. In the suicide classification task, this gap is 6.95 percentage points in F1-score, while in the cognitive distortion task it is even more pronounced, reaching 31.53 percentage points. After fine-tuning, however, this difference shrinks considerably: in the suicide and cognitive distortion classification tasks, the gap decreases to 4.31 and 3.14 percentage points, respectively. This research highlights the potential of LLMs in psychological contexts, but supervised learning remains necessary for the more challenging tasks. All datasets and code are made available.
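To make the evaluated prompting strategies concrete, here is a minimal sketch of zero-shot and few-shot prompt construction for the suicidal-risk classification task. The prompt wording, label verbalizations, and example posts are illustrative assumptions for an English-language demo, not the prompts released with the paper.
```python
# Illustrative only: prompt wording, labels, and demo posts are assumptions;
# the paper's released (Chinese) prompts and label set may differ.

ZERO_SHOT_TEMPLATE = (
    "You are a mental-health annotation assistant.\n"
    "Classify the following social media post as 'high suicide risk' or "
    "'low suicide risk'. Answer with the label only.\n"
    "Post: {post}\nLabel:"
)

FEW_SHOT_TEMPLATE = (
    "You are a mental-health annotation assistant.\n"
    "Classify each post as 'high suicide risk' or 'low suicide risk'.\n"
    "{examples}\nPost: {post}\nLabel:"
)

def build_examples(labeled_posts):
    """Format (post, label) pairs as in-context demonstrations."""
    return "\n".join(f"Post: {p}\nLabel: {l}" for p, l in labeled_posts)

demos = [
    ("I failed the exam again, but tomorrow is another day.", "low suicide risk"),
    ("I have said my goodbyes and plan to end things tonight.", "high suicide risk"),
]
prompt = FEW_SHOT_TEMPLATE.format(
    examples=build_examples(demos),
    post="No one would even notice if I disappeared.",
)
print(prompt)  # this string is what gets sent to the LLM under evaluation
```
The few-shot variant differs from the zero-shot one mainly in the in-context demonstrations; the paper's four zero-shot and five few-shot strategies vary details of this setup.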
Related papers
- Decoupling the Class Label and the Target Concept in Machine Unlearning [81.69857244976123]
Machine unlearning aims to adjust a trained model to approximate a retrained one that excludes a portion of training data.
Previous studies showed that class-wise unlearning is successful in forgetting the knowledge of a target class.
We propose a general framework, namely TARget-aware Forgetting (TARF).
arXiv Detail & Related papers (2024-06-12T14:53:30Z)
- ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
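As a rough illustration of that transfer-learning setup, here is a minimal sketch of fine-tuning T5-small to emit class labels as text. The training pairs and label strings are hypothetical, since the #SMM4H task data is distributed only to participants.
```python
# A minimal sketch, assuming hypothetical tweet/label pairs; the actual
# #SMM4H label sets and preprocessing differ.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Hypothetical training pairs: tweet text -> label verbalized as text.
train_pairs = [
    ("Crowded parks always make my anxiety spike.", "positive"),
    ("Spent the afternoon outside and felt calm.", "negative"),
]

model.train()
for tweet, label in train_pairs:
    inputs = tokenizer("classify: " + tweet, return_tensors="pt")
    targets = tokenizer(label, return_tensors="pt").input_ids
    loss = model(**inputs, labels=targets).loss  # seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference: generate the label string and map it back to a class.
model.eval()
enc = tokenizer("classify: " + train_pairs[0][0], return_tensors="pt")
pred_ids = model.generate(**enc, max_new_tokens=4)
print(tokenizer.decode(pred_ids[0], skip_special_tokens=True))
```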
arXiv Detail & Related papers (2024-04-30T17:06:20Z)
- SOS-1K: A Fine-grained Suicide Risk Classification Dataset for Chinese Social Media Analysis [22.709733830774788]
This study presents a Chinese social media dataset designed for fine-grained suicide risk classification.
Seven pre-trained models were evaluated on two tasks: high versus low suicide risk classification, and fine-grained suicide risk classification on a scale of 0 to 10.
Deep learning models show good performance in distinguishing between high and low suicide risk, with the best model achieving an F1 score of 88.39%.
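For reference, the reported 88.39% is a standard F1 computation over the binary high/low risk labels; a minimal sketch with made-up predictions (1 = high risk, 0 = low risk):
```python
# Made-up labels purely to illustrate the metric; for the fine-grained
# 0-10 task, a macro or weighted average over classes is a common choice.
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]
print(f"binary F1 (positive = high risk): {f1_score(y_true, y_pred):.4f}")
```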
arXiv Detail & Related papers (2024-04-19T06:58:51Z)
- AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts [27.240795549935463]
We gathered data from social media and established the task of extracting cognitive pathways.
We structured a text summarization task to help psychotherapists quickly grasp the essential information.
Our experiments evaluate the performance of deep learning and large language models.
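To make the summarization component concrete, the sketch below runs an off-the-shelf abstractive summarizer over a hypothetical English post; the authors work on Chinese social media with their own trained models, so the model choice and input here are assumptions.
```python
# Illustrative only: the model and post are assumptions, not the paper's setup.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
post = (
    "Ever since I failed the entrance exam I keep telling myself I am "
    "worthless, that everyone secretly despises me, and that nothing I "
    "try will ever work out."
)
summary = summarizer(post, max_length=30, min_length=5, do_sample=False)
print(summary[0]["summary_text"])  # condensed view for the psychotherapist
```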
arXiv Detail & Related papers (2024-04-17T14:55:27Z)
- Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus [99.33091772494751]
Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields.
However, LLMs are prone to hallucinating untruthful or nonsensical outputs that fail to meet user expectations.
We propose a novel reference-free, uncertainty-based method for detecting hallucinations in LLMs.
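A generic uncertainty baseline in this spirit scores a model's output by its own token-level log-probabilities and flags low-confidence tokens. The sketch below is such a baseline, not the paper's method, which adds a stronger focus (e.g., weighting informative keywords).
```python
# A simple reference-free uncertainty baseline: high per-token negative
# log-likelihood under the model itself is treated as a hallucination signal.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

text = "The Eiffel Tower was completed in 1889 in Paris."
ids = tok(text, return_tensors="pt").input_ids
with torch.no_grad():
    logits = lm(ids).logits

# Negative log-prob of each actual next token under the model.
logp = torch.log_softmax(logits[:, :-1], dim=-1)
nll = -logp.gather(2, ids[:, 1:].unsqueeze(-1)).squeeze(-1)

for token, score in zip(tok.convert_ids_to_tokens(ids[0, 1:].tolist()), nll[0]):
    flag = "  <-- low confidence" if score > 6.0 else ""  # ad-hoc threshold (nats)
    print(f"{token:>12s}  nll={score:.2f}{flag}")
```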
arXiv Detail & Related papers (2023-11-22T08:39:17Z)
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimizes the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
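A sketch of one plausible repeller-attractor formulation over learnable class anchors: pull each embedding toward its class anchor in L2, and push it at least a margin away from every other anchor. The exact form of the published class anchor margin loss may differ.
```python
# A minimal sketch, assuming this attract/repel formulation; it is not
# guaranteed to match the paper's exact loss.
import torch
import torch.nn as nn

class ClassAnchorMarginLoss(nn.Module):
    def __init__(self, num_classes: int, dim: int, margin: float = 1.0):
        super().__init__()
        self.anchors = nn.Parameter(torch.randn(num_classes, dim))
        self.margin = margin

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor):
        # L2 distance from each embedding to every class anchor: (B, C).
        dists = torch.cdist(embeddings, self.anchors)
        # Attractor: distance to the anchor of the true class.
        attract = dists.gather(1, labels.unsqueeze(1)).squeeze(1)
        # Repeller: hinge on anchors of other classes closer than the margin.
        mask = torch.ones_like(dists, dtype=torch.bool)
        mask.scatter_(1, labels.unsqueeze(1), False)
        repel = torch.relu(self.margin - dists[mask].view(len(labels), -1))
        return attract.mean() + repel.mean()

loss_fn = ClassAnchorMarginLoss(num_classes=10, dim=128)
emb = torch.randn(32, 128, requires_grad=True)
labels = torch.randint(0, 10, (32,))
print(loss_fn(emb, labels))
```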
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
- Evaluation of ChatGPT for NLP-based Mental Health Applications [0.0]
Large language models (LLMs) have been successful in several natural language understanding tasks.
In this work, we report the performance of LLM-based ChatGPT in three text-based mental health classification tasks.
arXiv Detail & Related papers (2023-03-28T04:47:43Z)
- A Quantitative and Qualitative Analysis of Suicide Ideation Detection using Deep Learning [5.192118773220605]
This paper replicated competitive social media-based suicidality detection/prediction models.
We evaluated the feasibility of detecting suicidal ideation using multiple datasets and different state-of-the-art deep learning models.
arXiv Detail & Related papers (2022-06-17T10:23:37Z)
- Detecting Potentially Harmful and Protective Suicide-related Content on Twitter: A Machine Learning Approach [0.1582078748632554]
We apply machine learning methods to automatically label large quantities of Twitter data.
Two deep learning models achieved the best performance in two classification tasks.
This work enables future large-scale investigations on harmful and protective effects of various kinds of social media content on suicide rates and on help-seeking behavior.
arXiv Detail & Related papers (2021-12-09T09:35:48Z)
- LID 2020: The Learning from Imperfect Data Challenge Results [242.86700551532272]
The Learning from Imperfect Data workshop aims to inspire and facilitate research on developing novel approaches.
We organized three challenges to identify state-of-the-art approaches in the weakly supervised learning setting.
This technical report summarizes the highlights from the challenge.
arXiv Detail & Related papers (2020-10-17T13:06:12Z)
- Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.
We perform experiments on two standard fairness datasets (Adult, and Communities and Crime), as well as on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset.
In all four of these tasks, the F-measure objective yields improved micro-F1 scores, with absolute improvements of up to 8%, compared to models trained with the cross-entropy loss function.
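One common differentiable surrogate replaces the hard true/false-positive counts with probability-weighted counts, giving a "soft F1" that backpropagation can maximize directly. The sketch below shows this surrogate; the paper's exact approximation may differ.
```python
# A minimal "soft F1" sketch: counts become probability-weighted sums, so the
# objective is differentiable; this is one standard surrogate, not necessarily
# the paper's exact formulation.
import torch

def soft_f1_loss(probs: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """probs: sigmoid outputs in [0, 1]; targets: binary labels, same shape."""
    tp = (probs * targets).sum()
    fp = (probs * (1 - targets)).sum()
    fn = ((1 - probs) * targets).sum()
    soft_f1 = 2 * tp / (2 * tp + fp + fn + 1e-8)
    return 1.0 - soft_f1  # minimizing the loss maximizes soft F1

logits = torch.randn(16, requires_grad=True)
targets = torch.randint(0, 2, (16,)).float()
loss = soft_f1_loss(torch.sigmoid(logits), targets)
loss.backward()  # gradients flow through the probability-weighted counts
print(float(loss))
```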
arXiv Detail & Related papers (2020-08-08T03:02:27Z)