Evaluation of ChatGPT for NLP-based Mental Health Applications
- URL: http://arxiv.org/abs/2303.15727v1
- Date: Tue, 28 Mar 2023 04:47:43 GMT
- Title: Evaluation of ChatGPT for NLP-based Mental Health Applications
- Authors: Bishal Lamichhane
- Abstract summary: Large language models (LLMs) have been successful in several natural language understanding tasks.
In this work, we report the performance of LLM-based ChatGPT in three text-based mental health classification tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have been successful in several natural language
understanding tasks and could be relevant for natural language processing
(NLP)-based mental health application research. In this work, we report the
performance of LLM-based ChatGPT (with gpt-3.5-turbo backend) in three
text-based mental health classification tasks: stress detection (2-class
classification), depression detection (2-class classification), and suicidality
detection (5-class classification). We obtained annotated social media posts
for the three classification tasks from public datasets. The ChatGPT API then
classified the social media posts using an input classification prompt. We
obtained F1 scores of 0.73, 0.86, and 0.37 for stress detection, depression
detection, and suicidality detection, respectively. A baseline model that
always predicted the dominant class resulted in F1 scores of 0.35, 0.60, and
0.19. The zero-shot classification accuracy obtained with ChatGPT indicates a
potential use of language models for mental health classification tasks.
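To make the zero-shot setup above concrete, here is a minimal sketch of prompt-based classification with gpt-3.5-turbo scored against a dominant-class baseline, assuming the OpenAI Python client (v1.x) and scikit-learn; the prompt wording, the label names, and the use of macro-averaged F1 are illustrative assumptions rather than the paper's exact protocol.

```python
# Minimal sketch of zero-shot classification with gpt-3.5-turbo, assuming the
# OpenAI Python client (v1.x) and scikit-learn. Prompt wording, label names,
# and macro-averaged F1 are illustrative choices, not the paper's exact setup.
from collections import Counter

from openai import OpenAI
from sklearn.metrics import f1_score

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["not stressed", "stressed"]  # 2-class stress detection as an example

def classify_post(post: str) -> str:
    """Ask the model to pick exactly one label for a social media post."""
    prompt = (
        "Classify the following social media post as one of: "
        + ", ".join(LABELS)
        + ". Answer with the label only.\n\nPost: " + post
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    answer = response.choices[0].message.content.strip().lower()
    # Map the free-text reply back onto the label set; default to the first label.
    return next((label for label in LABELS if label in answer), LABELS[0])

def evaluate(posts: list[str], true_labels: list[str]) -> dict[str, float]:
    """Score ChatGPT predictions against a dominant-class baseline with F1."""
    preds = [classify_post(p) for p in posts]
    majority = Counter(true_labels).most_common(1)[0][0]
    baseline_preds = [majority] * len(true_labels)
    return {
        "chatgpt_f1": f1_score(true_labels, preds, average="macro"),
        "baseline_f1": f1_score(true_labels, baseline_preds, average="macro"),
    }
```

The stress-detection labels stand in for any of the three tasks; the suicidality task would swap in a five-label list and reuse the same evaluation loop.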
Related papers
- Evaluating the Effectiveness of the Foundational Models for Q&A Classification in Mental Health care [0.18416014644193068]
Pre-trained Language Models (PLMs) have the potential to transform mental health support.
This study evaluates the effectiveness of PLMs for classification of Questions and Answers in the domain of mental health care.
arXiv Detail & Related papers (2024-06-23T00:11:07Z)
- ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
arXiv Detail & Related papers (2024-04-30T17:06:20Z)
- Mental Health Diagnosis in the Digital Age: Harnessing Sentiment Analysis on Social Media Platforms upon Ultra-Sparse Feature Content [3.6195994708545016]
We propose a novel semantic feature preprocessing technique with a three-fold structure.
With enhanced semantic features, we train a machine learning model to predict and classify mental disorders.
Our methods, when compared to seven benchmark models, demonstrate significant performance improvements.
arXiv Detail & Related papers (2023-11-09T00:15:06Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
- Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media [23.49883142003182]
We introduce two novel datasets from Chinese social media: SOS-HL-1K for suicidal risk classification and SocialCD-3K for cognitive distortions detection.
We propose a comprehensive evaluation using two supervised learning methods and eight large language models (LLMs) on the proposed datasets.
arXiv Detail & Related papers (2023-09-07T08:50:46Z)
- Automatically measuring speech fluency in people with aphasia: first achievements using read-speech data [55.84746218227712]
This study aims at assessing the relevance of a signal processing algorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
arXiv Detail & Related papers (2023-08-09T07:51:40Z)
- Pushing the Limits of ChatGPT on NLP Tasks [79.17291002710517]
Despite the success of ChatGPT, its performance on most NLP tasks is still well below the supervised baselines.
In this work, we look into the causes and identify the factors behind this subpar performance.
We propose a collection of general modules to address these issues, in an attempt to push the limits of ChatGPT on NLP tasks.
arXiv Detail & Related papers (2023-06-16T09:40:05Z)
- Systematic Evaluation of GPT-3 for Zero-Shot Personality Estimation [12.777659013330823]
GPT-3 is used to estimate the Big 5 personality traits from users' social media posts.
We find that GPT-3's performance is somewhat close to that of an existing pre-trained SotA model for broad classification.
We analyze where GPT-3 performs better, as well as worse, than a pretrained lexical model, illustrating systematic errors.
arXiv Detail & Related papers (2023-06-01T22:43:37Z)
- To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot.
Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community.
It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)
- Detecting Reddit Users with Depression Using a Hybrid Neural Network SBERT-CNN [18.32536789799511]
Depression is a widespread mental health issue, affecting an estimated 3.8% of the global population.
We propose a hybrid deep learning model that combines a pretrained sentence BERT (SBERT) and a convolutional neural network (CNN) to detect individuals with depression from their Reddit posts (see the sketch after this list).
The model achieved an accuracy of 0.86 and an F1 score of 0.86, outperforming the best documented result from other machine learning models in the literature (F1 score of 0.79).
arXiv Detail & Related papers (2023-02-03T06:22:18Z)
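For the SBERT-CNN entry above, the following sketch shows one plausible way to pair pretrained sentence embeddings with a small convolutional classifier over a user's post history, using sentence-transformers and PyTorch; the embedding model, kernel size, and layer dimensions are assumptions rather than the authors' reported architecture.

```python
# One plausible SBERT + CNN pairing for user-level depression detection:
# embed each Reddit post with a pretrained SBERT model, run a 1D CNN over the
# per-user sequence of post embeddings, and classify with a linear head.
# The model name, kernel size, and dimensions below are illustrative assumptions.
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer

sbert = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional sentence embeddings
EMB_DIM = 384

class PostCNN(nn.Module):
    def __init__(self, emb_dim: int = EMB_DIM, n_filters: int = 64):
        super().__init__()
        # Convolve along the post sequence; embedding dimensions act as channels.
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.head = nn.Linear(n_filters, 2)  # depressed vs. not depressed

    def forward(self, post_embeddings: torch.Tensor) -> torch.Tensor:
        # post_embeddings: (batch, num_posts, emb_dim) -> (batch, emb_dim, num_posts)
        x = post_embeddings.transpose(1, 2)
        x = torch.relu(self.conv(x))
        x = x.max(dim=2).values  # max-pool over the post dimension
        return self.head(x)      # raw logits, suitable for CrossEntropyLoss

# Usage sketch: embed one user's posts and compute class logits.
posts = ["i can't sleep anymore", "nothing feels worth doing"]
embeddings = torch.tensor(sbert.encode(posts)).unsqueeze(0)  # (1, num_posts, 384)
logits = PostCNN()(embeddings)
```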
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.