AI in Mental Health: Emotional and Sentiment Analysis of Large Language Models' Responses to Depression, Anxiety, and Stress Queries
- URL: http://arxiv.org/abs/2508.11285v1
- Date: Fri, 15 Aug 2025 07:47:10 GMT
- Title: AI in Mental Health: Emotional and Sentiment Analysis of Large Language Models' Responses to Depression, Anxiety, and Stress Queries
- Authors: Arya VarastehNezhad, Reza Tavasoli, Soroush Elyasi, MohammadHossein LotfiNia, Hamed Farbeh
- Abstract summary: Depression, anxiety, and stress are widespread mental health concerns that increasingly drive individuals to seek information from Large Language Models (LLMs). This study investigates how eight LLMs reply to twenty pragmatic questions about depression, anxiety, and stress when those questions are framed for six user profiles. The models generated 2,880 answers, which we scored for sentiment and emotions using state-of-the-art tools.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Depression, anxiety, and stress are widespread mental health concerns that increasingly drive individuals to seek information from Large Language Models (LLMs). This study investigates how eight LLMs (Claude Sonnet, Copilot, Gemini Pro, GPT-4o, GPT-4o mini, Llama, Mixtral, and Perplexity) reply to twenty pragmatic questions about depression, anxiety, and stress when those questions are framed for six user profiles (baseline, woman, man, young, old, and university student). The models generated 2,880 answers, which we scored for sentiment and emotions using state-of-the-art tools. Our analysis revealed that optimism, fear, and sadness dominated the emotional landscape across all outputs, with neutral sentiment maintaining consistently high values. Gratitude, joy, and trust appeared at moderate levels, while emotions such as anger, disgust, and love were rarely expressed. The choice of LLM significantly influenced emotional expression patterns. Mixtral exhibited the highest levels of negative emotions including disapproval, annoyance, and sadness, while Llama demonstrated the most optimistic and joyful responses. The type of mental health condition dramatically shaped emotional responses: anxiety prompts elicited extraordinarily high fear scores (0.974), depression prompts generated elevated sadness (0.686) and the highest negative sentiment, while stress-related queries produced the most optimistic responses (0.755) with elevated joy and trust. In contrast, demographic framing of queries produced only marginal variations in emotional tone. Statistical analyses confirmed significant model-specific and condition-specific differences, while demographic influences remained minimal. These findings highlight the critical importance of model selection in mental health applications, as each LLM exhibits a distinct emotional signature that could significantly impact user experience and outcomes.
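The aggregation described in the abstract (2,880 scored answers, averaged per model and per condition) can be sketched as follows. This is a minimal illustration, not the authors' code: the records, scores, and function names below are hypothetical stand-ins for the paper's actual data and tooling.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-answer records: (model, condition, emotion -> score),
# standing in for the paper's 2,880 scored responses.
answers = [
    ("Llama",   "stress",     {"optimism": 0.81, "joy": 0.44, "fear": 0.05}),
    ("Llama",   "anxiety",    {"optimism": 0.30, "joy": 0.10, "fear": 0.95}),
    ("Mixtral", "depression", {"optimism": 0.12, "joy": 0.04, "fear": 0.40}),
    ("Mixtral", "anxiety",    {"optimism": 0.15, "joy": 0.06, "fear": 0.97}),
]

def mean_emotion_by(key_index, records):
    """Average each emotion score over all records that share the same key
    (model name at index 0, or condition at index 1)."""
    buckets = defaultdict(lambda: defaultdict(list))
    for record in records:
        key, scores = record[key_index], record[2]
        for emotion, value in scores.items():
            buckets[key][emotion].append(value)
    return {key: {emo: mean(vals) for emo, vals in emos.items()}
            for key, emos in buckets.items()}

by_condition = mean_emotion_by(1, answers)
# In this toy data, anxiety-framed prompts show the highest mean fear,
# mirroring the paper's finding of elevated fear for anxiety queries.
print(by_condition["anxiety"]["fear"])  # 0.96
```

Grouping by index 0 instead would give the per-model emotional signatures (e.g. Mixtral's lower optimism) that the abstract reports.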
Related papers
- Predicting Depressive Symptoms through Emotion Pairs within Asian American Families [0.3823356975862005]
This study investigates the role of ambivalent emotions in online narratives shared by Asian and Asian American children on the subreddit r/Asianparentstories. By employing a BERT-based model to detect emotion at the sentence level and depressive symptoms at the post level, we analyze mixed feelings to better understand how they predict depressive symptoms.
arXiv Detail & Related papers (2026-02-03T19:04:30Z)
- Emotion Granularity from Text: An Aggregate-Level Indicator of Mental Health [25.166884750592175]
In psychology, variation in the ability of individuals to differentiate between emotion concepts is called emotion granularity.
High emotion granularity has been linked with better mental and physical health.
Low emotion granularity has been linked with maladaptive emotion regulation strategies and poor health outcomes.
arXiv Detail & Related papers (2024-03-04T18:12:10Z)
- CauESC: A Causal Aware Model for Emotional Support Conversation [79.4451588204647]
Existing approaches ignore the emotion causes of the distress.
They focus on the seeker's own mental state rather than the emotional dynamics during interaction between speakers.
We propose a novel framework, CauESC, which first recognizes the emotion causes of the distress as well as the emotion effects those causes trigger.
arXiv Detail & Related papers (2024-01-31T11:30:24Z)
- Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [50.13429055093534]
Large Language Models (LLMs) have shown remarkable performance in various emotion recognition tasks.
We propose the Emotional Chain-of-Thought (ECoT) to enhance the performance of LLMs on various emotional generation tasks.
arXiv Detail & Related papers (2024-01-12T16:42:10Z)
- DepressionEmo: A novel dataset for multilabel classification of depression emotions [6.26397257917403]
DepressionEmo is a dataset designed to detect eight emotions associated with depression, comprising 6,037 long Reddit user posts.
This dataset was created through a majority vote over inputs by zero-shot classifications from pre-trained models.
We provide several text classification methods in two groups: machine learning methods such as SVM, XGBoost, and LightGBM, and deep learning methods such as BERT, GAN-BERT, and BART.
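The labeling scheme described above, a majority vote over zero-shot classifier outputs, can be sketched in a few lines. The classifier outputs below are invented for illustration; they are not the DepressionEmo labels or models.

```python
from collections import Counter

def majority_vote(label_sets):
    """Keep each emotion label that a strict majority of the zero-shot
    classifiers assigned to the post (multilabel majority vote)."""
    counts = Counter(label for labels in label_sets for label in labels)
    threshold = len(label_sets) / 2
    return sorted(l for l, c in counts.items() if c > threshold)

# Hypothetical outputs of three zero-shot classifiers for one Reddit post.
votes = [
    {"sadness", "hopelessness"},
    {"sadness", "anger"},
    {"sadness", "hopelessness", "worthlessness"},
]
print(majority_vote(votes))  # ['hopelessness', 'sadness']
```

Because the task is multilabel, the vote is taken per label rather than picking a single winning class, so a post can end up with several emotions or none.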
arXiv Detail & Related papers (2024-01-09T16:25:31Z)
- Language and Mental Health: Measures of Emotion Dynamics from Text as Linguistic Biosocial Markers [30.656554495536618]
We study the relationship between tweet emotion dynamics and mental health disorders.
We find that each of the UED metrics studied varied by the user's self-disclosed diagnosis.
This work provides important early evidence for how linguistic cues pertaining to emotion dynamics can play a crucial role as biosocial markers for mental illnesses.
arXiv Detail & Related papers (2023-10-26T13:00:26Z)
- Emotionally Numb or Empathetic? Evaluating How LLMs Feel Using EmotionBench [83.41621219298489]
We evaluate Large Language Models' (LLMs) anthropomorphic capabilities using the emotion appraisal theory from psychology.
We collect a dataset containing over 400 situations that have proven effective in eliciting the eight emotions central to our study.
We conduct a human evaluation involving more than 1,200 subjects worldwide.
arXiv Detail & Related papers (2023-08-07T15:18:30Z)
- Large Language Models Understand and Can be Enhanced by Emotional Stimuli [53.53886609012119]
We take the first step towards exploring the ability of Large Language Models to understand emotional stimuli.
Our experiments show that LLMs have a grasp of emotional intelligence, and their performance can be improved with emotional prompts.
Our human study results demonstrate that EmotionPrompt significantly boosts the performance of generative tasks.
arXiv Detail & Related papers (2023-07-14T00:57:12Z)
- Inducing anxiety in large language models can induce bias [47.85323153767388]
We focus on twelve established large language models (LLMs) and subject them to a questionnaire commonly used in psychiatry.
Our results show that six of the latest LLMs respond robustly to the anxiety questionnaire, producing comparable anxiety scores to humans.
Anxiety-induction not only influences LLMs' scores on an anxiety questionnaire but also influences their behavior in a previously-established benchmark measuring biases such as racism and ageism.
arXiv Detail & Related papers (2023-04-21T16:29:43Z)
- Handwriting and Drawing for Depression Detection: A Preliminary Study [53.11777541341063]
Short-term effects of COVID-19 on mental health included significant increases in anxiety and depressive symptoms.
The aim of this study is to use a new tool, the online handwriting and drawing analysis, to discriminate between healthy individuals and depressed patients.
arXiv Detail & Related papers (2023-02-05T22:33:49Z)
- MIME: MIMicking Emotions for Empathetic Response Generation [82.57304533143756]
Current approaches to empathetic response generation view the set of emotions expressed in the input text as a flat structure.
We argue that empathetic responses often mimic the emotion of the user to a varying degree, depending on its positivity or negativity and content.
arXiv Detail & Related papers (2020-10-04T00:35:47Z)
- Depressed individuals express more distorted thinking on social media [0.0]
Depression is a leading cause of disability worldwide, but is often under-diagnosed and under-treated.
Here, we show that individuals with a self-reported diagnosis of depression express higher levels of distorted thinking than a random sample.
Some types of distorted thinking were found to be more than twice as prevalent in our depressed cohort, in particular Personalizing and Emotional Reasoning.
arXiv Detail & Related papers (2020-02-07T14:18:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.