From Reddit to Generative AI: Evaluating Large Language Models for Anxiety Support Fine-tuned on Social Media Data
- URL: http://arxiv.org/abs/2505.18464v1
- Date: Sat, 24 May 2025 02:07:32 GMT
- Title: From Reddit to Generative AI: Evaluating Large Language Models for Anxiety Support Fine-tuned on Social Media Data
- Authors: Ugur Kursuncu, Trilok Padhi, Gaurav Sinha, Abdulkadir Erol, Jaya Krishna Mandivarapu, Christopher R. Larrison
- Abstract summary: This study presents a systematic evaluation of Large Language Models (LLMs) for their potential utility in anxiety support. Our approach utilizes a mixed-method evaluation framework incorporating three main categories of criteria: (i) linguistic quality, (ii) safety and trustworthiness, and (iii) supportiveness. Results show that fine-tuning LLMs with naturalistic anxiety-related data enhanced linguistic quality but increased toxicity and bias, and diminished emotional responsiveness.
- Score: 0.931556339267682
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The growing demand for accessible mental health support, compounded by workforce shortages and logistical barriers, has led to increased interest in utilizing Large Language Models (LLMs) for scalable and real-time assistance. However, their use in sensitive domains such as anxiety support remains underexamined. This study presents a systematic evaluation of LLMs (GPT and Llama) for their potential utility in anxiety support by using real user-generated posts from the r/Anxiety subreddit for both prompting and fine-tuning. Our approach utilizes a mixed-method evaluation framework incorporating three main categories of criteria: (i) linguistic quality, (ii) safety and trustworthiness, and (iii) supportiveness. Results show that fine-tuning LLMs with naturalistic anxiety-related data enhanced linguistic quality but increased toxicity and bias, and diminished emotional responsiveness. While LLMs exhibited limited empathy, GPT was evaluated as more supportive overall. Our findings highlight the risks of fine-tuning LLMs on unprocessed social media content without mitigation strategies.
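As a rough illustration of how the automated side of such a mixed-method framework could be assembled, the sketch below scores a single model response against crude proxies for the three criteria categories. The specific choices here (textstat readability, the `unitary/toxic-bert` classifier, and the function names) are illustrative assumptions, not details taken from the paper; supportiveness is deliberately left to human or rubric-based judgment, in keeping with the mixed-method design.

```python
# Minimal sketch (not the authors' code) of an automated scoring pass over
# model responses. Metric and model choices are assumptions for illustration.
from transformers import pipeline
import textstat

# Assumed toxicity scorer; any text-classification model could be swapped in.
toxicity_clf = pipeline("text-classification", model="unitary/toxic-bert")

def evaluate_response(response: str) -> dict:
    """Score one model response on proxies for the three criteria categories."""
    # (i) Linguistic quality: readability as a crude automated proxy.
    readability = textstat.flesch_reading_ease(response)

    # (ii) Safety and trustworthiness: score of the classifier's top label
    # as a rough toxicity proxy.
    toxicity = toxicity_clf(response, truncation=True)[0]["score"]

    # (iii) Supportiveness: left as a placeholder to be filled by human
    # annotators or rubric-based judgments, as in a mixed-method setup.
    supportiveness = None

    return {
        "linguistic_quality": readability,
        "safety_toxicity": toxicity,
        "supportiveness": supportiveness,
    }

if __name__ == "__main__":
    print(evaluate_response(
        "It sounds really hard to feel this anxious. "
        "Talking to someone you trust can help."
    ))
```

In practice, scores like these would be aggregated across many responses generated from r/Anxiety-derived prompts and compared between the base and fine-tuned models.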
Related papers
- Reframe Your Life Story: Interactive Narrative Therapist and Innovative Moment Assessment with Large Language Models [92.93521294357058]
Narrative therapy helps individuals transform problematic life stories into empowering alternatives. Current approaches lack realism in specialized psychotherapy and fail to capture therapeutic progression over time. Int (Interactive Narrative Therapist) simulates expert narrative therapists by planning therapeutic stages, guiding reflection levels, and generating contextually appropriate expert-like responses.
arXiv Detail & Related papers (2025-07-27T11:52:09Z)
- Large Language Model-Powered Conversational Agent Delivering Problem-Solving Therapy (PST) for Family Caregivers: Enhancing Empathy and Therapeutic Alliance Using In-Context Learning [3.5944459851781057]
Family caregivers often face substantial mental health challenges. This study explored the potential of a large language model (LLM)-powered conversational agent to deliver evidence-based mental health support.
arXiv Detail & Related papers (2025-06-13T00:47:57Z)
- "Is This Really a Human Peer Supporter?": Misalignments Between Peer Supporters and Experts in LLM-Supported Interactions [5.481575506447599]
Mental health is a growing global concern, prompting interest in AI-driven solutions to expand access to psychosocial support. LLMs present new opportunities to enhance peer support interactions, particularly in real-time, text-based interactions. We present and evaluate an AI-supported system with an LLM-simulated distressed client, context-sensitive LLM-generated suggestions, and real-time emotion visualisations.
arXiv Detail & Related papers (2025-06-11T03:06:41Z)
- Cognitive Debiasing Large Language Models for Decision-Making [71.2409973056137]
Large language models (LLMs) have shown potential in supporting decision-making applications. We propose a cognitive debiasing approach, self-adaptive cognitive debiasing (SACD). Our method follows three sequential steps -- bias determination, bias analysis, and cognitive debiasing -- to iteratively mitigate potential cognitive biases in prompts.
arXiv Detail & Related papers (2025-04-05T11:23:05Z)
- LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
- Persuasion with Large Language Models: a Survey [49.86930318312291]
Large Language Models (LLMs) have created new disruptive possibilities for persuasive communication.
In areas such as politics, marketing, public health, e-commerce, and charitable giving, such LLM systems have already achieved human-level or even super-human persuasiveness.
Our survey suggests that the current and future potential of LLM-based persuasion poses profound ethical and societal risks.
arXiv Detail & Related papers (2024-11-11T10:05:52Z)
- SouLLMate: An Application Enhancing Diverse Mental Health Support with Adaptive LLMs, Prompt Engineering, and RAG Techniques [9.146311285410631]
Mental health issues significantly impact individuals' daily lives, yet many do not receive the help they need even with available online resources.
This study aims to provide diverse, accessible, stigma-free, personalized, and real-time mental health support through cutting-edge AI technologies.
arXiv Detail & Related papers (2024-10-17T22:04:32Z)
- Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias [16.85625861663094]
Motivated by social psychology principles, we propose a novel strategy named PeT that inspires LLMs to integrate diverse human perspectives and self-regulate their responses.
Rigorous evaluations and ablation studies are conducted on two commercial LLMs and three open-source LLMs, revealing PeT's superiority in producing less harmful responses.
arXiv Detail & Related papers (2024-07-22T04:25:01Z)
- Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance [73.19687314438133]
We study how reliance is affected by contextual features of an interaction.
We find that contextual characteristics significantly affect human reliance behavior.
Our results show that calibration and language quality alone are insufficient in evaluating the risks of human-LM interactions.
arXiv Detail & Related papers (2024-07-10T18:00:05Z)
- Can AI Relate: Testing Large Language Model Response for Mental Health Support [23.97212082563385]
Large language models (LLMs) are already being piloted for clinical use in hospital systems like NYU Langone, Dana-Farber and the NHS.
We develop an evaluation framework for determining whether LLM responses are a viable and ethical path forward for the automation of mental health treatment.
arXiv Detail & Related papers (2024-05-20T13:42:27Z)
- ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming [64.86326523181553]
ALERT is a large-scale benchmark to assess safety based on a novel fine-grained risk taxonomy.
It aims to identify vulnerabilities, inform improvements, and enhance the overall safety of the language models.
arXiv Detail & Related papers (2024-04-06T15:01:47Z)
- Inducing anxiety in large language models can induce bias [47.85323153767388]
We focus on twelve established large language models (LLMs) and subject them to a questionnaire commonly used in psychiatry.
Our results show that six of the latest LLMs respond robustly to the anxiety questionnaire, producing comparable anxiety scores to humans.
Anxiety-induction not only influences LLMs' scores on an anxiety questionnaire but also influences their behavior in a previously-established benchmark measuring biases such as racism and ageism.
arXiv Detail & Related papers (2023-04-21T16:29:43Z)