MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance
- URL: http://arxiv.org/abs/2503.13509v2
- Date: Mon, 02 Jun 2025 02:53:02 GMT
- Title: MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance
- Authors: Jia Xu, Tianyi Wei, Bojian Hou, Patryk Orzechowski, Shu Yang, Ruochen Jin, Rachael Paulbeck, Joost Wagenaar, George Demiris, Li Shen
- Abstract summary: MentalChat16K is an English benchmark dataset combining a synthetic mental health counseling dataset and a dataset of anonymized transcripts. This curated dataset is designed to facilitate the development and evaluation of large language models for conversational mental health assistance. The dataset prioritizes patient privacy, ethical considerations, and responsible data usage.
- Score: 13.373260490163709
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We introduce MentalChat16K, an English benchmark dataset combining a synthetic mental health counseling dataset and a dataset of anonymized transcripts from interventions between Behavioral Health Coaches and Caregivers of patients in palliative or hospice care. Covering a diverse range of conditions like depression, anxiety, and grief, this curated dataset is designed to facilitate the development and evaluation of large language models for conversational mental health assistance. By providing a high-quality resource tailored to this critical domain, MentalChat16K aims to advance research on empathetic, personalized AI solutions to improve access to mental health support services. The dataset prioritizes patient privacy, ethical considerations, and responsible data usage. MentalChat16K presents a valuable opportunity for the research community to innovate AI technologies that can positively impact mental well-being. The dataset is available at https://huggingface.co/datasets/ShenLab/MentalChat16K and the code and documentation are hosted on GitHub at https://github.com/ChiaPatricia/MentalChat16K.
Related papers
- Towards Privacy-aware Mental Health AI Models: Advances, Challenges, and Opportunities [61.633126163190724]
Mental illness is a widespread and debilitating condition with substantial societal and personal costs. Recent advances in Artificial Intelligence (AI) hold great potential for recognizing and addressing conditions such as depression, anxiety disorder, bipolar disorder, schizophrenia, and post-traumatic stress disorder. Privacy concerns, including the risk of sensitive data leakage from datasets and trained models, remain a critical barrier to deploying these AI systems in real-world clinical settings.
arXiv Detail & Related papers (2025-02-01T15:10:02Z) - Understanding Student Sentiment on Mental Health Support in Colleges Using Large Language Models [5.3204794327005205]
This paper uses public Student Voice Survey data to analyze student sentiments on mental health support with large language models (LLMs). An investigation of both traditional machine learning methods and state-of-the-art LLMs found that GPT-3.5 and BERT performed best on this new dataset.
arXiv Detail & Related papers (2024-11-18T02:53:15Z) - ConvCounsel: A Conversational Dataset for Student Counseling [31.298840947078364]
This paper introduces ConvCounsel, a specialized mental health dataset that emphasizes the active listening strategy employed in counseling conversations.
To demonstrate the utility of the proposed dataset, this paper also presents NYCUKA, a spoken mental health dialogue system designed using the ConvCounsel dataset.
arXiv Detail & Related papers (2024-11-01T14:08:02Z) - MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders [59.515827458631975]
Mental health disorders are among the most serious diseases in the world. Privacy concerns limit the accessibility of personalized treatment data. MentalArena is a self-play framework to train language models.
arXiv Detail & Related papers (2024-10-09T13:06:40Z) - Enhancing Mental Health Support through Human-AI Collaboration: Toward Secure and Empathetic AI-enabled chatbots [0.0]
This paper explores the potential of AI-enabled chatbots as a scalable solution.
We assess their ability to deliver empathetic, meaningful responses in mental health contexts.
We propose a federated learning framework that ensures data privacy, reduces bias, and integrates continuous validation from clinicians to enhance response quality.
arXiv Detail & Related papers (2024-09-17T20:49:13Z) - Enhancing AI-Driven Psychological Consultation: Layered Prompts with Large Language Models [44.99833362998488]
We explore the use of large language models (LLMs) like GPT-4 to augment psychological consultation services.
Our approach introduces a novel layered prompting system that dynamically adapts to user input.
We also develop empathy-driven and scenario-based prompts to enhance the LLM's emotional intelligence.
arXiv Detail & Related papers (2024-08-29T05:47:14Z) - LLM Questionnaire Completion for Automatic Psychiatric Assessment [49.1574468325115]
We employ a Large Language Model (LLM) to convert unstructured psychological interviews into structured questionnaires spanning various psychiatric and personality domains.
The obtained answers are coded as features, which are used to predict standardized psychiatric measures of depression (PHQ-8) and PTSD (PCL-C).
arXiv Detail & Related papers (2024-06-09T09:03:11Z) - MentalQA: An Annotated Arabic Corpus for Questions and Answers of Mental Healthcare [0.1638581561083717]
MentalQA is a novel Arabic dataset featuring conversational-style question-and-answer (QA) interactions.
Data was collected from a question-answering medical platform.
MentalQA offers a valuable foundation for developing Arabic text mining tools capable of supporting mental health professionals and individuals seeking information.
arXiv Detail & Related papers (2024-05-21T09:16:38Z) - An Annotated Dataset for Explainable Interpersonal Risk Factors of
Mental Disturbance in Social Media Posts [0.0]
We construct and release a new annotated dataset with human-labelled explanations and classification of Interpersonal Risk Factors (IRF) affecting mental disturbance on social media.
We establish baseline models on our dataset, facilitating future research on real-time personalized AI models that detect patterns of thwarted belongingness (TBe) and perceived burdensomeness (PBu) in the emotional spectrum of a user's historical social media profile.
arXiv Detail & Related papers (2023-05-30T04:08:40Z) - GDPR Compliant Collection of Therapist-Patient-Dialogues [48.091760741427656]
We elaborate on the challenges we faced in starting our collection of therapist-patient dialogues in a psychiatry clinic under the General Data Protection Regulation (GDPR) of the European Union.
We give an overview of each step in our procedure and point out the potential pitfalls to motivate further research in this field.
arXiv Detail & Related papers (2022-11-22T15:51:10Z) - Mental Illness Classification on Social Media Texts using Deep Learning and Transfer Learning [55.653944436488786]
According to the World Health Organization (WHO), approximately 450 million people are affected by mental illnesses such as depression, anxiety, bipolar disorder, ADHD, and PTSD.
This study analyzes unstructured user data on the Reddit platform and classifies five common mental illnesses: depression, anxiety, bipolar disorder, ADHD, and PTSD.
arXiv Detail & Related papers (2022-07-03T11:33:52Z) - Learning Language and Multimodal Privacy-Preserving Markers of Mood from
Mobile Data [74.60507696087966]
Mental health conditions remain underdiagnosed even in countries with common access to advanced medical care.
One promising data source to help monitor human behavior is daily smartphone usage.
We study behavioral markers of daily mood using a recent dataset of mobile behaviors from adolescent populations at high risk of suicidal behaviors.
arXiv Detail & Related papers (2021-06-24T17:46:03Z) - MET: Multimodal Perception of Engagement for Telehealth [52.54282887530756]
We present MET, a learning-based algorithm for perceiving a human's level of engagement from videos.
We release a new dataset, MEDICA, for mental health patient engagement detection.
arXiv Detail & Related papers (2020-11-17T15:18:38Z) - Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment [50.15466026089435]
We present Vyaktitv, a novel peer-to-peer Hindi conversation dataset.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features for all participants, such as income and cultural orientation, among several others.
arXiv Detail & Related papers (2020-08-31T17:44:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.