mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis
- URL: http://arxiv.org/abs/2408.08261v1
- Date: Thu, 15 Aug 2024 17:01:57 GMT
- Title: mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis
- Authors: Dae-young Kim, Rebecca Hwa, Muhammad Mahbubur Rahman
- Abstract summary: This paper introduces mhGPT, a lightweight generative pre-trained transformer trained on mental health-related social media and PubMed articles.
mhGPT was evaluated under limited hardware constraints and compared with state-of-the-art models like MentaLLaMA and Gemma.
- Score: 8.654701704101779
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces mhGPT, a lightweight generative pre-trained transformer trained on mental health-related social media and PubMed articles. Fine-tuned for specific mental health tasks, mhGPT was evaluated under limited hardware constraints and compared with state-of-the-art models like MentaLLaMA and Gemma. Despite having only 1.98 billion parameters and using just 5% of the dataset, mhGPT outperformed larger models and matched the performance of models trained on significantly more data. The key contributions include integrating diverse mental health data, creating a custom tokenizer, and optimizing a smaller architecture for low-resource settings. This research could advance AI-driven mental health care, especially in areas with limited computing power.
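One of the stated contributions is a custom tokenizer built for mental-health text. As a rough illustration of the underlying idea (not the authors' actual tokenizer, which the abstract does not detail), the following sketch builds a frequency-ranked, domain-specific vocabulary from a small hypothetical corpus and encodes text against it:

```python
from collections import Counter

def build_vocab(corpus, vocab_size=8):
    """Build a frequency-ranked vocabulary from a domain corpus.
    Ids 0 and 1 are reserved for padding and unknown tokens."""
    counts = Counter(tok for text in corpus for tok in text.lower().split())
    vocab = {"[PAD]": 0, "[UNK]": 1}
    for tok, _ in counts.most_common(vocab_size - len(vocab)):
        vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Map tokens to ids, falling back to [UNK] for out-of-vocabulary words."""
    return [vocab.get(tok, vocab["[UNK]"]) for tok in text.lower().split()]

# Hypothetical in-domain snippets; a real corpus would mix social media and PubMed text.
corpus = [
    "feeling anxious and low today",
    "anxious thoughts about sleep",
    "low mood and poor sleep",
]
vocab = build_vocab(corpus)
ids = encode("anxious and low", vocab)
```

A production tokenizer would use subword units (e.g. BPE) rather than whole words, but the motivation is the same: domain-frequent terms get dedicated vocabulary entries instead of being fragmented.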
Related papers
- Menta: A Small Language Model for On-Device Mental Health Prediction [19.94525754933305]
We introduce Menta, the first optimized SLM fine-tuned specifically for multi-task mental health prediction from social media data.
Menta is jointly trained across six classification tasks using a LoRA-based framework, a cross-dataset strategy, and a balanced-accuracy-oriented loss.
We demonstrate real-time, on-device deployment of Menta on an iPhone 15 Pro Max, requiring only approximately 3 GB of RAM.
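The LoRA-based framework mentioned above trades full fine-tuning for low-rank adapters: instead of updating a frozen d_in x d_out weight matrix, LoRA trains two small factors A (d_in x r) and B (r x d_out). A back-of-the-envelope sketch of the parameter savings (illustrating the general technique with hypothetical layer sizes, not Menta's actual configuration):

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable-parameter counts for full fine-tuning vs. LoRA adapters
    on a single frozen d_in x d_out weight matrix."""
    full = d_in * d_out                  # full fine-tuning updates every weight
    lora = d_in * rank + rank * d_out    # LoRA updates only the two factors
    return full, lora

# Hypothetical 4096 x 4096 layer; rank r=8 is a common LoRA choice.
full, lora = lora_trainable_params(4096, 4096, 8)
reduction = full / lora  # 256x fewer trainable parameters
```

This reduction is what makes joint training across six tasks, and on-device deployment, feasible for a small model.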
arXiv Detail & Related papers (2025-12-02T12:47:08Z)
- Mental Multi-class Classification on Social Media: Benchmarking Transformer Architectures against LSTM Models [7.464241214592479]
We present a large-scale comparative study of state-of-the-art transformer versus Long Short-Term Memory (LSTM)-based models to classify mental health posts.
We first curate a large dataset of Reddit posts spanning six mental health conditions and a control group, using rigorous filtering and statistical exploratory analysis to ensure annotation quality.
Experimental results show that transformer models consistently outperform the alternatives, with RoBERTa achieving 91-99% F1-scores and accuracies across all classes.
arXiv Detail & Related papers (2025-09-20T05:41:59Z)
- Advancing Mental Disorder Detection: A Comparative Evaluation of Transformer and LSTM Architectures on Social Media [0.16385815610837165]
This study provides a comprehensive evaluation of state-of-the-art transformer models against Long Short-Term Memory (LSTM) based approaches.
We construct a large annotated dataset using different text embedding techniques for mental health disorder classification on Reddit.
Experimental results demonstrate the superior performance of transformer models over traditional deep-learning approaches.
arXiv Detail & Related papers (2025-07-17T04:58:31Z)
- MedGemma Technical Report [75.88152277443179]
We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B.
MedGemma demonstrates advanced medical understanding and reasoning on images and text.
We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP.
arXiv Detail & Related papers (2025-07-07T17:01:44Z)
- EEG Foundation Challenge: From Cross-Task to Cross-Subject EEG Decoding [71.31963197992998]
We introduce a large-scale, code-based competition comprising two challenges.
The Transfer Challenge asks participants to build and test a model that can zero-shot decode new tasks and new subjects from their EEG data.
The Psychopathology factor prediction Challenge asks participants to infer subject measures of mental health from EEG data.
arXiv Detail & Related papers (2025-06-23T21:25:19Z)
- MoodAngels: A Retrieval-augmented Multi-agent Framework for Psychiatry Diagnosis [58.67342568632529]
MoodAngels is the first specialized multi-agent framework for mood disorder diagnosis.
MoodSyn is an open-source dataset of 1,173 synthetic psychiatric cases.
arXiv Detail & Related papers (2025-06-04T09:18:25Z)
- Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling [50.83055329849865]
PsyLLM is a large language model designed to integrate diagnostic and therapeutic reasoning for mental health counseling.
It processes real-world mental health posts from Reddit and generates multi-turn dialogue structures.
Our experiments demonstrate that PsyLLM significantly outperforms state-of-the-art baseline models.
arXiv Detail & Related papers (2025-05-21T16:24:49Z)
- AI Foundation Models for Wearable Movement Data in Mental Health Research [2.015440876410741]
We introduce the Pretrained Actigraphy Transformer (PAT), the first open source foundation model designed for time-series wearable movement data.
PAT achieves state-of-the-art performance in several mental health prediction tasks.
arXiv Detail & Related papers (2024-11-22T01:58:35Z)
- Enhancing PTSD Outcome Prediction with Ensemble Models in Disaster Contexts [0.9249657468385778]
Post-traumatic stress disorder (PTSD) is a significant mental health challenge that affects individuals exposed to traumatic events.
Early detection and effective intervention for PTSD are crucial, as it can lead to long-term psychological distress if untreated.
arXiv Detail & Related papers (2024-11-16T01:44:43Z)
- Neuron-based Personality Trait Induction in Large Language Models [115.08894603023712]
Large language models (LLMs) have become increasingly proficient at simulating various personality traits.
We present a neuron-based approach for personality trait induction in LLMs.
arXiv Detail & Related papers (2024-10-16T07:47:45Z)
- MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders [59.515827458631975]
Mental health disorders are among the most serious health conditions in the world.
Privacy concerns limit the accessibility of personalized treatment data.
MentalArena is a self-play framework to train language models.
arXiv Detail & Related papers (2024-10-09T13:06:40Z)
- Advancing Mental Health Pre-Screening: A New Custom GPT for Psychological Distress Assessment [0.8287206589886881]
'Psycho Analyst' is a custom GPT model based on OpenAI's GPT-4, optimized for pre-screening mental health disorders.
The model adeptly decodes nuanced linguistic indicators of mental health disorders.
arXiv Detail & Related papers (2024-08-03T00:38:30Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models [28.62967557368565]
We build the first multi-task and multi-source interpretable mental health instruction dataset on social media, with 105K data samples.
We use expert-written few-shot prompts and collected labels to prompt ChatGPT and obtain explanations from its responses.
Based on the IMHI dataset and LLaMA2 foundation models, we train MentaLLaMA, the first open-source LLM series for interpretable mental health analysis.
arXiv Detail & Related papers (2023-09-24T06:46:08Z)
- Harnessing the Power of Hugging Face Transformers for Predicting Mental Health Disorders in Social Networks [0.0]
This study explores how user-generated data can be used to predict mental disorder symptoms.
Our study compares four different BERT models of Hugging Face with standard machine learning techniques.
The new models outperform the previous approaches with accuracy rates of up to 97%.
arXiv Detail & Related papers (2023-06-29T12:25:19Z)
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
- Parameter-Efficient Sparsity for Large Language Models Fine-Tuning [63.321205487234074]
We propose a Parameter-Efficient Sparse Training (PST) method to reduce the number of trainable parameters during sparse-aware training.
Experiments with diverse networks (i.e., BERT, RoBERTa, and GPT-2) demonstrate that PST performs on par with or better than previous sparsity methods.
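As context for the sparse-training idea, a minimal sketch of simple magnitude pruning, which keeps only the largest-magnitude fraction of weights (a deliberately simplified stand-in; PST itself learns importance scores with few trainable parameters rather than using raw magnitudes):

```python
def magnitude_mask(weights, sparsity):
    """Return a binary mask keeping the largest-magnitude fraction of weights.
    `sparsity` is the fraction of weights to zero out."""
    n_keep = max(1, int(round(len(weights) * (1.0 - sparsity))))
    # Threshold at the n_keep-th largest absolute value.
    threshold = sorted((abs(w) for w in weights), reverse=True)[n_keep - 1]
    return [1 if abs(w) >= threshold else 0 for w in weights]

weights = [0.02, -0.9, 0.4, -0.05, 0.7, 0.1]
mask = magnitude_mask(weights, sparsity=0.5)  # keep top 50% by magnitude
```

In sparse-aware training the masked weights are zeroed on each forward pass, so the network learns to route capacity through the surviving connections.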
arXiv Detail & Related papers (2022-05-23T02:43:45Z)
- SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models [48.07469930813923]
This work aims to forecast the demand for healthcare services, by predicting the number of patient visits to healthcare facilities.
We introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data.
Our results illuminate the promising potential of tailored attention-free models and self-supervised pretraining in refining healthcare utilization predictions across various patient demographics.
arXiv Detail & Related papers (2021-08-31T08:23:56Z)
- Using Convolutional Variational Autoencoders to Predict Post-Trauma Health Outcomes from Actigraphy Data [4.668948267866486]
Depression and post-traumatic stress disorder (PTSD) are psychiatric conditions commonly associated with a traumatic event.
In this work, we used locomotor activity captured from 1113 individuals who wore a research-grade smartwatch post-trauma.
A convolutional variational autoencoder (VAE) architecture was used for unsupervised feature extraction from actigraphy data.
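A VAE's latent layer is stochastic, so gradient-based training relies on the reparameterization trick, sampling z = mu + sigma * eps with eps drawn from a standard normal. A minimal framework-free sketch of that sampling step (illustrative only; the paper's convolutional encoder/decoder is not reproduced here):

```python
import math
import random

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1).
    Randomness is isolated in eps, so gradients flow through mu and logvar."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

rng = random.Random(0)
# Hypothetical encoder outputs: per-dimension latent mean and log-variance.
mu, logvar = [0.5, -0.2], [-1.0, -2.0]
z = reparameterize(mu, logvar, rng)
```

The extracted latent vectors z then serve as unsupervised features for downstream outcome prediction.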
arXiv Detail & Related papers (2020-11-14T22:48:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.