Beyond Architectures: Evaluating the Role of Contextual Embeddings in Detecting Bipolar Disorder on Social Media
- URL: http://arxiv.org/abs/2507.14231v1
- Date: Thu, 17 Jul 2025 05:14:19 GMT
- Title: Beyond Architectures: Evaluating the Role of Contextual Embeddings in Detecting Bipolar Disorder on Social Media
- Authors: Khalid Hasan, Jamil Saquer
- Abstract summary: Bipolar disorder is a chronic mental illness frequently underdiagnosed due to subtle early symptoms and social stigma. This paper explores advanced natural language processing (NLP) models for recognizing signs of bipolar disorder in user-generated social media text.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bipolar disorder is a chronic mental illness frequently underdiagnosed due to subtle early symptoms and social stigma. This paper explores advanced natural language processing (NLP) models for recognizing signs of bipolar disorder in user-generated social media text. We conduct a comprehensive evaluation of transformer-based models (BERT, RoBERTa, ALBERT, ELECTRA, DistilBERT) and Long Short-Term Memory (LSTM) models based on contextualized (BERT) and static (GloVe, Word2Vec) word embeddings. Experiments were performed on a large, annotated dataset of Reddit posts after confirming its validity through sentiment variance and judgmental analysis. Our results demonstrate that RoBERTa achieves the highest performance among transformer models with an F1 score of ~98%, while LSTM models using BERT embeddings yield nearly identical results. In contrast, LSTMs trained on static embeddings fail to capture meaningful patterns, scoring near-zero F1. These findings underscore the critical role of contextual language modeling in detecting bipolar disorder. In addition, we report model training times and highlight that DistilBERT offers an optimal balance between efficiency and accuracy. Overall, our study offers actionable insights for model selection in mental health NLP applications and validates the potential of contextualized language models to support early bipolar disorder screening.
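The F1 scores the abstract reports (~98% for RoBERTa, near-zero for static-embedding LSTMs) can be made concrete with a minimal, stdlib-only sketch of how binary F1 is computed for post-level classification. The labels and predictions below are toy values for illustration, not the paper's data:

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels: 1 = post flagged as showing signs of bipolar disorder, 0 = control
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(f1_score(y_true, y_pred))  # 0.75
```

A "near-zero F1" model, as reported for the static-embedding LSTMs, is one that almost never predicts the positive class correctly, driving the true-positive count (and hence this score) toward zero.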
Related papers
- Advancing Mental Disorder Detection: A Comparative Evaluation of Transformer and LSTM Architectures on Social Media [0.16385815610837165]
This study provides a comprehensive evaluation of state-of-the-art transformer models against Long Short-Term Memory (LSTM) based approaches. We construct a large annotated dataset using different text embedding techniques for mental health disorder classification on Reddit. Experimental results demonstrate the superior performance of transformer models over traditional deep-learning approaches.
arXiv Detail & Related papers (2025-07-17T04:58:31Z)
- LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
- Generative causal testing to bridge data-driven models and scientific theories in language neuroscience [82.995061475971]
We present generative causal testing (GCT), a framework for generating concise explanations of language selectivity in the brain. We show that GCT can dissect fine-grained differences between brain areas with similar functional selectivity.
arXiv Detail & Related papers (2024-10-01T15:57:48Z)
- WellDunn: On the Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions [46.60244609728416]
Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a litmus test of a model's utility in clinical practice.
We introduce an evaluation design that focuses on the robustness and explainability of LMs in identifying Wellness Dimensions (WDs).
We reveal four surprising results about LMs/LLMs.
arXiv Detail & Related papers (2024-06-17T19:50:40Z)
- Question-Answering Model for Schizophrenia Symptoms and Their Impact on Daily Life using Mental Health Forums Data [0.0]
The "Mental Health" forum was used, a forum dedicated to people suffering from schizophrenia and other mental disorders.
It is shown how to pre-process the dataset to convert it into a QA dataset.
The BiBERT, DistilBERT, RoBERTa, and BioBERT models were fine-tuned and evaluated via F1 score, Exact Match, precision, and recall.
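The Exact Match metric named above can be sketched in a few lines. This is a simplified, hypothetical version: full SQuAD-style exact match also strips punctuation and the articles a/an/the, which this sketch omits:

```python
def exact_match(prediction: str, reference: str) -> bool:
    """Simplified QA exact match: compare answers after lowercasing and
    collapsing whitespace. (Full SQuAD normalization additionally drops
    punctuation and articles.)"""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())
    return normalize(prediction) == normalize(reference)

print(exact_match("Auditory hallucinations", "  auditory   HALLUCINATIONS "))  # True
print(exact_match("yes", "no"))  # False
```

Exact Match is all-or-nothing per answer; token-level F1 (as in the main paper's classification setting, but computed over answer tokens here) gives partial credit for overlapping answers.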
arXiv Detail & Related papers (2023-09-30T17:50:50Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods [0.21079694661943607]
We compare three methods for negation detection in Dutch clinical notes.
We found that both the biLSTM and RoBERTa models consistently outperform the rule-based model in terms of F1 score, precision and recall.
arXiv Detail & Related papers (2022-09-01T14:00:13Z)
- Prediction of Depression Severity Based on the Prosodic and Semantic Features with Bidirectional LSTM and Time Distributed CNN [14.994852548758825]
We propose an attention-based multimodality speech and text representation for depression prediction.
Our model is trained to estimate the depression severity of participants using the Distress Analysis Interview Corpus-Wizard of Oz dataset.
Experiments show statistically significant improvements over previous works.
arXiv Detail & Related papers (2022-02-25T01:42:29Z)
- Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains.
We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications.
arXiv Detail & Related papers (2021-12-15T04:20:35Z)
- Modelling Paralinguistic Properties in Conversational Speech to Detect Bipolar Disorder and Borderline Personality Disorder [14.766941144375146]
We propose a new approach of modelling short-term features with visibility-signature transform.
We show the role of different sets of features in characterising BD and BPD.
arXiv Detail & Related papers (2021-02-18T20:47:03Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)