Related papers: A Machine Learning Approach for Detection of Mental Health Conditions and Cyberbullying from Social Media

URL: http://arxiv.org/abs/2511.20001v2
Date: Mon, 01 Dec 2025 11:07:35 GMT
Title: A Machine Learning Approach for Detection of Mental Health Conditions and Cyberbullying from Social Media
Authors: Edward Ajayi, Martha Kachweka, Mawuli Deku, Emily Aiken
Abstract summary: Mental health challenges and cyberbullying are increasingly prevalent in digital spaces. This paper introduces a unified multiclass classification framework for detecting ten distinct mental health and cyberbullying categories from social media data.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Mental health challenges and cyberbullying are increasingly prevalent in digital spaces, necessitating scalable and interpretable detection systems. This paper introduces a unified multiclass classification framework for detecting ten distinct mental health and cyberbullying categories from social media data. We curate datasets from Twitter and Reddit, implementing a rigorous "split-then-balance" pipeline to train on balanced data while evaluating on a realistic, held-out imbalanced test set. We conducted a comprehensive evaluation comparing traditional lexical models, hybrid approaches, and several end-to-end fine-tuned transformers. Our results demonstrate that end-to-end fine-tuning is critical for performance, with the domain-adapted MentalBERT emerging as the top model, achieving an accuracy of 0.92 and a Macro F1 score of 0.76, surpassing both its generic counterpart and a zero-shot LLM baseline. Grounded in a comprehensive ethical analysis, we frame the system as a human-in-the-loop screening aid, not a diagnostic tool. To support this, we introduce a hybrid SHAPLLM explainability framework and present a prototype dashboard ("Social Media Screener") designed to integrate model predictions and their explanations into a practical workflow for moderators. Our work provides a robust baseline, highlighting future needs for multi-label, clinically-validated datasets at the critical intersection of online safety and computational mental health.
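The "split-then-balance" idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the training data is balanced, so random undersampling is an assumption here, as are the toy class names, the class counts, the 20% test fraction, and the helper names (`stratified_split`, `undersample`, `macro_f1`). The key point is the order of operations: the test set is held out first, and only the training portion is balanced, so evaluation still reflects the natural class mix.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices per class so the test set keeps the natural class mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = max(1, round(len(idxs) * test_fraction))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx


def undersample(indices, labels, seed=0):
    """Balance classes by sampling each down to the rarest class size.

    Assumption: the paper's actual balancing method is not specified here.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    n = min(len(v) for v in by_class.values())
    return [i for idxs in by_class.values() for i in rng.sample(idxs, n)]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced corpus: 900 / 80 / 20 posts across three toy classes.
labels = ["neutral"] * 900 + ["depression"] * 80 + ["bullying"] * 20

# Step 1: split first, so the held-out test set stays imbalanced.
train_idx, test_idx = stratified_split(labels)
# Step 2: balance only the training portion.
balanced_idx = undersample(train_idx, labels)

train_counts = Counter(labels[i] for i in balanced_idx)
test_counts = Counter(labels[i] for i in test_idx)

# A trivial majority-class predictor, scored on the imbalanced test set.
y_true = [labels[i] for i in test_idx]
y_pred = ["neutral"] * len(y_true)
acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(train_counts, test_counts, acc, mf1)
```

In this toy setup the balanced training set has 16 examples per class, while the test set keeps its 180/16/4 mix. The majority-class predictor reaches an accuracy of 0.90 but a macro F1 of only about 0.32, which shows why the abstract reports both metrics on an imbalanced test set.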

Related papers

Mental Multi-class Classification on Social Media: Benchmarking Transformer Architectures against LSTM Models [7.464241214592479]
We present a large-scale comparative study of state-of-the-art transformer versus Long Short-Term Memory (LSTM)-based models to classify mental health posts. We first curate a large dataset of Reddit posts spanning six mental health conditions and a control group, using rigorous filtering and statistical exploratory analysis to ensure annotation quality. Experimental results show that transformer models consistently outperform the alternatives, with RoBERTa achieving 91-99% F1-scores and accuracies across all classes.
arXiv Detail & Related papers (2025-09-20T05:41:59Z)
Advancing Mental Disorder Detection: A Comparative Evaluation of Transformer and LSTM Architectures on Social Media [0.16385815610837165]
This study provides a comprehensive evaluation of state-of-the-art transformer models against Long Short-Term Memory (LSTM) based approaches. We construct a large annotated dataset using different text embedding techniques for mental health disorder classification on Reddit. Experimental results demonstrate the superior performance of transformer models over traditional deep-learning approaches.
arXiv Detail & Related papers (2025-07-17T04:58:31Z)
Latent Space Data Fusion Outperforms Early Fusion in Multimodal Mental Health Digital Phenotyping Data [0.0]
Mental illnesses such as depression and anxiety require improved methods for early detection and personalized intervention. Traditional predictive models often rely on unimodal data or early fusion strategies that fail to capture the complex, multimodal nature of psychiatric data. We evaluated intermediate (latent space) fusion for predicting daily depressive symptoms.
arXiv Detail & Related papers (2025-07-10T18:10:46Z)
MoodAngels: A Retrieval-augmented Multi-agent Framework for Psychiatry Diagnosis [58.67342568632529]
MoodAngels is the first specialized multi-agent framework for mood disorder diagnosis. MoodSyn is an open-source dataset of 1,173 synthetic psychiatric cases.
arXiv Detail & Related papers (2025-06-04T09:18:25Z)
Leveraging Embedding Techniques in Multimodal Machine Learning for Mental Illness Assessment [0.8458496687170665]
The increasing global prevalence of mental disorders, such as depression and PTSD, requires objective and scalable diagnostic tools. This paper investigates the potential of multimodal machine learning to address these challenges, leveraging the complementary information available in text, audio, and video data. We explore data-level, feature-level, and decision-level fusion techniques, including a novel integration of Large Language Model predictions.
arXiv Detail & Related papers (2025-04-02T14:19:06Z)
Early Detection of Mental Health Issues Using Social Media Posts [0.0]
Social media platforms, like Reddit, represent a rich source of user-generated content. We propose a multi-modal deep learning framework that integrates linguistic and temporal features for early detection of mental health crises.
arXiv Detail & Related papers (2025-03-06T23:08:08Z)
LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation. GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
Dynamic Bank Learning for Semi-supervised Federated Image Diagnosis with Class Imbalance [65.61909544178603]
We study a practical yet challenging problem of class-imbalanced semi-supervised FL (imFed-Semi). This imFed-Semi problem is addressed by a novel dynamic bank learning scheme, which improves client training by exploiting class proportion information. We evaluate our approach on two public real-world medical datasets, including intracranial hemorrhage diagnosis with 25,000 CT slices and skin lesion diagnosis with 10,015 dermoscopy images.
arXiv Detail & Related papers (2022-06-27T06:51:48Z)
Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in deep learning-based medical image analysis systems. Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model. We evaluate our framework on a large-scale, publicly available skin lesion dataset.
arXiv Detail & Related papers (2021-03-07T03:10:32Z)
UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model. UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data. We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD). UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.