RSD-15K: A Large-Scale User-Level Annotated Dataset for Suicide Risk Detection on Social Media
- URL: http://arxiv.org/abs/2507.11559v1
- Date: Mon, 14 Jul 2025 09:26:26 GMT
- Authors: Shouwen Zheng, Yingzhi Tao, Taiqi Zhou,
- Abstract summary: Social media is an important platform for individuals to express emotions and seek help. This paper introduces a large-scale dataset containing 15,000 user-level posts. Compared with existing datasets, this dataset retains complete user posting time sequence information.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, cognitive and mental health (CMH) disorders have become an increasingly important challenge for global public health, especially suicide driven by multiple factors such as social competition, economic pressure, and interpersonal relationships among young and middle-aged people. Social media, as an important platform for individuals to express emotions and seek help, makes early detection and intervention of suicide risk possible. This paper introduces a large-scale dataset containing 15,000 user-level posts. Compared with existing datasets, this dataset retains complete user posting time sequence information and supports modeling the dynamic evolution of suicide risk; we have also annotated the dataset comprehensively and rigorously. In the benchmark experiments, we systematically evaluate the performance of traditional machine learning methods, deep learning models, and fine-tuned large language models. The experimental results show that our dataset can effectively support automatic suicide risk assessment. Considering the sensitivity of mental health data, we also discuss privacy protection and ethical use of the dataset. In addition, we explore potential applications of the dataset in mental health testing and clinical psychiatric auxiliary treatment, among others, and provide directional suggestions for future research.
Related papers
- Towards Privacy-aware Mental Health AI Models: Advances, Challenges, and Opportunities [61.633126163190724]
Mental illness is a widespread and debilitating condition with substantial societal and personal costs. Recent advances in Artificial Intelligence (AI) hold great potential for recognizing and addressing conditions such as depression, anxiety disorder, bipolar disorder, schizophrenia, and post-traumatic stress disorder. Privacy concerns, including the risk of sensitive data leakage from datasets and trained models, remain a critical barrier to deploying these AI systems in real-world clinical settings.
arXiv Detail & Related papers (2025-02-01T15:10:02Z)
- Digital Phenotyping for Adolescent Mental Health: A Feasibility Study Employing Machine Learning to Predict Mental Health Risk From Active and Passive Smartphone Data [2.2310516973117194]
This study evaluated the feasibility of integrating active and passive smartphone data to predict mental disorders in non-clinical adolescents. We investigated the Mindcraft app in predicting risks for internalising and externalising disorders, eating disorders, insomnia and suicidal ideation.
arXiv Detail & Related papers (2025-01-15T15:05:49Z)
- TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [54.98321887435557]
This paper presents a suite of 23 meticulously curated AI-ready datasets covering multi-modal input features and 8 crucial prediction challenges in clinical trial design. We provide basic validation methods for each task to ensure the datasets' usability and reliability. We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z)
- SOS-1K: A Fine-grained Suicide Risk Classification Dataset for Chinese Social Media Analysis [22.709733830774788]
This study presents a Chinese social media dataset designed for fine-grained suicide risk classification.
Seven pre-trained models were evaluated on two tasks: classification of high versus low suicide risk, and fine-grained suicide risk classification on a scale of 0 to 10.
Deep learning models show good performance in distinguishing between high and low suicide risk, with the best model achieving an F1 score of 88.39%.
arXiv Detail & Related papers (2024-04-19T06:58:51Z)
- Non-Invasive Suicide Risk Prediction Through Speech Analysis [74.8396086718266]
We present a non-invasive, speech-based approach for automatic suicide risk assessment.
We extract three sets of features, including wav2vec, interpretable speech and acoustic features, and deep learning-based spectral representations.
Our most effective speech model achieves a balanced accuracy of 66.2%.
arXiv Detail & Related papers (2024-04-18T12:33:57Z)
- Exploration of Adolescent Depression Risk Prediction Based on Census Surveys and General Life Issues [7.774933303698165]
The prevalence of depression among adolescents is steadily increasing.
Traditional diagnostic methods, which rely on scales or interviews, prove particularly inadequate for detecting depression in young people.
We introduce a method for managing severely imbalanced high-dimensional data and an adaptive predictive approach tailored to data structure characteristics.
arXiv Detail & Related papers (2024-01-06T09:14:25Z)
- Conceptualizing Suicidal Behavior: Utilizing Explanations of Predicted Outcomes to Analyze Longitudinal Social Media Data [2.76101452577748]
The COVID-19 pandemic has escalated mental health crises worldwide.
Suicide can result from social factors such as shame, abuse, abandonment, and mental health conditions like depression.
As these conditions develop, signs of suicidal ideation may manifest in social media interactions.
arXiv Detail & Related papers (2023-12-13T17:15:12Z)
- An Annotated Dataset for Explainable Interpersonal Risk Factors of Mental Disturbance in Social Media Posts [0.0]
We construct and release a new annotated dataset with human-labelled explanations and classification of Interpersonal Risk Factors (IRF) affecting mental disturbance on social media.
We establish baseline models on our dataset, facilitating future research directions to develop real-time personalized AI models by detecting patterns of Thwarted Belongingness (TBe) and Perceived Burdensomeness (PBu) in the emotional spectrum of a user's historical social media profile.
arXiv Detail & Related papers (2023-05-30T04:08:40Z)
- Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data [74.60507696087966]
Mental health conditions remain underdiagnosed even in countries with common access to advanced medical care.
One promising data source to help monitor human behavior is daily smartphone usage.
We study behavioral markers of daily mood using a recent dataset of mobile behaviors from adolescent populations at high risk of suicidal behaviors.
arXiv Detail & Related papers (2021-06-24T17:46:03Z)
- Epidemic mitigation by statistical inference from contact tracing data [61.04165571425021]
We develop Bayesian inference methods to estimate the risk that an individual is infected.
We propose to use probabilistic risk estimation in order to optimize testing and quarantining strategies for the control of an epidemic.
Our approaches translate into fully distributed algorithms that only require communication between individuals who have recently been in contact.
arXiv Detail & Related papers (2020-09-20T12:24:45Z)
- Anxiety Detection Leveraging Mobile Passive Sensing [53.11661460916551]
Anxiety disorders are the most common class of psychiatric problems affecting both children and adults.
Leveraging passive and unobtrusive data collection from smartphones could be a viable alternative to classical methods.
eWellness is an experimental mobile application designed to track a full suite of sensor and user-log data from an individual's device in a continuous and passive manner.
arXiv Detail & Related papers (2020-08-09T20:22:52Z)
- COVI White Paper [67.04578448931741]
Contact tracing is an essential tool to change the course of the Covid-19 pandemic.
We present an overview of the rationale, design, ethical considerations and privacy strategy of COVI, a Covid-19 public peer-to-peer contact tracing and risk awareness mobile application developed in Canada.
arXiv Detail & Related papers (2020-05-18T07:40:49Z)