PsihoRo: Depression and Anxiety Romanian Text Corpus
- URL: http://arxiv.org/abs/2602.18324v2
- Date: Mon, 23 Feb 2026 13:49:57 GMT
- Title: PsihoRo: Depression and Anxiety Romanian Text Corpus
- Authors: Alexandra Ciobotaru, Ana-Maria Bucur, Liviu P. Dinu
- Abstract summary: Psychological corpora in NLP are collections of texts used to analyze human psychology, emotions, and mental health. We have created the first corpus for depression and anxiety in Romanian, by utilizing a form with 6 open-ended questions. PsihoRo is a first step towards understanding and analyzing texts regarding the mental health of the Romanian population.
- Score: 46.53794328211006
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Psychological corpora in NLP are collections of texts used to analyze human psychology, emotions, and mental health. These texts allow researchers to study psychological constructs, detect mental health issues and analyze emotional language. However, mental health data can be difficult to collect correctly from social media, due to suppositions made by the collectors. A more pragmatic strategy involves gathering data through open-ended questions and then assessing this information with self-report screening surveys. This method was employed successfully for English, a language with a lot of psychological NLP resources. The same cannot be said for Romanian, which currently has no open-source mental health corpus. To address this gap, we have created the first corpus for depression and anxiety in Romanian, by utilizing a form with 6 open-ended questions along with the standardized PHQ-9 and GAD-7 screening questionnaires. Although it may seem small, consisting of the texts of 205 respondents, PsihoRo is a first step towards understanding and analyzing texts regarding the mental health of the Romanian population. We employ statistical analysis, text analysis using Romanian LIWC, emotion detection and topic modeling to present the most important features of this newly introduced resource to the NLP community.
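As a rough illustration of the screening step mentioned in the abstract, the sketch below totals PHQ-9 and GAD-7 answers and maps each total to the commonly published severity bands. It is not the authors' code: the function names and the example answers are hypothetical, and it assumes each item is answered on the usual 0 to 3 scale.

```python
from typing import List, Sequence, Tuple

# Commonly published cut-offs: (highest total in the band, severity label).
PHQ9_BANDS: List[Tuple[int, str]] = [
    (4, "minimal"), (9, "mild"), (14, "moderate"),
    (19, "moderately severe"), (27, "severe"),
]
GAD7_BANDS: List[Tuple[int, str]] = [
    (4, "minimal"), (9, "mild"), (14, "moderate"), (21, "severe"),
]


def score_questionnaire(
    answers: Sequence[int], n_items: int, bands: List[Tuple[int, str]]
) -> Tuple[int, str]:
    """Sum item answers (each 0..3) and return (total, severity label)."""
    if len(answers) != n_items:
        raise ValueError(f"expected {n_items} answers, got {len(answers)}")
    if any(a not in (0, 1, 2, 3) for a in answers):
        raise ValueError("each answer must be an integer from 0 to 3")
    total = sum(answers)
    for upper, label in bands:
        if total <= upper:
            return total, label
    raise AssertionError("unreachable: total exceeds the last band")


def score_phq9(answers: Sequence[int]) -> Tuple[int, str]:
    """Hypothetical helper: score the 9 PHQ-9 items (total 0..27)."""
    return score_questionnaire(answers, 9, PHQ9_BANDS)


def score_gad7(answers: Sequence[int]) -> Tuple[int, str]:
    """Hypothetical helper: score the 7 GAD-7 items (total 0..21)."""
    return score_questionnaire(answers, 7, GAD7_BANDS)


if __name__ == "__main__":
    # Made-up answers for one respondent, for illustration only.
    print(score_phq9([1, 2, 0, 1, 3, 0, 1, 2, 0]))  # (10, 'moderate')
    print(score_gad7([0, 1, 1, 0, 2, 0, 0]))        # (4, 'minimal')
```

Totals like these could serve as labels for the free-text answers. How the corpus actually stores or thresholds the scores is not stated here.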
Related papers
- Social Media for Mental Health: Data, Methods, and Findings [7.498939749404979]
This chapter studies the state-of-the-art research methodologies and findings on mental health challenges from the pervasive use of social media data. Specifically, this chapter describes linguistic, visual, and emotional indicators expressed in user disclosures. The main goal of this chapter is to show how this new source of data can be tapped to improve medical practice, provide timely support, and influence government or policymakers.
arXiv Detail & Related papers (2025-11-11T07:10:12Z)
- Reddit is all you need: Authorship profiling for Romanian [49.1574468325115]
Authorship profiling is the process of identifying an author's characteristics based on their writings. In this paper, we introduce a corpus of short texts in the Romanian language, annotated with certain author characteristic keywords.
arXiv Detail & Related papers (2024-10-13T16:27:31Z)
- Towards Understanding Emotions for Engaged Mental Health Conversations [1.3654846342364306]
We are developing a system to perform passive emotion-sensing using a combination of keystroke dynamics and sentiment analysis.
The analysis of short text messages and keyboard typing patterns can provide emotion information that may be used to support both clients and responders.
arXiv Detail & Related papers (2024-06-17T01:27:15Z)
- LLM Questionnaire Completion for Automatic Psychiatric Assessment [49.1574468325115]
We employ a Large Language Model (LLM) to convert unstructured psychological interviews into structured questionnaires spanning various psychiatric and personality domains.
The obtained answers are coded as features, which are used to predict standardized psychiatric measures of depression (PHQ-8) and PTSD (PCL-C).
arXiv Detail & Related papers (2024-06-09T09:03:11Z)
- CASE: Efficient Curricular Data Pre-training for Building Assistive Psychology Expert Models [1.0840985826142429]
This study explores the use of Natural Language Processing (NLP) pipelines to analyze text data from online mental health forums used for consultations.
By analyzing forum posts, these pipelines can flag users who may require immediate professional attention.
Case-BERT demonstrates superior performance compared to existing methods, achieving an F1 score of 0.91 for Depression and 0.88 for Anxiety.
arXiv Detail & Related papers (2024-06-01T06:17:32Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
- PsyQA: A Chinese Dataset for Generating Long Counseling Text for Mental Health Support [32.176949527607746]
We propose PsyQA, a Chinese dataset of psychological health support in the form of question-and-answer pairs.
PsyQA is crawled from a Chinese mental health service platform, and contains 22K questions and 56K long and well-structured answers.
arXiv Detail & Related papers (2021-06-03T09:06:25Z)
- Detecting Early Onset of Depression from Social Media Text using Learned Confidence Scores [19.86148958828238]
Depression is the second leading cause of death for young adults.
In this work, we focus on methods for detecting the early onset of depression from social media texts.
arXiv Detail & Related papers (2020-11-03T13:34:04Z)
- Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment [50.15466026089435]
We present a novel peer-to-peer Hindi conversation dataset, Vyaktitv.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features, like income, cultural orientation, amongst several others, for all the participants.
arXiv Detail & Related papers (2020-08-31T17:44:28Z)
- Pragmatic information in translation: a corpus-based study of tense and mood in English and German [70.3497683558609]
Grammatical tense and mood are important linguistic phenomena to consider in natural language processing (NLP) research.
We consider the correspondence between English and German tense and mood in translation.
Of particular importance is the challenge of modeling tense and mood in rule-based, phrase-based statistical and neural machine translation.
arXiv Detail & Related papers (2020-07-10T08:15:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.