Multi-class Categorization of Reasons behind Mental Disturbance in Long Texts
- URL: http://arxiv.org/abs/2304.04118v1
- Date: Sat, 8 Apr 2023 22:44:32 GMT
- Title: Multi-class Categorization of Reasons behind Mental Disturbance in Long Texts
- Authors: Muskan Garg
- Abstract summary: We use Longformer to handle the problem of finding causal indicators behind mental illness in self-reported text.
Experiments show that Longformer achieves new state-of-the-art results, a 62% F1-score, on M-CAMS, a publicly available dataset.
We believe our work facilitates causal analysis of depression and suicide risk on social media data, and shows potential for application on other mental health conditions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by recent advances in inferring users' mental states from
social media posts, we identify and formulate the problem of finding causal
indicators behind mental illness in self-reported text. Prior work offers
rule-based studies of causal explanation analysis on curated Facebook data.
Investigations of transformer-based models for multi-class causal
categorization of Reddit posts point to the problem of long texts, which
contain as many as 4000 words, while end-to-end transformer-based models are
subject to a maximum input length per instance. To handle this problem, we use
Longformer and deploy its encoding on a transformer-based classifier. The
experimental results show that Longformer achieves new state-of-the-art
results, a 62\% F1-score, on M-CAMS, a publicly available dataset.
Cause-specific analysis and an ablation study prove the effectiveness of
Longformer. We believe our work facilitates causal analysis of depression and
suicide risk on social media data, and shows potential for application to
other mental health conditions.
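The maximum-length limitation described above stems from full self-attention scaling quadratically with input length; Longformer sidesteps it with a sliding-window attention pattern that scales linearly. A minimal back-of-the-envelope sketch of that difference (the 4096-token limit and 512-token window are Longformer's standard pretrained defaults, assumed here rather than taken from the paper):

```python
# Rough cost of building the attention score matrix, counted in
# token-pair comparisons.
def full_attention_pairs(n_tokens: int) -> int:
    """Standard transformer: every token attends to every token, O(n^2)."""
    return n_tokens * n_tokens

def sliding_window_pairs(n_tokens: int, window: int) -> int:
    """Longformer-style local attention: each token attends only to a
    fixed-size window, so cost grows linearly with length, O(n * w)."""
    return n_tokens * window

n = 4096  # Longformer's default maximum input length
print(full_attention_pairs(512))     # 262144   -- a BERT-sized input
print(full_attention_pairs(n))       # 16777216 -- quadratic blow-up at 4096
print(sliding_window_pairs(n, 512))  # 2097152  -- linear in n
```

This is why a roughly 4000-word post, which overflows a standard 512-token encoder, fits in a single Longformer instance whose encoding can then be passed to a classification head.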
Related papers
- Mental Disorder Classification via Temporal Representation of Text [33.47304614659701]
Mental disorder prediction from social media posts is challenging due to the complexities of sequential text data.
We propose a novel framework which compresses the large sequence of chronologically ordered social media posts into a series of numbers.
We demonstrate the generalization capabilities of our framework by outperforming the current SOTA in three different mental conditions.
arXiv Detail & Related papers (2024-06-15T10:53:21Z)
- Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study [61.74571814707054]
We evaluate whether every generated sentence is grounded in retrieved documents or the model's pre-training data.
Across 3 datasets and 4 model families, our findings reveal that a significant fraction of generated sentences are consistently ungrounded.
Our results show that while larger models tend to ground their outputs more effectively, a significant portion of correct answers remains compromised by hallucinations.
arXiv Detail & Related papers (2024-04-10T14:50:10Z)
- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models [48.35385912526338]
This paper explores the impact of extending input lengths on the capabilities of Large Language Models (LLMs).
We isolate the effect of input length using multiple versions of the same sample, each being extended with padding of different lengths, types and locations.
We show that the degradation trend appears in every version of our dataset, although at different intensities.
arXiv Detail & Related papers (2024-02-19T16:04:53Z)
- Enhanced Labeling Technique for Reddit Text and Fine-Tuned Longformer Models for Classifying Depression Severity in English and Luganda [0.0]
This research extracts text from Reddit to facilitate the diagnostic process.
It employs a proposed labeling approach to categorize the text and subsequently fine-tunes the Longformer model.
Our findings reveal that the Longformer model outperforms the baseline models in both English (48%) and Luganda (45%).
arXiv Detail & Related papers (2024-01-25T15:28:07Z)
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims, which is not directly observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z)
- Explainable Depression Symptom Detection in Social Media [2.677715367737641]
We propose using transformer-based architectures to detect and explain the appearance of depressive symptom markers in the users' writings.
Our natural language explanations enable clinicians to interpret the models' decisions based on validated symptoms.
arXiv Detail & Related papers (2023-10-20T17:05:27Z)
- Semantic Similarity Models for Depression Severity Estimation [53.72188878602294]
This paper presents an efficient semantic pipeline to study depression severity in individuals based on their social media writings.
We use test user sentences for producing semantic rankings over an index of representative training sentences corresponding to depressive symptoms and severity levels.
We evaluate our methods on two Reddit-based benchmarks, achieving 30% improvement over state of the art in terms of measuring depression severity.
arXiv Detail & Related papers (2022-11-14T18:47:26Z)
- Explainable Causal Analysis of Mental Health on Social Media Data [0.0]
Multi-class causal categorization of mental health issues on social media faces a major challenge of wrong predictions, caused by inconsistency among causal explanations and inappropriate human-annotated inferences in the dataset.
In this work, we identify the reasons behind the inconsistent accuracy of multi-class causal categorization.
arXiv Detail & Related papers (2022-10-16T03:34:47Z)
- CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in Social Media Posts [17.853932382843222]
We introduce a new dataset for Causal Analysis of Mental health issues in Social media posts (CAMS).
Our contributions for causal analysis are two-fold: causal interpretation and causal categorization.
We present experimental results of models learned from CAMS dataset and demonstrate that a classic Logistic Regression model outperforms the next best (CNN-LSTM) model by 4.9% accuracy.
arXiv Detail & Related papers (2022-07-11T07:38:18Z)
- Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data [74.60507696087966]
Mental health conditions remain underdiagnosed even in countries with common access to advanced medical care.
One promising data source to help monitor human behavior is daily smartphone usage.
We study behavioral markers of daily mood using a recent dataset of mobile behaviors from adolescent populations at high risk of suicidal behaviors.
arXiv Detail & Related papers (2021-06-24T17:46:03Z)
- Sentiment Analysis Based on Deep Learning: A Comparative Study [69.09570726777817]
The study of public opinion can provide us with valuable information.
The efficiency and accuracy of sentiment analysis are hindered by challenges encountered in natural language processing.
This paper reviews the latest studies that have employed deep learning to solve sentiment analysis problems.
arXiv Detail & Related papers (2020-06-05T16:28:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.