CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in
Social Media Posts
- URL: http://arxiv.org/abs/2207.04674v1
- Date: Mon, 11 Jul 2022 07:38:18 GMT
- Authors: Muskan Garg, Chandni Saxena, Veena Krishnan, Ruchi Joshi, Sriparna
Saha, Vijay Mago, Bonnie J Dorr
- Abstract summary: We introduce a new dataset for Causal Analysis of Mental health issues in Social media posts (CAMS).
Our contributions for causal analysis are two-fold: causal interpretation and causal categorization.
We present experimental results of models learned from the CAMS dataset and demonstrate that a classic Logistic Regression model outperforms the next best (CNN-LSTM) model by 4.9% in accuracy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The research community has witnessed substantial growth in the detection of
mental health issues and their associated reasons from analysis of social
media. We introduce a new dataset for Causal Analysis of Mental health issues
in Social media posts (CAMS). Our contributions for causal analysis are
two-fold: causal interpretation and causal categorization. We introduce an
annotation schema for this task of causal analysis. We demonstrate the efficacy
of our schema on two different datasets: (i) crawling and annotating 3155
Reddit posts and (ii) re-annotating the publicly available SDCNL dataset of
1896 instances for interpretable causal analysis. We further combine these into
the CAMS dataset and make this resource publicly available along with
associated source code: https://github.com/drmuskangarg/CAMS. We present
experimental results of models learned from the CAMS dataset and demonstrate that a
classic Logistic Regression model outperforms the next best (CNN-LSTM) model by
4.9% in accuracy.
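The abstract reports that a classic Logistic Regression model was the strongest baseline for causal categorization. As a rough illustration of that kind of model, and not the authors' implementation, features, or hyperparameters (those are in the linked repository), a minimal multinomial logistic regression over bag-of-words counts might look like the sketch below. The toy posts, the three label names, and all function names are hypothetical and were invented for this example.

```python
import math
from collections import Counter

# Hypothetical toy posts and cause labels, invented for illustration only;
# they are NOT drawn from the CAMS dataset.
TRAIN = [
    ("i lost my job and cannot pay rent", "jobs"),
    ("my boss fired me after the interview", "jobs"),
    ("my partner left me and i feel alone", "relationship"),
    ("we broke up and my family stopped calling", "relationship"),
    ("the new pills make me feel numb", "medication"),
    ("my doctor changed my dose of medication", "medication"),
]


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def build_vocab(texts):
    """Map every distinct training word to a feature index."""
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i for i, w in enumerate(words)}


def featurize(text, vocab):
    """Bag-of-words counts plus a trailing constant bias feature."""
    counts = Counter(w for w in tokenize(text) if w in vocab)
    x = [0.0] * (len(vocab) + 1)
    for w, c in counts.items():
        x[vocab[w]] = float(c)
    x[-1] = 1.0
    return x


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, x):
    """One linear score per class."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in W]


def train(data, epochs=200, lr=0.1):
    """Fit one weight vector per class with plain SGD on cross-entropy."""
    vocab = build_vocab([t for t, _ in data])
    labels = sorted({y for _, y in data})
    X = [featurize(t, vocab) for t, _ in data]
    Y = [labels.index(y) for _, y in data]
    W = [[0.0] * (len(vocab) + 1) for _ in labels]
    for _ in range(epochs):
        for x, y in zip(X, Y):
            probs = softmax(class_scores(W, x))
            for k, w in enumerate(W):
                # Gradient of cross-entropy w.r.t. the class-k score.
                grad = probs[k] - (1.0 if k == y else 0.0)
                for j, x_j in enumerate(x):
                    if x_j:
                        w[j] -= lr * grad * x_j
    return vocab, labels, W


def predict(text, model):
    """Return the label with the highest linear score."""
    vocab, labels, W = model
    scores = class_scores(W, featurize(text, vocab))
    return labels[scores.index(max(scores))]


model = train(TRAIN)
train_accuracy = sum(predict(t, model) == y for t, y in TRAIN) / len(TRAIN)
```

Calling `predict("i was fired from my job", model)` classifies an unseen post using only the words seen during training. A realistic experiment on CAMS would need a held-out test split, regularization, and stronger features, none of which this sketch attempts.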
Related papers
- Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents [64.43980129731587]
We propose a causal-inspired inference-time debiasing method called Causal Diagnosis and Correction (CDC).
CDC first diagnoses the bias effect of the perplexity and then separates the bias effect from the overall relevance score.
Experimental results across three domains demonstrate the superior debiasing effectiveness.
arXiv Detail & Related papers (2025-03-11T17:59:00Z)
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims, which is not directly observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z)
- Question-Answering Model for Schizophrenia Symptoms and Their Impact on Daily Life using Mental Health Forums Data [0.0]
The "Mental Health" forum, which is dedicated to people suffering from schizophrenia and other mental disorders, was used.
It is shown how to pre-process the dataset to convert it into a QA dataset.
The BiBERT, DistilBERT, RoBERTa, and BioBERT models were fine-tuned and evaluated via F1-Score, Exact Match, Precision and Recall.
arXiv Detail & Related papers (2023-09-30T17:50:50Z)
- MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models [28.62967557368565]
We build the first multi-task and multi-source interpretable mental health instruction dataset on social media, with 105K data samples.
We use expert-written few-shot prompts and collected labels to prompt ChatGPT and obtain explanations from its responses.
Based on the IMHI dataset and LLaMA2 foundation models, we train MentalLLaMA, the first open-source LLM series for interpretable mental health analysis.
arXiv Detail & Related papers (2023-09-24T06:46:08Z)
- Discovering Mental Health Research Topics with Topic Modeling [13.651763262606782]
This study aims to identify general trends in the field and pinpoint high-impact research topics by analyzing a large dataset of mental health research papers.
Our dataset comprises 96,676 research papers pertaining to mental health, enabling us to examine the relationships between different topics using their abstracts.
To enhance our analysis, we also generated word clouds to provide a comprehensive overview of the machine learning models applied in mental health research.
arXiv Detail & Related papers (2023-08-25T05:25:05Z)
- Multi-class Categorization of Reasons behind Mental Disturbance in Long Texts [0.0]
We use Longformer to handle the problem of finding causal indicators behind mental illness in self-reported text.
Experiments show that Longformer achieves a new state-of-the-art F1-score of 62% on M-CAMS, a publicly available dataset.
We believe our work facilitates causal analysis of depression and suicide risk on social media data, and shows potential for application on other mental health conditions.
arXiv Detail & Related papers (2023-04-08T22:44:32Z)
- Explainable Causal Analysis of Mental Health on Social Media Data [0.0]
Multi-class causal categorization of mental health issues on social media faces a major challenge: wrong predictions.
Inconsistency among causal explanations/inappropriate human-annotated inferences in the dataset.
In this work, we find the reason behind the inconsistency in accuracy of multi-class causal categorization.
arXiv Detail & Related papers (2022-10-16T03:34:47Z)
- Causal Intervention Improves Implicit Sentiment Analysis [67.43379729099121]
We propose a causal intervention model for Implicit Sentiment Analysis using Instrumental Variable (ISAIV).
We first review sentiment analysis from a causal perspective and analyze the confounders existing in this task.
Then, we introduce an instrumental variable to eliminate the confounding causal effects, thus extracting the pure causal effect between sentence and sentiment.
arXiv Detail & Related papers (2022-08-19T13:17:57Z)
- Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models [50.537859423741644]
Training a model on an imbalanced dataset can introduce unique challenges to the learning problem.
We look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features.
arXiv Detail & Related papers (2022-04-04T09:38:38Z)
- A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict-based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
- Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately.
Our approach should be preferred if the goal is to select as many relevant predictors as possible.
Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.