Related papers: DECK: Behavioral Tests to Improve Interpretability and Generalizability of BERT Models Detecting Depression from Text

URL: http://arxiv.org/abs/2209.05286v1
Date: Mon, 12 Sep 2022 14:39:46 GMT
Title: DECK: Behavioral Tests to Improve Interpretability and Generalizability of BERT Models Detecting Depression from Text
Authors: Jekaterina Novikova, Ksenia Shkaruta
Abstract summary: Models that accurately detect depression from text are important tools for addressing the post-pandemic mental health crisis. BERT-based classifiers' promising performance and the off-the-shelf availability make them great candidates for this task. We introduce the DECK (DEpression ChecKlist), depression-specific model behavioural tests that allow better interpretability.
Score: 4.269268432906194
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Models that accurately detect depression from text are important tools for addressing the post-pandemic mental health crisis. BERT-based classifiers' promising performance and the off-the-shelf availability make them great candidates for this task. However, these models are known to suffer from performance inconsistencies and poor generalization. In this paper, we introduce the DECK (DEpression ChecKlist), depression-specific model behavioural tests that allow better interpretability and improve generalizability of BERT classifiers in the depression domain. We create 23 tests to evaluate BERT, RoBERTa and ALBERT depression classifiers on three datasets, two Twitter-based and one clinical interview-based. Our evaluation shows that these models: 1) are robust to certain gender-sensitive variations in text; 2) rely on the important depressive language marker of the increased use of first person pronouns; 3) fail to detect some other depression symptoms like suicidal ideation. We also demonstrate that DECK tests can be used to incorporate symptom-specific information in the training data and consistently improve generalizability of all three BERT models, with an out-of-distribution F1-score increase of up to 53.93%.

Related papers

Leveraging Large Language Models for Cost-Effective, Multilingual Depression Detection and Severity Assessment [0.7373617024876725]
DeepSeek V3 is the most reliable and cost-effective model for depression detection. The model maintains consistently high AUCs for detecting depression in complex diagnostic scenarios.
arXiv Detail & Related papers (2025-04-07T09:58:19Z)
A BERT-Based Summarization approach for depression detection [1.7363112470483526]
Depression is a globally prevalent mental disorder with potentially severe repercussions if not addressed. Machine learning and artificial intelligence can autonomously detect depression indicators from diverse data sources. Our study proposes text summarization as a preprocessing technique to diminish the length and intricacies of input texts.
arXiv Detail & Related papers (2024-09-13T02:14:34Z)
The Relationship Between Speech Features Changes When You Get Depressed: Feature Correlations for Improving Speed and Performance of Depression Detection [69.88072583383085]
This work shows that depression changes the correlation between features extracted from speech. Using such an insight can improve the training speed and performance of depression detectors based on SVMs and LSTMs.
arXiv Detail & Related papers (2023-07-06T09:54:35Z)
Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world. We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique. By further elaborating the robustness metric, a model is judged to be robust if its performance is consistently accurate on the overall cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
Depression detection in social media posts using affective and social norm features [84.12658971655253]
We propose a deep architecture for depression detection from social media posts. We incorporate profanity and morality features of posts and words in our architecture using a late fusion scheme. The inclusion of the proposed features yields state-of-the-art results in both settings.
arXiv Detail & Related papers (2023-03-24T21:26:27Z)
Semantic Similarity Models for Depression Severity Estimation [53.72188878602294]
This paper presents an efficient semantic pipeline to study depression severity in individuals based on their social media writings. We use test user sentences for producing semantic rankings over an index of representative training sentences corresponding to depressive symptoms and severity levels. We evaluate our methods on two Reddit-based benchmarks, achieving 30% improvement over state of the art in terms of measuring depression severity.
arXiv Detail & Related papers (2022-11-14T18:47:26Z)
Deep Temporal Modelling of Clinical Depression through Social Media Text [1.513693945164213]
We develop a model to detect user-level clinical depression based on a user's temporal social media posts. Our model uses a Depression Symptoms Detection (DSD) classifier, which is trained on the largest existing samples of clinician-annotated tweets for clinical depression symptoms.
arXiv Detail & Related papers (2022-10-28T18:31:52Z)
DEPTWEET: A Typology for Social Media Texts to Detect Depression Severities [0.46796109436086664]
We leverage the clinical articulation of depression to build a typology for social media texts for detecting the severity of depression. It emulates the standard clinical assessment procedure Diagnostic and Statistical Manual of Mental Disorders (DSM-5) and Patient Health Questionnaire (PHQ-9). We present a new dataset of 40191 tweets labeled by expert annotators. Each tweet is labeled as 'non-depressed' or 'depressed'.
arXiv Detail & Related papers (2022-10-10T08:23:57Z)
Depression Symptoms Modelling from Social Media Text: An Active Learning Approach [1.513693945164213]
We describe an Active Learning framework which uses an initial supervised learning model. We harvest depression-symptom-related samples from our large self-curated Depression Tweets Repository. We show that we can produce a final dataset which is the largest of its kind.
arXiv Detail & Related papers (2022-09-06T18:41:57Z)
Does BERT Pretrained on Clinical Notes Reveal Sensitive Data? [70.3631443249802]
We design a battery of approaches intended to recover Personal Health Information from a trained BERT. Specifically, we attempt to recover patient names and conditions with which they are associated. We find that simple probing methods are not able to meaningfully extract sensitive information from BERT trained over the MIMIC-III corpus of EHR.
arXiv Detail & Related papers (2021-04-15T20:40:05Z)
Deep Multi-task Learning for Depression Detection and Prediction in Longitudinal Data [50.02223091927777]
Depression is among the most prevalent mental disorders, affecting millions of people of all ages globally. Machine learning techniques have been shown to be effective in enabling automated detection and prediction of depression for early intervention and treatment. We introduce a novel deep multi-task recurrent neural network to tackle this challenge, in which depression classification is jointly optimized with two auxiliary tasks.
arXiv Detail & Related papers (2020-12-05T05:14:14Z)
Generalized Dilated CNN Models for Depression Detection Using Inverted Vocal Tract Variables [4.050982413149992]
Depression detection using vocal biomarkers is a highly researched area. Findings of existing studies are mostly validated on a single database, which limits the generalizability of results. We propose to develop a generalized classifier for depression detection using a dilated Convolutional Neural Network.
arXiv Detail & Related papers (2020-11-13T03:12:36Z)
An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text [72.62848911347466]
Unstructured clinical text in EHRs contains crucial information for applications including decision support, trial matching, and retrospective research. Recent work has applied BERT-based models to clinical information extraction and text classification, given these models' state-of-the-art performance in other NLP domains. In this work, we propose a novel fine-tuning approach called SnipBERT. Instead of using entire notes, SnipBERT identifies crucial snippets and feeds them into a truncated BERT-based model in a hierarchical manner.
arXiv Detail & Related papers (2020-11-12T17:14:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.