Depression detection in social media posts using transformer-based models and auxiliary features
- URL: http://arxiv.org/abs/2409.20048v1
- Date: Mon, 30 Sep 2024 07:53:39 GMT
- Title: Depression detection in social media posts using transformer-based models and auxiliary features
- Authors: Marios Kerasiotis, Loukas Ilias, Dimitris Askounis
- Abstract summary: Detection of depression in social media posts is crucial due to the increasing prevalence of mental health issues.
Traditional machine learning algorithms often fail to capture intricate textual patterns, limiting their effectiveness in identifying depression.
This research proposes a neural network architecture leveraging transformer-based models combined with metadata and linguistic markers.
- Score: 6.390468088226495
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The detection of depression in social media posts is crucial due to the increasing prevalence of mental health issues. Traditional machine learning algorithms often fail to capture intricate textual patterns, limiting their effectiveness in identifying depression. Existing studies have explored various approaches to this problem but often fall short in terms of accuracy and robustness. To address these limitations, this research proposes a neural network architecture leveraging transformer-based models combined with metadata and linguistic markers. The study employs DistilBERT, extracting information from the last four layers of the transformer, applying learned weights, and averaging them to create a rich representation of the input text. This representation, augmented by metadata and linguistic markers, enhances the model's comprehension of each post. Dropout layers prevent overfitting, and a Multilayer Perceptron (MLP) is used for final classification. Data augmentation techniques, inspired by the Easy Data Augmentation (EDA) methods, are also employed to improve model performance. Using BERT, random insertion and substitution of phrases generate additional training data, focusing on balancing the dataset by augmenting underrepresented classes. The proposed model achieves weighted Precision, Recall, and F1-scores of 84.26%, 84.18%, and 84.15%, respectively. The augmentation techniques significantly enhance model performance, increasing the weighted F1-score from 72.59% to 84.15%.
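To make the described architecture concrete, here is a minimal PyTorch sketch of the core idea: a learned weighted average of DistilBERT's last four hidden layers, concatenated with an auxiliary metadata/linguistic-marker vector, followed by dropout and an MLP classifier. The pooling choice (first-token representation), the auxiliary feature dimension, and the MLP sizes are illustrative assumptions, not the authors' exact configuration.
```python
# Minimal sketch (not the authors' code): learned weighted average of DistilBERT's
# last four hidden layers, concatenated with auxiliary metadata/linguistic features,
# then dropout + MLP classification head.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class DepressionClassifier(nn.Module):
    def __init__(self, aux_dim: int = 16, num_classes: int = 2, dropout: float = 0.3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("distilbert-base-uncased")
        hidden = 768                                      # DistilBERT hidden size
        self.layer_weights = nn.Parameter(torch.ones(4))  # learned weights, last 4 layers
        self.dropout = nn.Dropout(dropout)
        self.mlp = nn.Sequential(                         # illustrative MLP head
            nn.Linear(hidden + aux_dim, 256),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(256, num_classes),
        )

    def forward(self, input_ids, attention_mask, aux_features):
        out = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask,
                           output_hidden_states=True)
        # hidden_states: tuple of [batch, seq, hidden] tensors, one per layer
        last4 = torch.stack(out.hidden_states[-4:], dim=0)       # [4, batch, seq, hidden]
        w = torch.softmax(self.layer_weights, dim=0).view(4, 1, 1, 1)
        pooled = (w * last4).sum(dim=0)[:, 0, :]                 # weighted average, first token
        combined = torch.cat([self.dropout(pooled), aux_features], dim=-1)
        return self.mlp(combined)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
batch = tokenizer(["I feel exhausted and empty lately."],
                  return_tensors="pt", padding=True, truncation=True)
aux = torch.zeros(1, 16)  # placeholder metadata / linguistic-marker vector
logits = DepressionClassifier()(batch["input_ids"], batch["attention_mask"], aux)
```
The BERT-driven augmentation described in the abstract (random substitution and insertion of words, in the spirit of EDA) can likewise be approximated with a masked-language-model fill-in; the masking strategy and model choice below are also assumptions made for illustration.
```python
# Illustrative EDA-style substitution (not the authors' pipeline): mask a random
# token and let a masked language model propose a replacement.
import random
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def substitute_random_token(text: str) -> str:
    tokens = text.split()
    if len(tokens) < 2:
        return text
    i = random.randrange(len(tokens))
    masked = " ".join(tokens[:i] + [fill_mask.tokenizer.mask_token] + tokens[i + 1:])
    best = fill_mask(masked, top_k=1)[0]          # highest-scoring replacement
    tokens[i] = best["token_str"].strip()
    return " ".join(tokens)

print(substitute_random_token("I feel exhausted and empty lately"))
```
Random insertion could be handled analogously by placing the mask token at a new position rather than replacing an existing word; restricting the generated posts to underrepresented classes is then a dataset-level decision, as the abstract indicates.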
Related papers
- Granularity Matters in Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.
We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.
To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z)
- How Does Data Diversity Shape the Weight Landscape of Neural Networks? [2.89287673224661]
We investigate the impact of dropout, weight decay, and noise augmentation on the parameter space of neural networks.
We observe that diverse data influences the weight landscape in a similar fashion as dropout.
We conclude that synthetic data can bring more diversity into real input data, resulting in better performance on out-of-distribution test instances.
arXiv Detail & Related papers (2024-10-18T16:57:05Z)
- Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models [66.1595537904019]
Large language models (LLMs) can act as gradient priors in a zero-shot setting.
We introduce LM-GC, a novel method that integrates LLMs with arithmetic coding.
arXiv Detail & Related papers (2024-09-26T13:38:33Z)
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- Enhancing Depressive Post Detection in Bangla: A Comparative Study of TF-IDF, BERT and FastText Embeddings [0.0]
This study introduces a well-grounded approach to identify depressive social media posts in Bangla.
The dataset used in this work, annotated by domain experts, includes both depressive and non-depressive posts.
To address the issue of class imbalance, we utilised random oversampling for the minority class.
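Random oversampling of the minority class, as used here, amounts to duplicating minority examples until the classes are balanced; a generic sketch with the imbalanced-learn library (an assumption for illustration, not the paper's code) follows.
```python
# Generic minority-class random oversampling (illustrative; not the paper's code).
from collections import Counter
from imblearn.over_sampling import RandomOverSampler

X = [[0.1], [0.2], [0.3], [0.9]]   # toy feature vectors (e.g. TF-IDF rows in practice)
y = [0, 0, 0, 1]                   # imbalanced labels: 3 non-depressive vs 1 depressive
X_res, y_res = RandomOverSampler(random_state=42).fit_resample(X, y)
print(Counter(y_res))              # Counter({0: 3, 1: 3}) -- classes now balanced
```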
arXiv Detail & Related papers (2024-07-12T11:40:17Z)
- Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction [53.88231294380083]
We introduce a novel Multi-Epoch learning with Data Augmentation (MEDA) framework, suitable for both non-continual and continual learning scenarios.
MEDA minimizes overfitting by reducing the dependency of the embedding layer on subsequent training data.
Our findings confirm that pre-trained layers can adapt to new embedding spaces, enhancing performance without overfitting.
arXiv Detail & Related papers (2024-06-27T04:00:15Z)
- Calibration of Transformer-based Models for Identifying Stress and Depression in Social Media [0.0]
We present the first study in the task of depression and stress detection in social media that injects extra linguistic information into transformer-based models.
Specifically, the proposed approach employs a Multimodal Adaptation Gate for creating the combined embeddings, which are given as input to a BERT (or MentalBERT) model.
We evaluate our proposed approaches on three publicly available datasets and demonstrate that integrating linguistic features into transformer-based models yields a surge in performance.
arXiv Detail & Related papers (2023-05-26T10:19:04Z)
- To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We then examine the key factors contributing to multi-epoch degradation, finding that dataset size, model parameters, and training objectives are all significant.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
- Transformer with Selective Shuffled Position Embedding and Key-Patch Exchange Strategy for Early Detection of Knee Osteoarthritis [7.656764569447645]
Knee OsteoArthritis (KOA) is a widespread musculoskeletal disorder that can severely impact the mobility of older individuals.
Insufficient medical data, owing to the high cost of data labelling, presents a significant obstacle to training models effectively.
We propose a novel approach based on the Vision Transformer (ViT) model with original Selective Shuffled Position Embedding (SSPE) and key-patch exchange strategies.
arXiv Detail & Related papers (2023-04-17T15:26:42Z)
- A Study on FGSM Adversarial Training for Neural Retrieval [3.2634122554914]
Neural retrieval models have acquired significant effectiveness gains over the last few years compared to term-based methods.
However, these models may be brittle when faced with typos or distribution shifts, and vulnerable to malicious attacks.
We show that one of the simplest adversarial training techniques -- the Fast Gradient Sign Method (FGSM) -- can improve the robustness and effectiveness of first-stage rankers.
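The FGSM idea mentioned here is simple enough to sketch generically: take one step along the sign of the loss gradient with respect to the input (here, token embeddings) and train on both the clean and the perturbed inputs. The sketch below assumes a classification-style model that accepts precomputed embeddings and returns logits; it illustrates the technique, not the paper's retrieval setup.
```python
# Generic FGSM adversarial training step on token embeddings (illustration only;
# assumes model(inputs_embeds=...) returns logits and embed_layer maps ids to embeddings).
import torch
import torch.nn.functional as F

def fgsm_training_step(model, embed_layer, input_ids, labels, optimizer, eps=1e-3):
    # 1) Estimate the gradient of the loss w.r.t. the token embeddings.
    embeds = embed_layer(input_ids).detach().requires_grad_(True)
    loss = F.cross_entropy(model(inputs_embeds=embeds), labels)
    grad = torch.autograd.grad(loss, embeds)[0]

    # 2) FGSM perturbation: one step along the sign of the gradient.
    adv_embeds = (embeds + eps * grad.sign()).detach()

    # 3) Train on clean and adversarial inputs together.
    optimizer.zero_grad()
    loss_clean = F.cross_entropy(model(inputs_embeds=embed_layer(input_ids)), labels)
    loss_adv = F.cross_entropy(model(inputs_embeds=adv_embeds), labels)
    (loss_clean + loss_adv).backward()
    optimizer.step()
    return loss_clean.item(), loss_adv.item()
```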
arXiv Detail & Related papers (2023-01-25T13:28:54Z)
- Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production.
One popular approach is to recast this problem as a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs).
We propose a novel framework that incorporates rectified meta-learning module into common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)