Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss
- URL: http://arxiv.org/abs/2507.11384v1
- Date: Tue, 15 Jul 2025 14:53:33 GMT
- Title: Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss
- Authors: Xia Cui
- Abstract summary: This paper explores the application of a simple weighted loss function to Transformer-based models for multi-label emotion detection. We evaluate BERT, RoBERTa, and BART on the BRIGHTER dataset, using evaluation metrics such as Micro F1, Macro F1, ROC-AUC, Accuracy, and Jaccard similarity coefficients. The results demonstrate that the weighted loss function improves performance on high-frequency emotion classes but shows limited impact on minority classes.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper explores the application of a simple weighted loss function to Transformer-based models for multi-label emotion detection in SemEval-2025 Shared Task 11. Our approach addresses data imbalance by dynamically adjusting class weights, thereby enhancing performance on minority emotion classes without the computational burden of traditional resampling methods. We evaluate BERT, RoBERTa, and BART on the BRIGHTER dataset, using evaluation metrics such as Micro F1, Macro F1, ROC-AUC, Accuracy, and Jaccard similarity coefficients. The results demonstrate that the weighted loss function improves performance on high-frequency emotion classes but shows limited impact on minority classes. These findings underscore both the effectiveness and the challenges of applying this approach to imbalanced multi-label emotion detection.
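For intuition, a minimal sketch of this kind of weighted loss, assuming per-class positive weights derived from label frequencies (the paper's exact weighting scheme and any dynamic adjustment schedule may differ):

```python
# Illustrative sketch only: a frequency-based weighted loss for
# multi-label emotion detection. The pos_weight = negatives/positives
# heuristic is an assumption, not necessarily the paper's exact scheme.
import torch
import torch.nn as nn

def make_weighted_bce(train_labels: torch.Tensor) -> nn.BCEWithLogitsLoss:
    """train_labels: (num_examples, num_classes) binary label matrix."""
    pos = train_labels.float().sum(dim=0)     # positives per class
    neg = train_labels.shape[0] - pos         # negatives per class
    pos_weight = neg / pos.clamp(min=1.0)     # upweight rare emotions
    return nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Usage with any Transformer encoder emitting one logit per emotion:
# loss_fn = make_weighted_bce(train_labels)
# loss = loss_fn(logits, batch_labels.float())  # logits: (batch, classes)
```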
Related papers
- WSS-CL: Weight Saliency Soft-Guided Contrastive Learning for Efficient Machine Unlearning Image Classification
We introduce a new two-phase machine unlearning method for efficient image classification based on weight saliency. We call the method weight saliency soft-guided contrastive learning for efficient machine unlearning image classification (WSS-CL). It yields much-improved unlearning efficacy with negligible performance loss compared to state-of-the-art approaches.
arXiv Detail & Related papers (2025-08-06T10:47:36Z)
- EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification
We propose Edge Patching with GradPath (EAP-GP) to address the saturation effect. EAP-GP introduces an integration path that starts from the input and adaptively follows the direction of the difference between the gradients of corrupted and clean inputs, avoiding the saturated region. We evaluate EAP-GP on 6 datasets using GPT-2 Small, GPT-2 Medium, and GPT-2 XL.
arXiv Detail & Related papers (2025-02-07T16:04:57Z)
- Energy Score-based Pseudo-Label Filtering and Adaptive Loss for Imbalanced Semi-supervised SAR target recognition
Existing semi-supervised SAR ATR algorithms show low recognition accuracy in the case of class imbalance.
This work offers a class-imbalanced semi-supervised SAR target recognition approach using dynamic energy scores and an adaptive loss.
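A rough sketch of the energy-score filtering idea, using the standard free-energy definition E(x) = -T·logsumexp(f(x)/T); the paper's dynamic thresholding and adaptive loss are not reproduced here:

```python
# Hedged sketch: filter pseudo-labels by energy score. Low energy means
# the classifier is confident, so the pseudo-label is more trustworthy.
# The fixed threshold stands in for the paper's dynamic scheme.
import torch

def filter_pseudo_labels(logits: torch.Tensor, threshold: float, T: float = 1.0):
    energy = -T * torch.logsumexp(logits / T, dim=-1)  # (batch,)
    keep = energy < threshold                          # confident samples
    return logits.argmax(dim=-1)[keep], keep
```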
arXiv Detail & Related papers (2024-11-06T14:45:16Z)
- Improving Arabic Multi-Label Emotion Classification using Stacked Embeddings and Hybrid Loss Function
This study uses stacked embeddings, meta-learning, and a hybrid loss function to enhance multi-label emotion classification for the Arabic language.
To further improve performance, a hybrid loss function is introduced, incorporating class weighting, label correlation, and contrastive learning.
Experiments validate the proposed model's performance across key metrics such as Precision, Recall, F1-Score, Jaccard Accuracy, and Hamming Loss.
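A toy sketch of such a hybrid loss, combining class-weighted BCE with a supervised contrastive term over examples that share at least one label; alpha, tau, the positive-pair definition, and the use of pooled encoder features are illustrative assumptions (the label-correlation component is omitted):

```python
# Toy sketch of a hybrid loss: weighted BCE plus a supervised
# contrastive term over examples sharing at least one emotion label.
import torch
import torch.nn.functional as F

def hybrid_loss(logits, features, labels, pos_weight, alpha=0.5, tau=0.1):
    # Class-weighted BCE handles per-label imbalance.
    bce = F.binary_cross_entropy_with_logits(
        logits, labels.float(), pos_weight=pos_weight)
    # Contrastive term: pull together examples with shared labels.
    feats = F.normalize(features, dim=-1)
    sim = feats @ feats.T / tau                      # (batch, batch)
    self_mask = torch.eye(len(sim), dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(self_mask, float("-inf"))  # exclude self-pairs
    positives = (labels.float() @ labels.float().T) > 0
    positives = positives & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=-1, keepdim=True)
    per_anchor = -log_prob.masked_fill(~positives, 0.0).sum(-1)
    contrastive = (per_anchor / positives.sum(-1).clamp(min=1)).mean()
    return bce + alpha * contrastive
```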
arXiv Detail & Related papers (2024-10-04T23:37:21Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose ReScore, a model-agnostic framework that boosts causal discovery performance by dynamically learning adaptive sample weights for a reweighted score function.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Fraud Detection Using Optimized Machine Learning Tools Under Imbalance Classes
Fraud detection with optimized machine learning (ML) tools is essential to assure safety.
We investigate four state-of-the-art ML techniques, namely, logistic regression, decision trees, random forest, and extreme gradient boosting.
On phishing website URL and credit card fraud transaction datasets, the results indicate that extreme gradient boosting trained on the original data shows trustworthy performance.
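A minimal, scikit-learn-only comparison in this spirit (synthetic data; HistGradientBoostingClassifier standing in for XGBoost) might look like:

```python
# Illustrative only: compare the four model families under class
# imbalance, with class_weight="balanced" where the API supports it.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a fraud dataset: ~3% positive class.
X, y = make_classification(n_samples=5000, weights=[0.97], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logistic regression": LogisticRegression(class_weight="balanced", max_iter=1000),
    "decision tree": DecisionTreeClassifier(class_weight="balanced"),
    "random forest": RandomForestClassifier(class_weight="balanced"),
    "gradient boosting": HistGradientBoostingClassifier(),  # original data, no reweighting
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: F1 = {f1_score(y_te, model.predict(X_te)):.3f}")
```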
arXiv Detail & Related papers (2022-09-04T15:30:23Z)
- Scale-Equivalent Distillation for Semi-Supervised Object Detection
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, generating hard pseudo-labels by a teacher model on unlabeled data as supervisory signals.
We analyze the challenges these methods face through empirical experiments.
We introduce a novel approach, Scale-Equivalent Distillation (SED), which is a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
arXiv Detail & Related papers (2022-03-23T07:33:37Z)
- FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes
We propose a two-stage training algorithm named FAIRIF.
It minimizes the loss over a reweighted data set, where the sample weights are computed via influence functions using a validation set with sensitive attributes.
We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z)
- Alleviating Noisy-label Effects in Image Classification via Probability Transition Matrix
Deep-learning-based image classification frameworks often suffer from the noisy-label problem caused by inter-observer variation.
We propose a plugin module, namely noise ignoring block (NIB), to separate the hard samples from the mislabeled ones.
Our NIB module consistently improves the performance of state-of-the-art robust training methods.
arXiv Detail & Related papers (2021-10-17T17:01:57Z)
- Noise-resistant Deep Metric Learning with Ranking-based Instance Selection
We propose a noise-resistant training technique for deep metric learning (DML), which we name Probabilistic Ranking-based Instance Selection with Memory (PRISM).
PRISM identifies noisy data in a minibatch using average similarity against image features extracted from several previous versions of the neural network.
To alleviate the high computational cost brought by the memory bank, we introduce an acceleration method that replaces individual data points with the class centers.
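A heavily simplified sketch of the class-center variant: keep only the minibatch samples whose features are most similar to their own class center. The keep fraction and the use of cosine similarity are assumptions, and the memory-bank and center updates are omitted:

```python
# Toy sketch of class-center-based noisy-sample filtering.
import torch
import torch.nn.functional as F

def clean_sample_mask(features, labels, class_centers, keep_frac=0.8):
    """features: (batch, dim); labels: (batch,); class_centers: (C, dim)."""
    feats = F.normalize(features, dim=-1)
    centers = F.normalize(class_centers, dim=-1)
    sim = (feats * centers[labels]).sum(-1)  # cosine sim to own center
    k = max(1, int(keep_frac * sim.numel()))
    thresh = sim.topk(k).values[-1]          # k-th largest similarity
    return sim >= thresh                     # True = likely clean
```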
arXiv Detail & Related papers (2021-03-30T03:22:17Z)
- Attentional-Biased Stochastic Gradient Descent
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
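A loose sketch of the per-sample weighting step, assuming exponential weights normalized over the mini-batch; the temperature lam (positive to emphasize hard, often minority, samples; negative to downweight likely-noisy ones) follows the paper's description only informally:

```python
# Hedged sketch of attentional per-sample weighting in a mini-batch.
import torch
import torch.nn.functional as F

def attentional_weighted_loss(logits, targets, lam=1.0):
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    # Softmax over scaled losses; detach so weights are not differentiated.
    weights = torch.softmax(per_sample.detach() / lam, dim=0)
    return (weights * per_sample).sum()
```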
arXiv Detail & Related papers (2020-12-13T03:41:52Z)