Balancing Methods for Multi-label Text Classification with Long-Tailed
Class Distribution
- URL: http://arxiv.org/abs/2109.04712v1
- Date: Fri, 10 Sep 2021 07:39:10 GMT
- Title: Balancing Methods for Multi-label Text Classification with Long-Tailed
Class Distribution
- Authors: Yi Huang, Buse Giledereli, Abdullatif Köksal, Arzucan Özgür, Elif Ozkirimli
- Abstract summary: We introduce the application of balancing loss functions for multi-label text classification.
We find that a distribution-balanced loss function, which inherently addresses both the class imbalance and label linkage problems, outperforms commonly used loss functions.
- Score: 2.3064145892791132
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-label text classification is a challenging task because it requires
capturing label dependencies. It becomes even more challenging when class
distribution is long-tailed. Resampling and re-weighting are common approaches
used for addressing the class imbalance problem; however, they are not
effective when there is label dependency besides class imbalance because they
result in oversampling of common labels. Here, we introduce the application of
balancing loss functions for multi-label text classification. We perform
experiments on a general domain dataset with 90 labels (Reuters-21578) and a
domain-specific dataset from PubMed with 18211 labels. We find that a
distribution-balanced loss function, which inherently addresses both the class
imbalance and label linkage problems, outperforms commonly used loss functions.
Distribution balancing methods have been successfully used in the image
recognition field. Here, we show their effectiveness in natural language
processing. Source code is available at
https://github.com/blessu/BalancedLossNLP.
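To make the shift from resampling to a balancing loss concrete, here is a minimal, hypothetical PyTorch-style sketch of a class-frequency re-weighted binary cross-entropy for long-tailed multi-label classification. The class name RebalancedBCELoss, the smoothing constant, and the mean-normalization are illustrative assumptions; this is not the paper's exact distribution-balanced loss, which additionally conditions the weight on each instance's label co-occurrence and adds negative-tolerant regularization. The authors' implementation is in the repository linked above.

import torch
import torch.nn as nn

class RebalancedBCELoss(nn.Module):
    # Hypothetical sketch: inverse-frequency per-class weights applied to
    # binary cross-entropy with logits; not the paper's exact loss.
    def __init__(self, class_counts, smooth=1.0):
        super().__init__()
        counts = torch.as_tensor(class_counts, dtype=torch.float)
        # Rare (tail) classes get larger weights; normalize so the mean weight is 1.
        weights = (counts.sum() + smooth) / (counts + smooth)
        self.register_buffer("class_weights", weights / weights.mean())

    def forward(self, logits, targets):
        # logits, targets: (batch_size, num_classes); targets are multi-hot {0, 1}.
        per_label = nn.functional.binary_cross_entropy_with_logits(
            logits, targets.float(), reduction="none")
        return (per_label * self.class_weights).mean()

Such a weighting gives tail labels more influence than plain binary cross-entropy does; the distribution-balanced loss studied in the paper goes further by computing the weight per instance from its full label set, which is what addresses the label co-occurrence problem mentioned in the abstract.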
Related papers
- Generating Unbiased Pseudo-labels via a Theoretically Guaranteed
Chebyshev Constraint to Unify Semi-supervised Classification and Regression [57.17120203327993]
The threshold-to-pseudo-label process (T2L) in classification uses confidence to determine the quality of a label.
In nature, regression also requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
arXiv Detail & Related papers (2023-11-03T08:39:35Z)
- Towards Imbalanced Large Scale Multi-label Classification with Partially Annotated Labels [8.977819892091]
Multi-label classification is a widely encountered problem in daily life, where an instance can be associated with multiple classes.
In this work, we address the issue of label imbalance and investigate how to train neural networks using partial labels.
arXiv Detail & Related papers (2023-07-31T21:50:48Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this, we propose to pursue the label distribution consistency between predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- TagRec++: Hierarchical Label Aware Attention Network for Question Categorization [0.3683202928838613]
Online learning systems organize content according to a well-defined hierarchical taxonomy.
The task of categorizing inputs into these hierarchical labels is usually cast as a flat multi-class classification problem.
We formulate the task as a dense retrieval problem to retrieve the appropriate hierarchical labels for each content item.
arXiv Detail & Related papers (2022-08-10T05:08:37Z)
- Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning [80.05441565830726]
This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm the model performance.
Motivated by this observation, we propose a general pseudo-labeling framework to address the bias.
We term the novel pseudo-labeling framework for imbalanced SSL as Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
arXiv Detail & Related papers (2021-06-10T11:58:25Z)
- PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes the ratio between positive and negative labels during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z)
- Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces [64.23172847182109]
We show that different negative sampling schemes implicitly trade-off performance on dominant versus rare labels.
We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)
- All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training [32.45488147013166]
Pseudo-labeling is a key component in semi-supervised learning (SSL).
We propose SemCo, a method which leverages label semantics and co-training to address this problem.
We show that our method achieves state-of-the-art performance across various SSL tasks, including a 5.6% accuracy improvement on the Mini-ImageNet dataset with 1000 labeled examples.
arXiv Detail & Related papers (2021-04-12T07:33:16Z)
- Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets [98.74153364118898]
We present a new loss function called Distribution-Balanced Loss for multi-label recognition problems that exhibit long-tailed class distributions.
The Distribution-Balanced Loss tackles these issues through two key modifications to the standard binary cross-entropy loss (see the sketch below).
Experiments on both Pascal VOC and COCO show that the models trained with this new loss function achieve significant performance gains.
arXiv Detail & Related papers (2020-07-19T11:50:10Z)
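For reference, a rough sketch of one of those two modifications, negative-tolerant regularization, as it is usually presented: the per-class binary cross-entropy is rewritten so that a scale factor lambda > 1 damps the contribution of negative labels and a class-specific bias nu_i shifts the decision threshold. The notation below (z_i^k for the logit of class i on instance x^k, y_i^k for the binary label, C for the number of classes) is adapted here and may differ in detail from the published formulation.

\[
\mathcal{L}_{\mathrm{NT}}(x^{k}) = \frac{1}{C}\sum_{i=1}^{C}\Big[\, y_i^{k}\,\log\big(1+e^{-(z_i^{k}-\nu_i)}\big) + \frac{1}{\lambda}\,(1-y_i^{k})\,\log\big(1+e^{\lambda\,(z_i^{k}-\nu_i)}\big) \Big]
\]

The other modification, rebalanced weighting, multiplies each per-label term by a smoothed ratio of the class-level to the instance-level sampling probability, counteracting the implicit oversampling of head labels that co-occur with tail labels.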
This list is automatically generated from the titles and abstracts of the papers on this site.