Label Confusion Learning to Enhance Text Classification Models
- URL: http://arxiv.org/abs/2012.04987v1
- Date: Wed, 9 Dec 2020 11:34:35 GMT
- Title: Label Confusion Learning to Enhance Text Classification Models
- Authors: Biyang Guo, Songqiao Han, Xiao Han, Hailiang Huang, Ting Lu
- Abstract summary: Label Confusion Model (LCM) learns label confusion to capture semantic overlap among labels.
LCM can generate a better label distribution to replace the original one-hot label vector.
experiments on five text classification benchmark datasets reveal the effectiveness of LCM for several widely used deep learning classification models.
- Score: 3.0251266104313643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Representing a true label as a one-hot vector is a common practice in
training text classification models. However, the one-hot representation may
not adequately reflect the relation between the instances and labels, as labels
are often not completely independent and instances may relate to multiple
labels in practice. The inadequate one-hot representations tend to train the
model to be over-confident, which may result in arbitrary prediction and model
overfitting, especially for confused datasets (datasets with very similar
labels) or noisy datasets (datasets with labeling errors). While training
models with label smoothing (LS) can ease this problem in some degree, it still
fails to capture the realistic relation among labels. In this paper, we propose
a novel Label Confusion Model (LCM) as an enhancement component to current
popular text classification models. LCM can learn label confusion to capture
semantic overlap among labels by calculating the similarity between instances
and labels during training and generate a better label distribution to replace
the original one-hot label vector, thus improving the final classification
performance. Extensive experiments on five text classification benchmark
datasets reveal the effectiveness of LCM for several widely used deep learning
classification models. Further experiments also verify that LCM is especially
helpful for confused or noisy datasets and superior to the label smoothing
method.
Related papers
- Learning with Confidence: Training Better Classifiers from Soft Labels [0.0]
In supervised machine learning, models are typically trained using data with hard labels, i.e., definite assignments of class membership.
We investigate whether incorporating label uncertainty, represented as discrete probability distributions over the class labels, improves the predictive performance of classification models.
arXiv Detail & Related papers (2024-09-24T13:12:29Z) - Label-Retrieval-Augmented Diffusion Models for Learning from Noisy
Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z) - Deep Partial Multi-Label Learning with Graph Disambiguation [27.908565535292723]
We propose a novel deep Partial multi-Label model with grAph-disambIguatioN (PLAIN)
Specifically, we introduce the instance-level and label-level similarities to recover label confidences.
At each training epoch, labels are propagated on the instance and label graphs to produce relatively accurate pseudo-labels.
arXiv Detail & Related papers (2023-05-10T04:02:08Z) - Bridging the Gap between Model Explanations in Partially Annotated
Multi-label Classification [85.76130799062379]
We study how false negative labels affect the model's explanation.
We propose to boost the attribution scores of the model trained with partial labels to make its explanation resemble that of the model trained with full labels.
arXiv Detail & Related papers (2023-04-04T14:00:59Z) - Label distribution learning via label correlation grid [9.340734188957727]
We propose a textbfLabel textbfCorrelation textbfGrid (LCG) to model the uncertainty of label relationships.
Our network learns the LCG to accurately estimate the label distribution for each instance.
arXiv Detail & Related papers (2022-10-15T03:58:15Z) - Instance-Dependent Partial Label Learning [69.49681837908511]
Partial label learning is a typical weakly supervised learning problem.
Most existing approaches assume that the incorrect labels in each training example are randomly picked as the candidate labels.
In this paper, we consider instance-dependent and assume that each example is associated with a latent label distribution constituted by the real number of each label.
arXiv Detail & Related papers (2021-10-25T12:50:26Z) - Harmless label noise and informative soft-labels in supervised
classification [1.6752182911522517]
Manual labelling of training examples is common practice in supervised learning.
When the labelling task is of non-trivial difficulty, the supplied labels may not be equal to the ground-truth labels, and label noise is introduced into the training dataset.
In particular, when classification difficulty is the only source of label errors, multiple sets of noisy labels can supply more information for the estimation of a classification rule.
arXiv Detail & Related papers (2021-04-07T02:56:11Z) - A Study on the Autoregressive and non-Autoregressive Multi-label
Learning [77.11075863067131]
We propose a self-attention based variational encoder-model to extract the label-label and label-feature dependencies jointly.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
arXiv Detail & Related papers (2020-12-03T05:41:44Z) - PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z) - Unsupervised Person Re-identification via Multi-label Classification [55.65870468861157]
This paper formulates unsupervised person ReID as a multi-label classification task to progressively seek true labels.
Our method starts by assigning each person image with a single-class label, then evolves to multi-label classification by leveraging the updated ReID model for label prediction.
To boost the ReID model training efficiency in multi-label classification, we propose the memory-based multi-label classification loss (MMCL)
arXiv Detail & Related papers (2020-04-20T12:13:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.