Beyond Images: Label Noise Transition Matrix Estimation for Tasks with
Lower-Quality Features
- URL: http://arxiv.org/abs/2202.01273v1
- Date: Wed, 2 Feb 2022 20:36:09 GMT
- Title: Beyond Images: Label Noise Transition Matrix Estimation for Tasks with
Lower-Quality Features
- Authors: Zhaowei Zhu, Jialu Wang, Yang Liu
- Abstract summary: We propose a practical information-theoretic approach to down-weight the less informative parts of the lower-quality features.
We prove that the celebrated $f$-mutual information measure can often preserve the order when calculated using noisy labels.
- Score: 13.659465403114766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The label noise transition matrix, denoting the transition probabilities from
clean labels to noisy labels, is crucial knowledge for designing statistically
robust solutions. Existing estimators for noise transition matrices, e.g.,
using either anchor points or clusterability, focus on computer vision tasks,
where high-quality representations are relatively easy to obtain. However, for
other tasks with lower-quality features, uninformative variables may obscure
the useful ones and make the anchor-point or clusterability conditions hard to
satisfy. We empirically observe the failures of these approaches on a number
of commonly used datasets. In this paper, to handle this
issue, we propose a generally practical information-theoretic approach to
down-weight the less informative parts of the lower-quality features. The
salient technical challenge is to compute the relevant information-theoretical
metrics using only noisy labels instead of clean ones. We prove that the
celebrated $f$-mutual information measure can often preserve the order when
calculated using noisy labels. The necessity and effectiveness of the proposed
method are also demonstrated by evaluating the estimation error on a varied set
of tabular data and text classification tasks with lower-quality features. Code
is available at github.com/UCSC-REAL/Est-T-MI.
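As a concrete illustration of the two quantities the abstract refers to, the sketch below computes an empirical transition matrix T (with T[i][j] = P(noisy label = j | clean label = i)) from paired clean/noisy labels, and a plain discrete mutual-information score of the kind that could be used to down-weight uninformative features. This is a simplified stand-in, not the paper's method: the paper uses the $f$-mutual information estimated from noisy labels alone, without access to clean labels, and the helper names `transition_matrix` and `mutual_information` are hypothetical.

```python
import math
from collections import Counter

def transition_matrix(clean, noisy, num_classes):
    """Empirical T[i][j] = P(noisy = j | clean = i).

    Illustrative only: in the paper's setting, clean labels are NOT
    observed, and T must be estimated from noisy labels alone.
    """
    counts = [[0] * num_classes for _ in range(num_classes)]
    for y, y_tilde in zip(clean, noisy):
        counts[y][y_tilde] += 1
    T = []
    for row in counts:
        total = sum(row)
        T.append([c / total if total else 0.0 for c in row])
    return T

def mutual_information(xs, ys):
    """Discrete mutual information I(X; Y) in nats, from paired samples.

    A feature with near-zero MI against the (noisy) labels carries little
    information and is a candidate for down-weighting.
    """
    n = len(xs)
    pxy = Counter(zip(xs, ys))   # joint counts
    px = Counter(xs)             # marginal counts of X
    py = Counter(ys)             # marginal counts of Y
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), all counts over n
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi
```

For example, a feature perfectly aligned with binary labels scores I = log 2 ≈ 0.69 nats, while an independent feature scores 0, which is the ordering such a weighting scheme relies on.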
Related papers
- Multi-Label Noise Transition Matrix Estimation with Label Correlations:
Theory and Algorithm [73.94839250910977]
Noisy multi-label learning has garnered increasing attention due to the challenges posed by collecting large-scale accurate labels.
The introduction of transition matrices can help model multi-label noise and enable the development of statistically consistent algorithms.
We propose a novel estimator that leverages label correlations without the need for anchor points or precise fitting of noisy class posteriors.
arXiv Detail & Related papers (2023-09-22T08:35:38Z) - Rethinking the Value of Labels for Instance-Dependent Label Noise
Learning [43.481591776038144]
Noisy labels in real-world applications often depend on both the true label and the features.
In this work, we tackle instance-dependent label noise with a novel deep generative model that avoids explicitly modeling the noise transition matrix.
Our algorithm leverages causal representation learning and simultaneously identifies the high-level content and style latent factors from the data.
arXiv Detail & Related papers (2023-05-10T15:29:07Z) - Improved Adaptive Algorithm for Scalable Active Learning with Weak
Labeler [89.27610526884496]
Weak Labeler Active Cover (WL-AC) robustly leverages lower-quality weak labelers to reduce the query complexity while retaining the desired level of accuracy.
We show its effectiveness on the corrupted-MNIST dataset by significantly reducing the number of labels while keeping the same accuracy as in passive learning.
arXiv Detail & Related papers (2022-11-04T02:52:54Z) - Robust Meta-learning with Sampling Noise and Label Noise via
Eigen-Reptile [78.1212767880785]
The meta-learner is prone to overfitting since there are only a few available samples.
When handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise.
We present Eigen-Reptile (ER), which updates the meta-parameters with the main direction of historical task-specific parameters.
arXiv Detail & Related papers (2022-06-04T08:48:02Z) - Semi-Supervised Cascaded Clustering for Classification of Noisy Label
Data [0.3441021278275805]
The performance of supervised classification techniques often deteriorates when the data has noisy labels.
Most approaches addressing noisy label data rely on deep neural networks (DNNs), which require huge datasets for classification tasks.
We propose a semi-supervised cascaded clustering algorithm to extract patterns and generate a cascaded tree of classes in such datasets.
arXiv Detail & Related papers (2022-05-04T17:42:22Z) - Clusterability as an Alternative to Anchor Points When Learning with
Noisy Labels [7.920797564912219]
We propose an efficient estimation procedure based on a clusterability condition.
Compared with methods using anchor points, our approach uses substantially more instances and benefits from a much better sample complexity.
arXiv Detail & Related papers (2021-02-10T07:22:56Z) - Tackling Instance-Dependent Label Noise via a Universal Probabilistic
Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z) - Extended T: Learning with Mixed Closed-set and Open-set Noisy Labels [86.5943044285146]
The label noise transition matrix $T$ reflects the probabilities that true labels flip into noisy ones.
In this paper, we focus on learning under the mixed closed-set and open-set label noise.
Our method can better model the mixed label noise, as evidenced by its more robust performance compared with prior state-of-the-art label-noise learning methods.
arXiv Detail & Related papers (2020-12-02T02:42:45Z) - Meta Transition Adaptation for Robust Deep Learning with Noisy Labels [61.8970957519509]
This study proposes a new meta-transition-learning strategy for the task.
Specifically, through the sound guidance of a small set of meta-data with clean labels, the noise transition matrix and the classifier parameters can be mutually ameliorated.
Our method can more accurately extract the transition matrix, as reflected in its more robust performance compared with prior art.
arXiv Detail & Related papers (2020-06-10T07:27:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.