Deep k-NN for Noisy Labels
- URL: http://arxiv.org/abs/2004.12289v1
- Date: Sun, 26 Apr 2020 05:15:36 GMT
- Title: Deep k-NN for Noisy Labels
- Authors: Dara Bahri, Heinrich Jiang, Maya Gupta
- Abstract summary: We show that a simple $k$-nearest neighbor-based filtering approach on the logit layer of a preliminary model can remove mislabeled data and produce more accurate models than many recently proposed methods.
- Score: 55.97221021252733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern machine learning models are often trained on examples with noisy
labels that hurt performance and are hard to identify. In this paper, we
provide an empirical study showing that a simple $k$-nearest neighbor-based
filtering approach on the logit layer of a preliminary model can remove
mislabeled training data and produce more accurate models than many recently
proposed methods. We also provide new statistical guarantees on its efficacy.
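The filtering idea in the abstract — keep a training example only if its label agrees with enough of its $k$ nearest neighbors in the preliminary model's logit space — can be sketched roughly as follows. The brute-force distance computation and the `agreement` threshold are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def knn_filter(logits, labels, k=10, agreement=0.5):
    """Keep an example only if its label matches at least `agreement`
    of its k nearest neighbors in logit space (illustrative sketch)."""
    n = len(labels)
    # pairwise squared Euclidean distances between logit vectors
    sq = np.sum(logits ** 2, axis=1)
    dists = sq[:, None] + sq[None, :] - 2.0 * logits @ logits.T
    np.fill_diagonal(dists, np.inf)  # exclude self-matches
    keep = np.zeros(n, dtype=bool)
    for i in range(n):
        nbrs = np.argpartition(dists[i], k)[:k]  # k nearest neighbors
        keep[i] = np.mean(labels[nbrs] == labels[i]) >= agreement
    return keep
```

In practice one would train a preliminary model, pass the training set through it to obtain logits, drop the examples flagged by the filter, and retrain on the cleaned data.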
Related papers
- Foster Adaptivity and Balance in Learning with Noisy Labels [26.309508654960354]
We propose a novel approach named SED to deal with label noise in a Self-adaptivE and class-balanceD manner.
A mean-teacher model is then employed to correct labels of noisy samples.
We additionally propose a self-adaptive and class-balanced sample re-weighting mechanism to assign different weights to detected noisy samples.
arXiv Detail & Related papers (2024-07-03T03:10:24Z)
- Federated Learning with Extremely Noisy Clients via Negative Distillation [70.13920804879312]
Federated learning (FL) has shown remarkable success in cooperatively training deep models, while struggling with noisy labels.
We propose a novel approach, called negative distillation (FedNed) to leverage models trained on noisy clients.
FedNed first identifies noisy clients and, rather than discarding them, employs them via knowledge distillation.
arXiv Detail & Related papers (2023-12-20T01:59:48Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
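The observation above suggests a per-example statistic: the first epoch after which a model's prediction for an example stays correct. A rough sketch of computing it from a recorded prediction history (`first_consistent_epoch` is a hypothetical helper, not the paper's implementation):

```python
import numpy as np

def first_consistent_epoch(pred_history, labels):
    """For each example, return the first epoch from which every later
    prediction matches the label; return the epoch count if the example
    is never consistently correct (illustrative sketch)."""
    correct = pred_history == labels[None, :]  # shape (epochs, n)
    epochs, n = correct.shape
    out = np.full(n, epochs)
    for i in range(n):
        # scan backwards to find the earliest all-correct suffix
        e = epochs
        while e > 0 and correct[e - 1, i]:
            e -= 1
        out[i] = e
    return out
```

Under the paper's observation, mislabeled examples would tend to have larger values of this statistic than clean ones.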
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- Learning with Noisy Labels via Self-supervised Adversarial Noisy Masking [33.87292143223425]
We propose a novel training approach termed adversarial noisy masking.
It adaptively modulates the input data and labels simultaneously, preventing the model from overfitting noisy samples.
It is tested on both synthetic and real-world noisy datasets.
arXiv Detail & Related papers (2023-02-14T03:13:26Z)
- Influential Rank: A New Perspective of Post-training for Robust Model against Noisy Labels [23.80449026013167]
We propose a new approach for learning from noisy labels (LNL) via post-training.
We exploit the overfitting property of a trained model to identify mislabeled samples.
Our post-training approach creates great synergies when combined with the existing LNL methods.
arXiv Detail & Related papers (2021-06-14T08:04:18Z)
- Learning from Noisy Labels for Entity-Centric Information Extraction [17.50856935207308]
We propose a simple co-regularization framework for entity-centric information extraction.
The framework's models are jointly optimized with a task-specific loss and regularized to generate similar predictions.
In the end, we can take any of the trained models for inference.
arXiv Detail & Related papers (2021-04-17T22:49:12Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.