Confidence HNC: A Network Flow Technique for Binary Classification with Noisy Labels
- URL: http://arxiv.org/abs/2503.02352v1
- Date: Tue, 04 Mar 2025 07:21:40 GMT
- Title: Confidence HNC: A Network Flow Technique for Binary Classification with Noisy Labels
- Authors: Dorit Hochbaum, Torpong Nitayanont
- Abstract summary: We consider a classification method that balances two objectives: large similarity within the samples in the cluster, and large dissimilarity between the cluster and its complement. The method, referred to as HNC or SNC, requires seed nodes, or labeled samples, at least one of which is in the cluster and at least one in the complement. The contribution here is the new method in the presence of noisy labels, based on HNC, called Confidence HNC.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider here a classification method that balances two objectives: large similarity within the samples in the cluster, and large dissimilarity between the cluster and its complement. The method, referred to as HNC or SNC, requires seed nodes, or labeled samples, at least one of which is in the cluster and at least one in the complement. Other than that, the method relies only on the relationship between the samples. The contribution here is the new method in the presence of noisy labels, based on HNC, called Confidence HNC, in which we introduce confidence weights that allow the given labels of labeled samples to be violated, with a penalty that reflects the perceived correctness of each given label. If a label is violated then it is interpreted that the label was noisy. The method involves a representation of the problem as a graph problem with hyperparameters that is solved very efficiently by the network flow technique of parametric cut. We compare the performance of the new method with leading algorithms on both real and synthetic data with noisy labels and demonstrate that it delivers improved performance in terms of classification accuracy as well as noise detection capability.
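The confidence-weight idea in the abstract can be illustrated with a small minimum-cut sketch. This is a simplified illustration under stated assumptions, not the authors' formulation: it omits the trade-off hyperparameter and the parametric cut, and solves a single s-t minimum cut with networkx. The function name `confidence_cut`, the terminal node names `"s"` and `"t"`, and the toy data are all hypothetical. Each labeled sample is tied to its terminal by an arc whose capacity is its confidence weight, so placing the sample on the other side of the cut costs that penalty and marks its label as noisy.

```python
import networkx as nx


def confidence_cut(similarity, labels, confidence):
    """Toy confidence-weighted min cut (illustrative, not the paper's exact model).

    similarity: {(i, j): weight} pairwise similarities between samples
    labels:     {i: 0 or 1} given (possibly noisy) labels of labeled samples
    confidence: {i: penalty} cost of violating the given label of sample i
    """
    g = nx.DiGraph()
    # Similar samples are expensive to separate: one arc in each direction.
    for (i, j), w in similarity.items():
        g.add_edge(i, j, capacity=w)
        g.add_edge(j, i, capacity=w)
    # Labeled samples are attached to a terminal with their confidence weight.
    for i, y in labels.items():
        if y == 1:
            g.add_edge("s", i, capacity=confidence[i])
        else:
            g.add_edge(i, "t", capacity=confidence[i])
    cut_value, (source_side, _sink_side) = nx.minimum_cut(g, "s", "t")
    # Samples on the source side form the cluster (class 1).
    predicted = {v: int(v in source_side) for v in g.nodes if v not in ("s", "t")}
    # A labeled sample whose predicted class differs from its label is flagged.
    noisy = sorted(i for i, y in labels.items() if predicted[i] != y)
    return cut_value, predicted, noisy


# Hypothetical data: {a, b, c} and {d, e} are tight groups joined by a weak link.
similarity = {("a", "b"): 5, ("b", "c"): 5, ("a", "c"): 5, ("d", "e"): 5, ("c", "d"): 1}
labels = {"a": 1, "d": 0, "b": 0}        # the label of b is suspicious
confidence = {"a": 10, "d": 10, "b": 1}  # low confidence in b's label

cut_value, predicted, noisy = confidence_cut(similarity, labels, confidence)
print(noisy)  # ['b']
```

In this toy instance, violating the low-confidence label of `b` costs 1, while keeping it would require cutting `b` away from its highly similar neighbors, so the cut assigns `b` to the cluster and reports its label as noisy.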
Related papers
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually specified probability measure, we can reduce the side effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z) - Instance-dependent Label Distribution Estimation for Learning with Label Noise [20.479674500893303]
Noise transition matrix (NTM) estimation is a promising approach for learning with label noise.
We propose an Instance-dependent Label Distribution Estimation (ILDE) method to learn from noisy labels for image classification.
Our results indicate that the proposed ILDE method outperforms all competing methods, no matter whether the noise is synthetic or real noise.
arXiv Detail & Related papers (2022-12-16T10:13:25Z) - Plug-and-Play Pseudo Label Correction Network for Unsupervised Person
Re-identification [36.3733132520186]
We propose a graph-based pseudo label correction network (GLC) to refine the pseudo labels in the manner of supervised clustering.
GLC learns to rectify the initial noisy labels by means of the relationship constraints between samples on the k Nearest Neighbor graph.
Our method is widely compatible with various clustering-based methods and promotes the state-of-the-art performance consistently.
arXiv Detail & Related papers (2022-06-14T05:59:37Z) - S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
At the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFAR10 and CIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z) - Rethinking Pseudo Labels for Semi-Supervised Object Detection [84.697097472401]
We introduce certainty-aware pseudo labels tailored for object detection.
We dynamically adjust the thresholds used to generate pseudo labels and reweight loss functions for each category to alleviate the class imbalance problem.
Our approach improves supervised baselines by up to 10% AP using only 1-10% labeled data from COCO.
arXiv Detail & Related papers (2021-06-01T01:32:03Z) - Boosting Semi-Supervised Face Recognition with Noise Robustness [54.342992887966616]
This paper presents an effective solution to semi-supervised face recognition that is robust to the label noise arising from auto-labelling.
We develop a semi-supervised face recognition solution, named Noise Robust Learning-Labelling (NRoLL), which is based on the robust training ability empowered by GN.
arXiv Detail & Related papers (2021-05-10T14:43:11Z) - A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z) - Predictive K-means with local models [0.028675177318965035]
Predictive clustering seeks to obtain the best of the two worlds.
We present two new algorithms using this technique and show on a variety of data sets that they are competitive for prediction performance.
arXiv Detail & Related papers (2020-12-16T10:49:36Z) - GANs for learning from very high class conditional noisy labels [1.6516902135723865]
We use Generative Adversarial Networks (GANs) to design a class conditional label noise (CCN) robust scheme for binary classification.
It first generates a set of correctly labelled data points from noisy labelled data and 0.1% or 1% clean labels.
arXiv Detail & Related papers (2020-10-19T15:01:11Z) - Improving Face Recognition by Clustering Unlabeled Faces in the Wild [77.48677160252198]
We propose a novel identity separation method based on extreme value theory.
It greatly reduces the problems caused by overlapping-identity label noise.
Experiments on both controlled and real settings demonstrate our method's consistent improvements.
arXiv Detail & Related papers (2020-07-14T12:26:50Z)