When does Privileged Information Explain Away Label Noise?
- URL: http://arxiv.org/abs/2303.01806v2
- Date: Thu, 1 Jun 2023 08:00:42 GMT
- Title: When does Privileged Information Explain Away Label Noise?
- Authors: Guillermo Ortiz-Jimenez, Mark Collier, Anant Nawalgaria, Alexander
D'Amour, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou
- Abstract summary: We investigate the role played by different properties of the PI in explaining away label noise.
We find that PI is most helpful when it allows networks to easily distinguish clean from noisy data.
We propose several enhancements to the state-of-the-art PI methods and demonstrate the potential of PI as a means of tackling label noise.
- Score: 66.9725683097357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging privileged information (PI), or features available during training
but not at test time, has recently been shown to be an effective method for
addressing label noise. However, the reasons for its effectiveness are not well
understood. In this study, we investigate the role played by different
properties of the PI in explaining away label noise. Through experiments on
multiple datasets with real PI (CIFAR-N/H) and a new large-scale benchmark
ImageNet-PI, we find that PI is most helpful when it allows networks to easily
distinguish clean from noisy data, while enabling a learning shortcut to
memorize the noisy examples. Interestingly, when PI becomes too predictive of
the target label, PI methods often perform worse than their no-PI baselines.
Based on these findings, we propose several enhancements to the
state-of-the-art PI methods and demonstrate the potential of PI as a means of
tackling label noise. Finally, we show how we can easily combine the resulting
PI approaches with existing no-PI techniques designed to deal with label noise.
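The setting the abstract describes can be illustrated with a toy sketch: privileged information (here, a synthetic annotator-confidence score available only at training time) cleanly separates clean from noisy examples, so a model that uses it to filter training data recovers the clean decision boundary that a plain fit on noisy labels misses. This is a minimal illustration of the setup, not the paper's actual method; the data, the PI feature, and the filtering rule are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary task: the true label is the sign of the first feature.
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 features
y_clean = (X[:, 1] > 0).astype(float)

# Asymmetric label noise: flip 45% of the negative class to positive.
noisy = (y_clean == 0) & (rng.random(n) < 0.45)
y = np.where(noisy, 1.0, y_clean)

# Privileged feature, e.g. annotator confidence, known only at training
# time; by construction it separates clean from noisy examples.
pi = np.where(noisy, rng.uniform(0.0, 0.4, n), rng.uniform(0.6, 1.0, n))

def train_logreg(X, y, lr=0.5, steps=500):
    """Plain gradient-descent logistic regression."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ theta))
        theta -= lr * X.T @ (p - y) / len(y)
    return theta

# Baseline: ignore the PI and fit the noisy labels directly.
theta_plain = train_logreg(X, y)

# PI-assisted: use the privileged feature to flag and drop likely-noisy
# examples (a crude stand-in for the learned PI methods in the paper).
keep = pi >= 0.5
theta_pi = train_logreg(X[keep], y[keep])

# Evaluate both models on a clean held-out set.
X_test = np.column_stack([np.ones(5000), rng.normal(size=(5000, 2))])
y_test = (X_test[:, 1] > 0).astype(float)

def accuracy(theta):
    return float(((X_test @ theta > 0) == (y_test == 1)).mean())

acc_plain, acc_pi = accuracy(theta_plain), accuracy(theta_pi)
```

Because the PI here perfectly distinguishes clean from noisy data, the filtered model recovers the clean boundary while the baseline absorbs the asymmetric noise; the paper's finding is that real PI helps most precisely when it enables this kind of clean/noisy separation.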
Related papers
- Bayesian Prediction-Powered Inference [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a framework for PPI based on Bayesian inference that allows researchers to develop new task-appropriate PPI methods easily.
arXiv Detail & Related papers (2024-05-09T18:08:58Z)
- Pi-DUAL: Using Privileged Information to Distinguish Clean from Noisy Labels [47.85182773875054]
We introduce Pi-DUAL, an architecture designed to harness privileged information (PI) to distinguish clean from wrong labels.
Pi-DUAL achieves significant performance improvements on key PI benchmarks, establishing a new state-of-the-art test set accuracy.
Pi-DUAL is a simple, scalable and practical approach for mitigating the effects of label noise in a variety of real-world scenarios with PI.
arXiv Detail & Related papers (2023-10-10T13:08:50Z)
- FedNoisy: Federated Noisy Label Learning Benchmark [53.73816587601204]
Federated learning has gained popularity for distributed learning without aggregating sensitive data from clients.
The distributed and isolated nature of data in federated learning can be compounded by data-quality issues, making it more vulnerable to noisy labels.
We present the first standardized benchmark that helps researchers fully explore potential federated noisy-label settings.
arXiv Detail & Related papers (2023-06-20T16:18:14Z)
- Towards Effective Visual Representations for Partial-Label Learning [49.91355691337053]
Under partial-label learning (PLL), for each training instance, only a set of ambiguous labels containing the unknown true label is accessible.
Without access to true labels, positive points are predicted using pseudo-labels that are inherently noisy, and negative points often require large batches or momentum encoders.
In this paper, we rethink the state-of-the-art contrastive method PiCO [PiPi24], which demonstrates significant scope for improvement in representation learning.
arXiv Detail & Related papers (2023-05-10T12:01:11Z)
- Noise-robust Graph Learning by Estimating and Leveraging Pairwise Interactions [123.07967420310796]
This paper bridges the gap by proposing a pairwise framework for noisy node classification on graphs.
PI-GNN relies on the PI as a primary learning proxy in addition to the pointwise learning from the noisy node class labels.
Our proposed framework PI-GNN contributes two novel components: (1) a confidence-aware PI estimation model that adaptively estimates the PI labels, and (2) a decoupled training approach that leverages the estimated PI labels.
arXiv Detail & Related papers (2021-06-14T14:23:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.