Noise-Tolerant Hybrid Prototypical Learning with Noisy Web Data
- URL: http://arxiv.org/abs/2501.02476v1
- Date: Sun, 05 Jan 2025 08:21:43 GMT
- Title: Noise-Tolerant Hybrid Prototypical Learning with Noisy Web Data
- Authors: Chao Liang, Linchao Zhu, Zongxin Yang, Wei Chen, Yi Yang,
- Abstract summary: We focus on the challenging problem of learning an unbiased classifier from a large number of potentially relevant but noisily labeled web images.
In the few clean and many noisy scenarios, the class prototype can be severely biased due to the presence of irrelevant noisy images.
Our approach considers the diversity of noisy images by explicit division and overcomes the optimization discrepancy issue.
- Score: 72.32706907910477
- License:
- Abstract: We focus on the challenging problem of learning an unbiased classifier from a large number of potentially relevant but noisily labeled web images given only a few clean labeled images. This problem is particularly practical because it reduces the expensive annotation costs by utilizing freely accessible web images with noisy labels. Typically, prototypes are representative images or features used to classify or identify other images. However, in the few clean and many noisy scenarios, the class prototype can be severely biased due to the presence of irrelevant noisy images. The resulting prototypes are less compact and discriminative, as previous methods do not take into account the diverse range of images in the noisy web image collections. On the other hand, the relation modeling between noisy and clean images is not learned for the class prototype generation in an end-to-end manner, which results in a suboptimal class prototype. In this article, we introduce a similarity maximization loss named SimNoiPro. Our SimNoiPro first generates noise-tolerant hybrid prototypes composed of clean and noise-tolerant prototypes and then pulls them closer to each other. Our approach considers the diversity of noisy images by explicit division and overcomes the optimization discrepancy issue. This enables better relation modeling between clean and noisy images and helps extract judicious information from the noisy image set. The evaluation results on two extended few-shot classification benchmarks confirm that our SimNoiPro outperforms prior methods in measuring image relations and cleaning noisy data.
Related papers
- UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation [64.01742988773745]
An increasing privacy concern exists regarding training large-scale image segmentation models on unauthorized private data.
We exploit the concept of unlearnable examples to make images unusable to model training by generating and adding unlearnable noise into the original images.
We empirically verify the effectiveness of UnSeg across 6 mainstream image segmentation tasks, 10 widely used datasets, and 7 different network architectures.
arXiv Detail & Related papers (2024-10-13T16:34:46Z) - CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes [93.71909293023663]
Cross-modality Aligned Prototypes (CAPro) is a unified contrastive learning framework to learn visual representations with correct semantics.
CAPro achieves new state-of-the-art performance and exhibits robustness to open-set recognition.
arXiv Detail & Related papers (2023-10-15T07:20:22Z) - Multi-view Self-supervised Disentanglement for General Image Denoising [22.28610604896056]
We propose to learn to disentangle the noisy image, under the intuitive assumption that different corrupted versions of the same clean image share a common latent space.
A self-supervised learning framework is proposed to achieve the goal, without looking at the latent clean image.
By taking two different corrupted versions of the same image as input, the proposed Multi-view Self-supervised Disentanglement (MeD) approach learns to disentangle the latent clean features from the corruptions and recover the clean image consequently.
arXiv Detail & Related papers (2023-09-10T14:54:44Z) - Embedding contrastive unsupervised features to cluster in- and
out-of-distribution noise in corrupted image datasets [18.19216557948184]
Using search engines for web image retrieval is a tempting alternative to manual curation when creating an image dataset.
Their main drawback remains the proportion of incorrect (noisy) samples retrieved.
We propose a two stage algorithm starting with a detection step where we use unsupervised contrastive feature learning.
We find that the alignment and uniformity principles of contrastive learning allow OOD samples to be linearly separated from ID samples on the unit hypersphere.
arXiv Detail & Related papers (2022-07-04T16:51:56Z) - Attention-Aware Noisy Label Learning for Image Classification [97.26664962498887]
Deep convolutional neural networks (CNNs) learned on large-scale labeled samples have achieved remarkable progress in computer vision.
The cheapest way to obtain a large body of labeled visual data is to crawl from websites with user-supplied labels, such as Flickr.
This paper proposes the attention-aware noisy label learning approach to improve the discriminative capability of the network trained on datasets with potential label noise.
arXiv Detail & Related papers (2020-09-30T15:45:36Z) - Weakly Supervised Learning with Side Information for Noisy Labeled
Images [41.37584960976829]
We present an efficient weakly supervised learning by using a Side Information Network (SINet)
The proposed SINet consists of a visual prototype module and a noise weighting module.
The propsed SINet can largely alleviate the negative impact of noisy image labels.
arXiv Detail & Related papers (2020-08-25T05:43:48Z) - Data-driven Meta-set Based Fine-Grained Visual Classification [61.083706396575295]
We propose a data-driven meta-set based approach to deal with noisy web images for fine-grained recognition.
Specifically, guided by a small amount of clean meta-set, we train a selection net in a meta-learning manner to distinguish in- and out-of-distribution noisy images.
arXiv Detail & Related papers (2020-08-06T03:04:16Z) - Dual Adversarial Network: Toward Real-world Noise Removal and Noise
Generation [52.75909685172843]
Real-world image noise removal is a long-standing yet very challenging task in computer vision.
We propose a novel unified framework to deal with the noise removal and noise generation tasks.
Our method learns the joint distribution of the clean-noisy image pairs.
arXiv Detail & Related papers (2020-07-12T09:16:06Z) - A Committee of Convolutional Neural Networks for Image Classication in
the Concurrent Presence of Feature and Label Noise [0.0]
This piece of research is the first attempt to address the problem of concurrent occurrence of both types of noise.
We experimentally proved that the difference by which committees beat single models increases along with noise level.
We propose three committee selection algorithms that outperform a strong baseline algorithm.
arXiv Detail & Related papers (2020-04-19T00:22:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.