Exploiting Web Images for Fine-Grained Visual Recognition by Eliminating
Noisy Samples and Utilizing Hard Ones
- URL: http://arxiv.org/abs/2101.09412v1
- Date: Sat, 23 Jan 2021 03:58:10 GMT
- Title: Exploiting Web Images for Fine-Grained Visual Recognition by Eliminating
Noisy Samples and Utilizing Hard Ones
- Authors: Huafeng Liu, Chuanyi Zhang, Yazhou Yao, Xiushen Wei, Fumin Shen, Jian
Zhang, and Zhenmin Tang
- Abstract summary: We propose a novel approach for removing irrelevant samples from real-world web images during training.
Our approach can alleviate the harmful effects of irrelevant noisy web images and hard examples to achieve better performance.
- Score: 60.07027312916081
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Labeling objects at a subordinate level typically requires expert knowledge,
which is not always available when using random annotators. As such, learning
directly from web images for fine-grained recognition has attracted broad
attention. However, the presence of label noise and hard examples in web images
are two obstacles for training robust fine-grained recognition models.
Therefore, in this paper, we propose a novel approach for removing irrelevant
samples from real-world web images during training, while employing useful hard
examples to update the network. Thus, our approach can alleviate the harmful
effects of irrelevant noisy web images and hard examples to achieve better
performance. Extensive experiments on three commonly used fine-grained datasets
demonstrate that our approach is far superior to current state-of-the-art
web-supervised methods.
Related papers
- CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes [93.71909293023663]
Cross-modality Aligned Prototypes (CAPro) is a unified contrastive learning framework to learn visual representations with correct semantics.
CAPro achieves new state-of-the-art performance and exhibits robustness to open-set recognition.
arXiv Detail & Related papers (2023-10-15T07:20:22Z) - Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images
with Free Attention Masks [64.67735676127208]
Text-to-image diffusion models have shown great potential for benefiting image recognition.
Although promising, there has been inadequate exploration dedicated to unsupervised learning on diffusion-generated images.
We introduce customized solutions by fully exploiting the aforementioned free attention masks.
arXiv Detail & Related papers (2023-08-13T10:07:46Z) - Leveraging Self-Supervision for Cross-Domain Crowd Counting [71.75102529797549]
State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density.
We train our network to recognize upside-down real images from regular ones and incorporate into it the ability to predict its own uncertainty.
This yields an algorithm that consistently outperforms state-of-the-art cross-domain crowd counting ones without any extra computation at inference time.
arXiv Detail & Related papers (2021-03-30T12:37:55Z) - Attention-Aware Noisy Label Learning for Image Classification [97.26664962498887]
Deep convolutional neural networks (CNNs) learned on large-scale labeled samples have achieved remarkable progress in computer vision.
The cheapest way to obtain a large body of labeled visual data is to crawl from websites with user-supplied labels, such as Flickr.
This paper proposes the attention-aware noisy label learning approach to improve the discriminative capability of the network trained on datasets with potential label noise.
arXiv Detail & Related papers (2020-09-30T15:45:36Z) - Data-driven Meta-set Based Fine-Grained Visual Classification [61.083706396575295]
We propose a data-driven meta-set based approach to deal with noisy web images for fine-grained recognition.
Specifically, guided by a small amount of clean meta-set, we train a selection net in a meta-learning manner to distinguish in- and out-of-distribution noisy images.
arXiv Detail & Related papers (2020-08-06T03:04:16Z) - Multiple instance learning on deep features for weakly supervised object
detection with extreme domain shifts [1.9336815376402716]
Weakly supervised object detection (WSOD) using only image-level annotations has attracted a growing attention over the past few years.
We show that a simple multiple instance approach applied on pre-trained deep features yields excellent performances on non-photographic datasets.
arXiv Detail & Related papers (2020-08-03T20:36:01Z) - ExpertNet: Adversarial Learning and Recovery Against Noisy Labels [8.88412854076574]
We propose a novel framework, composed of Amateur and Expert, which iteratively learn from each other.
The trained Amateur and Expert proactively leverage the images and their noisy labels to infer image classes.
Our empirical evaluations on noisy versions of CIFAR-10, CIFAR-100 and real-world data of Clothing1M show that the proposed model can achieve robust classification against a wide range of noise ratios.
arXiv Detail & Related papers (2020-07-10T11:12:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.