Are we done with ImageNet?
- URL: http://arxiv.org/abs/2006.07159v1
- Date: Fri, 12 Jun 2020 13:17:25 GMT
- Title: Are we done with ImageNet?
- Authors: Lucas Beyer and Olivier J. Hénaff and Alexander Kolesnikov and
  Xiaohua Zhai and Aäron van den Oord
- Abstract summary: We develop a more robust procedure for collecting human annotations of the ImageNet validation set.
We reassess the accuracy of recently proposed ImageNet classifiers, and find their gains to be substantially smaller than those reported on the original labels.
The original ImageNet labels are no longer the best predictors of this independently-collected set, indicating that their usefulness in evaluating vision models may be nearing an end.
- Score: 86.01120671361844
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Yes, and no. We ask whether recent progress on the ImageNet classification
benchmark continues to represent meaningful generalization, or whether the
community has started to overfit to the idiosyncrasies of its labeling
procedure. We therefore develop a significantly more robust procedure for
collecting human annotations of the ImageNet validation set. Using these new
labels, we reassess the accuracy of recently proposed ImageNet classifiers, and
find their gains to be substantially smaller than those reported on the
original labels. Furthermore, we find the original ImageNet labels to no longer
be the best predictors of this independently-collected set, indicating that
their usefulness in evaluating vision models may be nearing an end.
Nevertheless, we find our annotation procedure to have largely remedied the
errors in the original labels, reinforcing ImageNet as a powerful benchmark for
future research in visual recognition.
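The evaluation the abstract describes amounts to scoring a classifier against a set of acceptable labels per image rather than a single label. Below is a minimal sketch of that kind of multi-label scoring, assuming a simple dict-based layout; the function name and data format are illustrative, not the paper's released code or label files.
```python
# Minimal sketch: a prediction counts as correct if it falls anywhere in
# the set of labels judged valid for that image. Data layout is assumed.
def real_style_accuracy(predictions, label_sets):
    """predictions: {image_id: predicted class index}
    label_sets:  {image_id: set of acceptable class indices}
    Images whose label set is empty are skipped, since re-annotation can
    leave some validation images with no reliable label."""
    correct = total = 0
    for image_id, pred in predictions.items():
        valid = label_sets.get(image_id, set())
        if not valid:
            continue  # no reliable label for this image
        total += 1
        correct += pred in valid
    return correct / total

# Hypothetical usage:
# preds  = {"ILSVRC2012_val_00000001": 65}
# labels = {"ILSVRC2012_val_00000001": {65, 62}}
# real_style_accuracy(preds, labels)  # -> 1.0
```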
Related papers
- Exploring Structured Semantic Prior for Multi Label Recognition with
Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in a vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Spurious Features Everywhere -- Large-Scale Detection of Harmful
Spurious Features in ImageNet [36.48282338829549]
In this paper, we develop a framework that allows us to systematically identify spurious features in large datasets like ImageNet.
We validate our results by showing that the presence of a class's harmful spurious feature alone is sufficient to trigger the prediction of that class.
We introduce SpuFix as a simple mitigation method to reduce the dependence of any ImageNet classifier on previously identified harmful spurious features.
arXiv Detail & Related papers (2022-12-09T14:23:25Z)
- Reference-guided Pseudo-Label Generation for Medical Semantic
Segmentation [25.76014072179711]
We propose a novel approach to generate supervision for semi-supervised semantic segmentation.
We use a small number of labeled images as reference material and match pixels in an unlabeled image to the semantics of the best-fitting pixel in the reference set (a pixel-matching sketch follows this list).
We achieve the same performance as a standard fully supervised model on X-ray anatomy segmentation while using 95% fewer labeled images.
arXiv Detail & Related papers (2021-12-01T12:21:24Z)
- Semantic-Aware Generation for Self-Supervised Visual Representation
Learning [116.5814634936371]
We advocate for Semantic-aware Generation (SaGe), which encourages richer semantics, rather than low-level details, to be preserved in the generated image.
SaGe complements the target network with view-specific features and thus alleviates the semantic degradation brought by intensive data augmentations.
We execute SaGe on ImageNet-1K and evaluate the pre-trained models on five downstream tasks including nearest neighbor test, linear classification, and fine-scaled image recognition.
arXiv Detail & Related papers (2021-11-25T16:46:13Z)
- Few-Shot Learning with Part Discovery and Augmentation from Unlabeled
Images [79.34600869202373]
We show that inductive bias can be learned from a flat collection of unlabeled images, and instantiated as transferable representations among seen and unseen classes.
Specifically, we propose a novel part-based self-supervised representation learning scheme to learn transferable representations.
Our method yields impressive results, outperforming the previous best unsupervised methods by 7.74% and 9.24%.
arXiv Detail & Related papers (2021-05-25T12:22:11Z)
- Re-labeling ImageNet: from Single to Multi-Labels, from Global to
Localized Labels [34.13899937264952]
ImageNet has been arguably the most popular image classification benchmark, but it is also the one with a significant level of label noise.
Recent studies have shown that many samples contain multiple classes, despite being assumed to be a single-label benchmark.
We argue that the mismatch between single-label annotations and effectively multi-label images is equally, if not more, problematic in the training setup, where random crops are applied.
arXiv Detail & Related papers (2021-01-13T11:55:58Z)
- Enhancing Few-Shot Image Classification with Unlabelled Examples [18.03136114355549]
We develop a transductive meta-learning method that uses unlabelled instances to improve few-shot image classification performance.
Our approach uses a regularized neural adaptive feature extractor to achieve improved test-time classification accuracy using unlabelled data.
arXiv Detail & Related papers (2020-06-17T05:42:47Z)
- Object Segmentation Without Labels with Large-Scale Generative Models [43.679717400251924]
The recent rise of unsupervised and self-supervised learning has dramatically reduced the dependency on labeled data.
Large-scale unsupervised models can also perform a more challenging object segmentation task, requiring neither pixel-level nor image-level labeling.
We show that recent unsupervised GANs make it possible to differentiate between foreground and background pixels, providing high-quality saliency masks.
arXiv Detail & Related papers (2020-06-08T23:30:43Z)
- StarNet: towards Weakly Supervised Few-Shot Object Detection [87.80771067891418]
We introduce StarNet - a few-shot model featuring an end-to-end differentiable non-parametric star-model detection and classification head.
Through this head, the backbone is meta-trained using only image-level labels to produce good features for jointly localizing and classifying previously unseen categories of few-shot test tasks.
Being a few-shot detector, StarNet requires no bounding box annotations, either during pre-training or when adapting to novel classes.
arXiv Detail & Related papers (2020-03-15T11:35:28Z)
- I Am Going MAD: Maximum Discrepancy Competition for Comparing
Classifiers Adaptively [135.7695909882746]
We introduce the MAximum Discrepancy (MAD) competition.
We adaptively sample a small test set from an arbitrarily large corpus of unlabeled images.
Human labeling on the resulting model-dependent image sets reveals the relative performance of the competing classifiers (a disagreement-sampling sketch follows this list).
arXiv Detail & Related papers (2020-02-25T03:32:29Z)
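As referenced in the Reference-guided Pseudo-Label Generation entry above, the core operation is a nearest-neighbour match between per-pixel features of an unlabeled image and those of a small labeled reference set. This is a minimal sketch of that matching step, not the authors' code; the feature extractor is left abstract, and all names and shapes are assumptions.
```python
# Sketch of reference-guided pseudo-labeling: each pixel of an unlabeled
# image inherits the label of its nearest reference pixel in feature space.
import numpy as np

def pseudo_labels(unlabeled_feats, ref_feats, ref_labels):
    """unlabeled_feats: (P, D) per-pixel features of the unlabeled image.
    ref_feats: (N, D) per-pixel features pooled from the reference images.
    ref_labels: (N,) class index of each reference pixel.
    Returns (P,) pseudo-labels via cosine-similarity nearest neighbour."""
    u = unlabeled_feats / np.linalg.norm(unlabeled_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    sim = u @ r.T                 # (P, N) cosine similarities
    nearest = sim.argmax(axis=1)  # best-fitting reference pixel per pixel
    return ref_labels[nearest]
```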
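And as referenced in the MAD entry, the competition boils down to spending the human-labeling budget on the images where competing classifiers disagree most. The sketch below uses a simple proxy for discrepancy (L1 distance between predicted distributions, restricted to top-1 disagreements); the paper's exact criterion may differ, and all names are illustrative.
```python
# Sketch of MAD-style test-set selection: keep the k unlabeled images on
# which two classifiers disagree most, and send only those to annotators.
import numpy as np

def mad_sample(probs_a, probs_b, k):
    """probs_a, probs_b: (N, C) softmax outputs of two classifiers on the
    same N unlabeled images. Returns indices of up to k images where the
    top-1 predictions differ, ranked by L1 distance between distributions."""
    disagree = probs_a.argmax(axis=1) != probs_b.argmax(axis=1)
    gap = np.abs(probs_a - probs_b).sum(axis=1)
    gap[~disagree] = -np.inf      # consider only top-1 disagreements
    ranked = np.argsort(gap)[::-1][:k]
    return ranked[np.isfinite(gap[ranked])]  # drop agreeing images if < k
```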