Related papers: Re-assessing ImageNet: How aligned is its single-label assumption with its multi-label nature?

URL: http://arxiv.org/abs/2412.18409v1
Date: Tue, 24 Dec 2024 12:55:31 GMT
Title: Re-assessing ImageNet: How aligned is its single-label assumption with its multi-label nature?
Authors: Esla Timothy Anzaku, Seyed Amir Mousavi, Arnout Van Messem, Wesley De Neve
Abstract summary: We analyze the effectiveness of pre-trained state-of-the-art deep neural network (DNN) models on ImageNet and one of its variants, ImageNetV2. Our findings show that these reported declines are largely attributable to a characteristic of the dataset that has not received sufficient attention. Our findings highlight the importance of considering the multi-label nature of the ImageNet dataset during benchmarking.
Score: 1.4828022319975973
License: http://creativecommons.org/licenses/by/4.0/
Abstract: ImageNet, an influential dataset in computer vision, is traditionally evaluated using single-label classification, which assumes that an image can be adequately described by a single concept or label. However, this approach may not fully capture the complex semantics within the images available in ImageNet, potentially hindering the development of models that effectively learn these intricacies. This study critically examines the prevalent single-label benchmarking approach and advocates for a shift to multi-label benchmarking for ImageNet. This shift would enable a more comprehensive assessment of the capabilities of deep neural network (DNN) models. We analyze the effectiveness of pre-trained state-of-the-art DNNs on ImageNet and one of its variants, ImageNetV2. Studies in the literature have reported unexpected accuracy drops of 11% to 14% on ImageNetV2. Our findings show that these reported declines are largely attributable to a characteristic of the dataset that has not received sufficient attention -- the proportion of images with multiple labels. Taking this characteristic into account, the results of our experiments provide evidence that there is no substantial degradation in effectiveness on ImageNetV2. Furthermore, we acknowledge that ImageNet pre-trained models exhibit some capability at capturing the multi-label nature of the dataset even though they were trained under the single-label assumption. Consequently, we propose a new evaluation approach to augment existing approaches that assess this capability. Our findings highlight the importance of considering the multi-label nature of the ImageNet dataset during benchmarking. Failing to do so could lead to incorrect conclusions regarding the effectiveness of DNNs and divert research efforts from addressing other substantial challenges related to the reliability and robustness of these models.
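
The difference between single-label and multi-label benchmarking described in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation protocol: the function names, the toy predictions, and the toy label sets below are all hypothetical, chosen only to show how the same top-1 predictions can score differently once every valid label for an image is accepted.

```python
def top1_single_label(predictions, labels):
    """Fraction of images whose top-1 prediction equals the one assigned label."""
    hits = sum(1 for pred, label in zip(predictions, labels) if pred == label)
    return hits / len(predictions)


def top1_multi_label(predictions, label_sets):
    """Fraction of images whose top-1 prediction is any of the valid labels."""
    hits = sum(1 for pred, valid in zip(predictions, label_sets) if pred in valid)
    return hits / len(predictions)


# Hypothetical toy data: four images, one top-1 prediction per image.
predictions = ["desk", "laptop", "dog", "cat"]

# Single-label ground truth: exactly one label per image.
single_labels = ["desk", "keyboard", "dog", "dog"]

# Multi-label ground truth: every object actually present in the image.
multi_labels = [{"desk", "monitor"}, {"keyboard", "laptop"}, {"dog"}, {"dog"}]

single_acc = top1_single_label(predictions, single_labels)
multi_acc = top1_multi_label(predictions, multi_labels)

print(f"single-label top-1 accuracy: {single_acc:.2f}")
print(f"multi-label top-1 accuracy:  {multi_acc:.2f}")
```

In this toy case the second image contains both a keyboard and a laptop, so predicting "laptop" counts as an error under the single-label assumption but as correct under multi-label evaluation. A dataset with a larger share of such images would show a larger apparent accuracy drop under single-label scoring.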

Related papers

ImagiNet: A Multi-Content Benchmark for Synthetic Image Detection [0.0]
We introduce ImagiNet, a dataset of 200K examples spanning four categories: photos, paintings, faces, and miscellaneous. Synthetic images in ImagiNet are produced with both open-source and proprietary generators, whereas real counterparts for each content type are collected from public datasets.
arXiv Detail & Related papers (2024-07-29T13:57:24Z)
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object [78.58860252442045]
We introduce generative models as a data source for hard images that benchmark deep models' robustness. We are able to generate images with more diversified backgrounds, textures, and materials than any prior work, and we term this benchmark ImageNet-D. Our work suggests that diffusion models can be an effective source to test vision models.
arXiv Detail & Related papers (2024-03-27T17:23:39Z)
Robustifying Deep Vision Models Through Shape Sensitization [19.118696557797957]
We propose a simple, lightweight adversarial augmentation technique that explicitly incentivizes the network to learn holistic shapes. Our augmentations superpose edgemaps from one image onto another image with shuffled patches, using a randomly determined mixing proportion. We show that our augmentations significantly improve classification accuracy and robustness measures on a range of datasets and neural architectures.
arXiv Detail & Related papers (2022-11-14T11:17:46Z)
Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem. We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models. Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z)
Uncertainty-Aware Semi-Supervised Few Shot Segmentation [9.098329723771116]
Few shot segmentation (FSS) aims to learn pixel-level classification of a target object in a query image using only a few annotated support samples. This is challenging as it requires modeling appearance variations of target objects and the diverse visual cues between query and support images with limited information. We propose a semi-supervised FSS strategy that leverages additional prototypes from unlabeled images with uncertainty guided pseudo label refinement.
arXiv Detail & Related papers (2021-10-18T00:37:46Z)
To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations. We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions. We prove that even a much smaller dataset with well-matched annotations can facilitate models to achieve better performance as well as generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z)
If your data distribution shifts, use self-learning [24.23584770840611]
Self-learning techniques like entropy minimization and pseudo-labeling are simple and effective at improving performance of a deployed computer vision model under systematic domain shifts. We conduct a wide range of large-scale experiments and show consistent improvements irrespective of the model architecture.
arXiv Detail & Related papers (2021-04-27T01:02:15Z)
Automated Cleanup of the ImageNet Dataset by Model Consensus, Explainability and Confident Learning [0.0]
ImageNet was the backbone of various convolutional neural networks (CNNs) trained on ILSVRC12Net. This paper describes automated applications based on model consensus, explainability and confident learning to correct labeling mistakes. The ImageNet-Clean improves the model performance by 2-2.4% for SqueezeNet and EfficientNet-B0 models.
arXiv Detail & Related papers (2021-03-30T13:16:35Z)
W2WNet: a two-module probabilistic Convolutional Neural Network with embedded data cleansing functionality [2.695466667982714]
Wise2WipedNet (W2WNet) is a new two-module Convolutional Neural Network. A Wise module exploits Bayesian inference to identify and discard spurious images during training. A Wiped module takes care of the final classification while broadcasting information on the prediction confidence at inference time.
arXiv Detail & Related papers (2021-03-24T11:28:59Z)
Are we done with ImageNet? [86.01120671361844]
We develop a more robust procedure for collecting human annotations of the ImageNet validation set. We reassess the accuracy of recently proposed ImageNet classifiers, and find their gains to be substantially smaller than those reported on the original labels. We also find that the original ImageNet labels are no longer the best predictors of this independently-collected set, indicating that their usefulness in evaluating vision models may be nearing an end.
arXiv Detail & Related papers (2020-06-12T13:17:25Z)
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks [99.19183528305598]
We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset. Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for.
arXiv Detail & Related papers (2020-05-22T17:39:16Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.