Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification
- URL: http://arxiv.org/abs/2108.13122v1
- Date: Mon, 30 Aug 2021 11:24:51 GMT
- Title: Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification
- Authors: Lorenzo Brigato, Björn Barz, Luca Iocchi, Joachim Denzler
- Abstract summary: We design a benchmark for data-efficient image classification consisting of six diverse datasets spanning various domains.
We re-evaluate the standard cross-entropy baseline and eight methods for data-efficient deep learning published between 2017 and 2021 at renowned venues.
Tuning learning rate, weight decay, and batch size on a separate validation split results in a highly competitive baseline.
- Score: 9.017660524497389
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-efficient image classification using deep neural networks in
settings where only small amounts of labeled data are available has been an active
research area in the recent past. However, an objective comparison between
published methods is difficult, since existing works use different datasets for
evaluation and often compare against untuned baselines with default
hyper-parameters. We design a benchmark for data-efficient image classification
consisting of six diverse datasets spanning various domains (e.g., natural
images, medical imagery, satellite data) and data types (RGB, grayscale,
multispectral). Using this benchmark, we re-evaluate the standard cross-entropy
baseline and eight methods for data-efficient deep learning published between
2017 and 2021 at renowned venues. For a fair and realistic comparison, we
carefully tune the hyper-parameters of all methods on each dataset.
Surprisingly, we find that tuning learning rate, weight decay, and batch size
on a separate validation split results in a highly competitive baseline, which
outperforms all but one specialized method and performs competitively with the
remaining one.
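In practice, this baseline amounts to a plain grid search over the three hyper-parameters, selected by accuracy on a held-out validation split. Below is a minimal sketch of that procedure in PyTorch; the synthetic data, small MLP, and grid values are illustrative placeholders, not the paper's actual setup.

```python
# Sketch: tune learning rate, weight decay, and batch size on a validation
# split, as the paper's competitive cross-entropy baseline does.
import itertools
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)
# Placeholder "small labeled dataset": 500 samples, 32-dim features, 10 classes.
X, y = torch.randn(500, 32), torch.randint(0, 10, (500,))
train_set, val_set = random_split(TensorDataset(X, y), [400, 100])

def train_and_validate(lr, weight_decay, batch_size, epochs=20):
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9,
                          weight_decay=weight_decay)
    for _ in range(epochs):
        for xb, yb in DataLoader(train_set, batch_size=batch_size, shuffle=True):
            opt.zero_grad()
            nn.functional.cross_entropy(model(xb), yb).backward()
            opt.step()
    # Accuracy on the held-out validation split decides which config wins.
    xv, yv = next(iter(DataLoader(val_set, batch_size=len(val_set))))
    with torch.no_grad():
        return (model(xv).argmax(dim=1) == yv).float().mean().item()

# Grid over the three hyper-parameters the paper highlights.
grid = itertools.product([1e-1, 1e-2, 1e-3],       # learning rate
                         [1e-2, 1e-3, 1e-4, 0.0],  # weight decay
                         [16, 32, 64])             # batch size
best = max(grid, key=lambda cfg: train_and_validate(*cfg))
print("best (lr, weight_decay, batch_size):", best)
```

The winning configuration would then typically be retrained on the full training split and evaluated once on the test set.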
Related papers
- Additional Look into GAN-based Augmentation for Deep Learning COVID-19 Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples.
We train StyleGAN2-ADA on both sets and then, after validating the quality of the generated images, use the trained GANs as one of the augmentation approaches in multi-class classification problems.
The GAN-based augmentation approach is found to be comparable with classical augmentation for medium and large datasets but underperforms on smaller ones.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
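As a rough illustration of the augmentation step this entry describes, the sketch below mixes pre-generated GAN samples with real images in a single training set. The directory names are hypothetical, and the StyleGAN2-ADA sampling itself is assumed to have happened offline.

```python
# Sketch: use pre-generated GAN samples as an extra augmentation source.
# "data/real" and "data/gan" are hypothetical class-named image folders.
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)),
                          transforms.ToTensor()])
real = datasets.ImageFolder("data/real", transform=tfm)
fake = datasets.ImageFolder("data/gan", transform=tfm)

# Mix real and generated images in one training set; the real/fake ratio
# is a tunable design choice, not something prescribed by the paper.
train_loader = DataLoader(ConcatDataset([real, fake]),
                          batch_size=32, shuffle=True)
```

- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]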
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
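The sketch below illustrates the general idea of feature-level augmentation with a simplified explicit-sampling variant (class-conditional Gaussian noise added to deep features); it is not the paper's learnable strategy, and all shapes and statistics are placeholders.

```python
# Sketch: perturb deep features along class-conditional directions and
# train the classifier head on the perturbed features.
import torch

def augment_features(feats, labels, class_cov, strength=0.5):
    """feats: (N, D) deep features; class_cov: (C, D) per-class diagonal
    covariance estimates (a diagonal approximation keeps the sketch cheap)."""
    std = (strength * class_cov[labels]).sqrt()   # (N, D) per-sample std
    return feats + torch.randn_like(feats) * std  # class-aware feature noise

# Toy usage with random stand-ins for backbone features.
feats = torch.randn(8, 16)
labels = torch.randint(0, 3, (8,))
class_cov = torch.ones(3, 16) * 0.1   # placeholder covariance estimates
aug = augment_features(feats, labels, class_cov)
```

- Exploring Data Redundancy in Real-world Image Classification through Data Selection [20.389636181891515]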
Deep learning models often require large amounts of data for training, leading to increased costs.
We present two data valuation metrics based on Synaptic Intelligence and gradient norms, respectively, to study redundancy in real-world image data.
Online and offline data selection algorithms are then proposed via clustering and grouping based on the examined data values.
arXiv Detail & Related papers (2023-06-25T03:31:05Z)
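A minimal sketch of the gradient-norm flavor of data valuation, assuming PyTorch: each example is scored by the norm of its individual loss gradient, and the lowest-scoring examples are treated as redundancy candidates. The model, data, and cut-off are illustrative only.

```python
# Sketch: score each example by the norm of its per-example loss gradient;
# consistently low-norm examples are candidates for removal as redundant.
import torch
from torch import nn

model = nn.Linear(32, 10)
X, y = torch.randn(100, 32), torch.randint(0, 10, (100,))

def grad_norm_score(xi, yi):
    loss = nn.functional.cross_entropy(model(xi.unsqueeze(0)),
                                       yi.unsqueeze(0))
    grads = torch.autograd.grad(loss, model.parameters())
    return torch.cat([g.flatten() for g in grads]).norm().item()

scores = torch.tensor([grad_norm_score(xi, yi) for xi, yi in zip(X, y)])
keep = scores.argsort(descending=True)[:80]  # e.g. drop the lowest-value 20%
```

- Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking [66.83273589348758]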
Link prediction attempts to predict whether an unseen edge exists based on only a portion of a graph's edges.
A flurry of methods have been introduced in recent years that attempt to make use of graph neural networks (GNNs) for this task.
New and diverse datasets have also been created to better evaluate the effectiveness of these new models.
arXiv Detail & Related papers (2023-06-18T01:58:59Z)
- Image Classification with Small Datasets: Overview and Benchmark [0.0]
We systematically organize and connect past studies to consolidate a community that is currently fragmented and scattered.
We propose a common benchmark that allows for an objective comparison of approaches.
We use this benchmark to re-evaluate the standard cross-entropy baseline and ten existing methods published between 2017 and 2021 at renowned venues.
arXiv Detail & Related papers (2022-12-23T17:11:16Z)
- Dominant Set-based Active Learning for Text Classification and its Application to Online Social Media [0.0]
We present a novel pool-based active learning method for training on large unlabeled corpora with minimum annotation cost.
Our proposed method does not have any parameters to be tuned, making it dataset-independent.
Our method achieves a higher performance in comparison to the state-of-the-art active learning strategies.
arXiv Detail & Related papers (2022-01-28T19:19:03Z)
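To make the pool-based setting concrete, here is a generic active-learning loop using plain uncertainty sampling as a stand-in selection rule; the paper's dominant-set criterion is not reproduced here, and the data and query budget are synthetic.

```python
# Sketch of a pool-based active-learning loop: train on the labeled set,
# query the most uncertain pool points, repeat.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + 0.3 * rng.normal(size=1000) > 0).astype(int)

# Seed set with both classes present; everything else starts in the pool.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(1000) if i not in labeled]

for _ in range(5):                       # annotation rounds
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])
    # Query the pool points the model is least certain about.
    uncertain = np.argsort(proba.max(axis=1))[:10]
    labeled += [pool[i] for i in uncertain]
    pool = [i for i in pool if i not in labeled]
print("labeled set size:", len(labeled))
```

- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]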
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
To better model the relationships among images and classes from different datasets, we extend the pixel-level embeddings via cross-dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
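A hedged sketch of what a pixel-to-prototype contrastive loss can look like: each pixel embedding is pulled toward its class prototype and pushed away from the rest via a temperature-scaled softmax. Shapes and the temperature are illustrative, not the paper's configuration.

```python
# Sketch of a pixel-to-prototype contrastive loss over pixel embeddings.
import torch
import torch.nn.functional as F

def pixel_prototype_loss(pix_emb, pix_labels, prototypes, tau=0.1):
    """pix_emb: (N, D) pixel embeddings, pix_labels: (N,) class indices,
    prototypes: (C, D) one learnable embedding per class across datasets."""
    pix = F.normalize(pix_emb, dim=1)
    proto = F.normalize(prototypes, dim=1)
    logits = pix @ proto.t() / tau        # (N, C) similarity to each class
    return F.cross_entropy(logits, pix_labels)

# Toy usage: 6 pixels, 8-dim embeddings, 4 classes.
loss = pixel_prototype_loss(torch.randn(6, 8), torch.randint(0, 4, (6,)),
                            torch.randn(4, 8, requires_grad=True))
loss.backward()
```

- How to distribute data across tasks for meta-learning? [59.608652082495624]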
We show that the optimal number of data points per task depends on the budget, but it converges to a unique constant value for large budgets.
Our results suggest a simple and efficient procedure for data collection.
arXiv Detail & Related papers (2021-03-15T15:38:47Z)
- A pipeline for fair comparison of graph neural networks in node classification tasks [4.418753792543564]
Graph neural networks (GNNs) have been investigated for potential applicability in multiple fields that employ graph data.
However, there are no standard training settings to ensure fair comparisons among new methods.
We introduce a standard, reproducible benchmark to which the same training settings can be applied for node classification.
arXiv Detail & Related papers (2020-12-19T07:43:05Z)
- Adversarial Learning for Personalized Tag Recommendation [61.76193196463919]
We propose an end-to-end deep network which can be trained on large-scale datasets.
A joint training of user-preference and visual encoding allows the network to efficiently integrate the visual preference with tagging behavior.
We demonstrate the effectiveness of the proposed model on two different large-scale and publicly available datasets.
arXiv Detail & Related papers (2020-04-01T20:41:41Z)