Image Classification with Small Datasets: Overview and Benchmark
- URL: http://arxiv.org/abs/2212.12478v1
- Date: Fri, 23 Dec 2022 17:11:16 GMT
- Title: Image Classification with Small Datasets: Overview and Benchmark
- Authors: L. Brigato, B. Barz, L. Iocchi, and J. Denzler
- Abstract summary: We systematically organize and connect past studies to consolidate a community that is currently fragmented and scattered.
We propose a common benchmark that allows for an objective comparison of approaches.
We use this benchmark to re-evaluate the standard cross-entropy baseline and ten existing methods published between 2017 and 2021 at renowned venues.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image classification with small datasets has been an active research area in
the recent past. However, as research in this scope is still in its infancy,
two key ingredients are missing for ensuring reliable and truthful progress: a
systematic and extensive overview of the state of the art, and a common
benchmark to allow for objective comparisons between published methods. This
article addresses both issues. First, we systematically organize and connect
past studies to consolidate a community that is currently fragmented and
scattered. Second, we propose a common benchmark that allows for an objective
comparison of approaches. It consists of five datasets spanning various domains
(e.g., natural images, medical imagery, satellite data) and data types (RGB,
grayscale, multispectral). We use this benchmark to re-evaluate the standard
cross-entropy baseline and ten existing methods published between 2017 and 2021
at renowned venues. Surprisingly, we find that thorough hyper-parameter tuning
on held-out validation data results in a highly competitive baseline and
highlights a stunted growth of performance over the years. Indeed, only a
single specialized method dating back to 2019 clearly wins our benchmark and
outperforms the baseline classifier.
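The abstract's central finding — that a plain cross-entropy classifier becomes highly competitive once its hyper-parameters are tuned on held-out validation data — can be illustrated with a minimal sketch. The toy data, the tiny linear softmax model, and the search grids below are illustrative assumptions, not the authors' actual benchmark setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "small dataset": two Gaussian classes in 5 dimensions.
n, d, k = 200, 5, 2
X = np.vstack([rng.normal(-1, 1, (n // 2, d)), rng.normal(1, 1, (n // 2, d))])
y = np.array([0] * (n // 2) + [1] * (n // 2))
perm = rng.permutation(n)
X, y = X[perm], y[perm]

# Held-out validation split: the key ingredient the paper highlights.
split = int(0.8 * n)
X_tr, y_tr, X_val, y_val = X[:split], y[:split], X[split:], y[split:]

def train_softmax(X, y, lr, weight_decay, epochs=50):
    """Gradient descent on L2-regularized softmax cross-entropy."""
    W = np.zeros((X.shape[1], k))
    onehot = np.eye(k)[y]
    for _ in range(epochs):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        grad = X.T @ (p - onehot) / len(X) + weight_decay * W
        W -= lr * grad
    return W

def accuracy(W, X, y):
    return float((np.argmax(X @ W, axis=1) == y).mean())

# Grid search: select the configuration by *validation* accuracy only.
best = None
for lr in (0.01, 0.1, 1.0):
    for wd in (0.0, 1e-4, 1e-2):
        W = train_softmax(X_tr, y_tr, lr, wd)
        acc = accuracy(W, X_val, y_val)
        if best is None or acc > best[0]:
            best = (acc, lr, wd)

val_acc, best_lr, best_wd = best
print(f"best val acc={val_acc:.3f} at lr={best_lr}, weight_decay={best_wd}")
```

The point of the sketch is procedural: the model never sees the validation split during training, and the reported configuration is chosen purely by validation performance, which is the tuning protocol the paper credits for the strength of the baseline.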
Related papers
- Additional Look into GAN-based Augmentation for Deep Learning COVID-19
Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples.
We train StyleGAN2-ADA with both sets and then, after validating the quality of the generated images, use the trained GANs as one of the augmentation approaches in multi-class classification problems.
The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
- Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking [66.83273589348758]
Link prediction attempts to predict whether an unseen edge exists based on only a portion of edges of a graph.
A flurry of methods have been introduced in recent years that attempt to make use of graph neural networks (GNNs) for this task.
New and diverse datasets have also been created to better evaluate the effectiveness of these new models.
arXiv Detail & Related papers (2023-06-18T01:58:59Z)
- Rethinking Benchmarks for Cross-modal Image-text Retrieval [44.31783230767321]
Cross-modal semantic understanding and matching is a major challenge in image-text retrieval.
In this paper, we review the two common benchmarks and observe that they are insufficient to assess the true capability of models on fine-grained cross-modal semantic matching.
We propose a novel semi-automatic renovation approach to refine coarse-grained sentences into finer-grained ones with little human effort.
The results show that even the state-of-the-art models have much room for improvement in fine-grained semantic understanding.
arXiv Detail & Related papers (2023-04-21T09:07:57Z)
- Exploring Weakly Supervised Semantic Segmentation Ensembles for Medical Imaging Systems [11.693197342734152]
We propose a framework for reliable classification and detection of medical conditions in images.
Our framework achieves this by first utilizing lower-threshold CAMs to cover the target object with high certainty.
We demonstrate an improved Dice score of up to 8% on the BRATS and 6% on the DECATHLON datasets.
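The low-threshold CAM idea above can be sketched in a few lines: binarizing a class activation map at a *low* threshold yields a mask that covers the target region with high certainty (favoring recall over precision). The toy activation values and thresholds are illustrative assumptions, not the paper's actual maps or settings:

```python
import numpy as np

def cam_mask(cam: np.ndarray, threshold: float) -> np.ndarray:
    """Normalize a CAM to [0, 1] and keep pixels at or above `threshold`."""
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam >= threshold

# Toy 4x4 activation map with a hot region in the top-left corner.
cam = np.array([[0.9, 0.8, 0.2, 0.1],
                [0.7, 0.6, 0.1, 0.0],
                [0.2, 0.1, 0.0, 0.0],
                [0.1, 0.0, 0.0, 0.0]])

loose = cam_mask(cam, 0.2)   # low threshold: broad, high-recall cover
tight = cam_mask(cam, 0.8)   # high threshold: small, high-precision core
print(int(loose.sum()), int(tight.sum()))
```

The lower threshold always covers at least as many pixels as the higher one, which is exactly the "cover the target with high certainty" behavior the summary describes.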
arXiv Detail & Related papers (2023-03-14T13:31:05Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric "dR@n,IoU@m" that discounts the basic recall scores to alleviate the inflating evaluation caused by biased datasets.
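A discounted-recall metric in the spirit of the "dR@n,IoU@m" described above can be sketched as follows: the usual IoU-thresholded hit is scaled by how close the predicted moment boundaries are to the ground truth, normalized by video duration. The exact discount used in the paper may differ; this is an illustrative reconstruction, not its definition:

```python
def iou(pred, gt):
    """Temporal IoU between two (start, end) moments, in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

def discounted_hit(pred, gt, duration, m=0.5):
    """1[IoU >= m], discounted by normalized start/end boundary offsets."""
    if iou(pred, gt) < m:
        return 0.0
    alpha_s = 1.0 - abs(pred[0] - gt[0]) / duration
    alpha_e = 1.0 - abs(pred[1] - gt[1]) / duration
    return alpha_s * alpha_e

# A perfect prediction keeps full credit; a shifted one is discounted.
print(discounted_hit((2.0, 8.0), (2.0, 8.0), duration=10.0))  # 1.0
print(discounted_hit((3.0, 8.0), (2.0, 8.0), duration=10.0))  # 0.9
```

The discount deflates scores for predictions that pass the IoU threshold only because annotation biases make certain coarse moments easy to hit, which is the inflation the metric is designed to counteract.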
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- Open-Set Recognition: A Good Closed-Set Classifier is All You Need [146.6814176602689]
We show that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes.
We use this correlation to boost the performance of the cross-entropy OSR 'baseline' by improving its closed-set accuracy.
We also construct new benchmarks which better respect the task of detecting semantic novelty.
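The 'none-of-the-above' decision described above is commonly implemented by thresholding a closed-set classifier's confidence; one standard variant thresholds the maximum softmax probability. The logits and threshold below are illustrative assumptions, and the paper's exact scoring rule may differ:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_open_set(logits, threshold=0.7):
    """Return the class index, or -1 ('none of the above') when the
    maximum softmax probability falls below `threshold`."""
    p = softmax(np.asarray(logits, dtype=float))
    conf = p.max(axis=-1)
    pred = p.argmax(axis=-1)
    return np.where(conf >= threshold, pred, -1)

# A confident known-class sample vs. a flat, uncertain (likely unknown) one.
print(predict_open_set(np.array([[5.0, 0.0, 0.0],
                                 [1.0, 1.1, 0.9]])))
```

Under this scheme, anything that raises closed-set accuracy tends to sharpen the confidence gap between known and unknown inputs, which is the correlation the paper exploits.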
arXiv Detail & Related papers (2021-10-12T17:58:59Z)
- WRENCH: A Comprehensive Benchmark for Weak Supervision [66.82046201714766]
The benchmark consists of 22 varied real-world datasets for classification and sequence tagging.
We use the benchmark to conduct extensive comparisons over more than 100 method variants, demonstrating its efficacy as a benchmark platform.
arXiv Detail & Related papers (2021-09-23T13:47:16Z)
- Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification [9.017660524497389]
We design a benchmark for data-efficient image classification consisting of six diverse datasets spanning various domains.
We re-evaluate the standard cross-entropy baseline and eight methods for data-efficient deep learning published between 2017 and 2021 at renowned venues.
Tuning the learning rate, weight decay, and batch size on a separate validation split results in a highly competitive baseline.
arXiv Detail & Related papers (2021-08-30T11:24:51Z)
- Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval [9.922132565411664]
We introduce the Google Landmarks dataset v2 (GLDv2), a new benchmark for large-scale, fine-grained instance recognition and image retrieval.
GLDv2 is the largest such dataset to date by a large margin, including over 5M images and 200k distinct instance labels.
The dataset is sourced from Wikimedia Commons, the world's largest crowdsourced collection of landmark photos.
arXiv Detail & Related papers (2020-04-03T22:52:17Z)
- NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization [101.13851473792334]
We construct a large-scale congested crowd counting and localization dataset, NWPU-Crowd, consisting of 5,109 images, in a total of 2,133,375 annotated heads with points and boxes.
Compared with other real-world datasets, it contains various illumination scenes and has the largest density range (0 to 20,033).
We describe the data characteristics, evaluate the performance of some mainstream state-of-the-art (SOTA) methods, and analyze the new problems that arise on the new data.
arXiv Detail & Related papers (2020-01-10T09:26:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.