A Strong Baseline for the VIPriors Data-Efficient Image Classification
Challenge
- URL: http://arxiv.org/abs/2109.13561v1
- Date: Tue, 28 Sep 2021 08:45:15 GMT
- Authors: Björn Barz, Lorenzo Brigato, Luca Iocchi, Joachim Denzler
- Abstract summary: We present a strong baseline for data-efficient image classification on the VIPriors challenge dataset.
Our baseline achieves 69.7% accuracy and outperforms 50% of submissions to the VIPriors 2021 challenge.
- Score: 9.017660524497389
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning from limited amounts of data is the hallmark of intelligence,
requiring strong generalization and abstraction skills. In a machine learning
context, data-efficient methods are of high practical importance since data
collection and annotation are prohibitively expensive in many domains. Thus,
coordinated efforts to foster progress in this area emerged recently, e.g., in
the form of dedicated workshops and competitions. Besides a common benchmark,
measuring progress requires strong baselines. We present such a strong baseline
for data-efficient image classification on the VIPriors challenge dataset,
which is a sub-sampled version of ImageNet-1k with 100 images per class. We do
not use any methods tailored to data-efficient classification but only standard
models and techniques as well as common competition tricks and thorough
hyper-parameter tuning. Our baseline achieves 69.7% accuracy on the VIPriors
image classification dataset and outperforms 50% of submissions to the VIPriors
2021 challenge.
Related papers
- Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification [49.36348058247138]
We tackle the problem of cross-domain few-shot classification by making a small proportion of unlabeled images in the target domain accessible in the training stage.
We meticulously design a cross-level knowledge distillation method, which can strengthen the ability of the model to extract more discriminative features in the target dataset.
Our approach can surpass the previous state-of-the-art method, Dynamic-Distillation, by 5.44% on 1-shot and 1.37% on 5-shot classification tasks.
arXiv Detail & Related papers (2023-11-04T12:28:04Z)
- No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets [0.0]
We study alternative regularization strategies to push the limits of supervised learning on small image classification datasets.
In particular, we employ a heuristic to select (semi-)optimal learning rate and weight decay pairs via the norm of model parameters.
We reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.
arXiv Detail & Related papers (2023-09-04T16:13:59Z)
- CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [90.50864830038202]
We present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.
We use a dual-encoder to separately encode the images and their corresponding geo-locations, and use contrastive objectives to learn effective location representations from images.
CSP significantly boosts model performance, with 10-34% relative improvement across various labeled training data sampling ratios.
arXiv Detail & Related papers (2023-05-01T23:11:18Z)
- Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data [95.0476489266988]
We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models.
Our proposed method trains a captioner to learn from paired data and to progressively associate unpaired data.
We report extensive empirical results on both (1) image-based and (2) dense region-based captioning datasets, followed by a comprehensive analysis on the scarcely-paired dataset.
arXiv Detail & Related papers (2023-01-26T15:25:43Z)
- Large-scale Unsupervised Semantic Segmentation [163.3568726730319]
We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress.
Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
arXiv Detail & Related papers (2021-06-06T15:02:11Z)
- Unifying Remote Sensing Image Retrieval and Classification with Robust Fine-tuning [3.6526118822907594]
We aim at unifying remote sensing image retrieval and classification with a new large-scale training and testing dataset, SF300.
We show that our framework systematically achieves a boost of retrieval and classification performance on nine different datasets compared to an ImageNet pretrained baseline.
arXiv Detail & Related papers (2021-02-26T11:01:30Z)
- Grafit: Learning fine-grained image representations with coarse labels [114.17782143848315]
This paper tackles the problem of learning a finer representation than the one provided by training labels.
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
arXiv Detail & Related papers (2020-11-25T19:06:26Z)
- Unsupervised Image Classification for Deep Representation Learning [42.09716669386924]
We propose an unsupervised image classification framework without using embedding clustering.
Experiments on the ImageNet dataset demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2020-06-20T02:57:06Z)
- SCAN: Learning to Classify Images without Labels [73.69513783788622]
We advocate a two-step approach where feature learning and clustering are decoupled.
A self-supervised task from representation learning is employed to obtain semantically meaningful features.
We obtain promising results on ImageNet, and outperform several semi-supervised learning methods in the low-data regime.
arXiv Detail & Related papers (2020-05-25T18:12:33Z)
- Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval [9.922132565411664]
We introduce the Google Landmarks dataset v2 (GLDv2), a new benchmark for large-scale, fine-grained instance recognition and image retrieval.
GLDv2 is the largest such dataset to date by a large margin, including over 5M images and 200k distinct instance labels.
The dataset is sourced from Wikimedia Commons, the world's largest crowdsourced collection of landmark photos.
arXiv Detail & Related papers (2020-04-03T22:52:17Z)
- Reinforced active learning for image segmentation [34.096237671643145]
We present a new active learning strategy for semantic segmentation based on deep reinforcement learning (RL).
An agent learns a policy to select a subset of small informative image regions (as opposed to entire images) to be labeled from a pool of unlabeled data.
Our method proposes a new modification of the deep Q-network (DQN) formulation for active learning, adapting it to the large-scale nature of semantic segmentation problems.
arXiv Detail & Related papers (2020-02-16T14:03:06Z)