Related papers: No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets

No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets

URL: http://arxiv.org/abs/2309.01694v1
Date: Mon, 4 Sep 2023 16:13:59 GMT
Title: No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
Authors: Lorenzo Brigato and Stavroula Mougiakakou
Abstract summary: We study alternative regularization strategies to push the limits of supervised learning on small image classification datasets. In particular, we employ a agnostic to select (semi) optimal learning rate and weight decay couples via the norm of model parameters. We reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Solving image classification tasks given small training datasets remains an open challenge for modern computer vision. Aggressive data augmentation and generative models are among the most straightforward approaches to overcoming the lack of data. However, the first fails to be agnostic to varying image domains, while the latter requires additional compute and careful design. In this work, we study alternative regularization strategies to push the limits of supervised learning on small image classification datasets. In particular, along with the model size and training schedule scaling, we employ a heuristic to select (semi) optimal learning rate and weight decay couples via the norm of model parameters. By training on only 1% of the original CIFAR-10 training set (i.e., 50 images per class) and testing on ciFAIR-10, a variant of the original CIFAR without duplicated images, we reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.

Related papers

Self-Supervised Learning in Deep Networks: A Pathway to Robust Few-Shot Classification [0.0]
We first pre-train the model with self-supervision to enable it to learn common feature expressions on a large amount of unlabeled data. Then fine-tune it on the few-shot dataset Mini-ImageNet to improve the model's accuracy and generalization ability under limited data.
arXiv Detail & Related papers (2024-11-19T01:01:56Z)
A Simple and Efficient Baseline for Data Attribution on Images [107.12337511216228]
Current state-of-the-art approaches require a large ensemble of as many as 300,000 models to accurately attribute model predictions. In this work, we focus on a minimalist baseline, utilizing the feature space of a backbone pretrained via self-supervised learning to perform data attribution. Our method is model-agnostic and scales easily to large datasets.
arXiv Detail & Related papers (2023-11-03T17:29:46Z)
Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition. Specifically, we utilize the web-collected Coyo-700M dataset. Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z)
Boosting Visual-Language Models by Exploiting Hard Samples [126.35125029639168]
HELIP is a cost-effective strategy tailored to enhance the performance of existing CLIP models. Our method allows for effortless integration with existing models' training pipelines. On comprehensive benchmarks, HELIP consistently boosts existing models to achieve leading performance.
arXiv Detail & Related papers (2023-05-09T07:00:17Z)
The effectiveness of MAE pre-pretraining for billion-scale pretraining [65.98338857597935]
We introduce an additional pre-pretraining stage that is simple and uses the self-supervised MAE technique to initialize the model. We measure the effectiveness of pre-pretraining on 10 different visual recognition tasks spanning image classification, video recognition, object detection, low-shot classification and zero-shot recognition.
arXiv Detail & Related papers (2023-03-23T17:56:12Z)
LiT Tuned Models for Efficient Species Detection [22.3395465641384]
Our paper introduces a simple methodology for adapting any fine-grained image classification dataset for distributed vision-language pretraining. We implement this methodology on the challenging iNaturalist-2021 dataset, comprised of approximately 2.7 million images of macro-organisms across 10,000 classes. Our model (trained using a new method called locked-image text tuning) uses a pre-trained, frozen vision representation, proving that language alignment alone can attain strong transfer learning performance.
arXiv Detail & Related papers (2023-02-12T20:36:55Z)
EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers) As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z)
If your data distribution shifts, use self-learning [24.23584770840611]
Self-learning techniques like entropy and pseudo-labeling are simple and effective at improving performance of a deployed computer vision model under systematic domain shifts. We conduct a wide range of large-scale experiments and show consistent improvements irrespective of the model architecture.
arXiv Detail & Related papers (2021-04-27T01:02:15Z)
Jigsaw Clustering for Unsupervised Visual Representation Learning [68.09280490213399]
We propose a new jigsaw clustering pretext task in this paper. Our method makes use of information from both intra- and inter-images. It is even comparable to the contrastive learning methods when only half of training batches are used.
arXiv Detail & Related papers (2021-04-01T08:09:26Z)
Multiclass non-Adversarial Image Synthesis, with Application to Classification from Very Small Sample [6.243995448840211]
We present a novel non-adversarial generative method - Clustered Optimization of LAtent space (COLA) In the full data regime, our method is capable of generating diverse multi-class images with no supervision. In the small-data regime, where only a small sample of labeled images is available for training with no access to additional unlabeled data, our results surpass state-of-the-art GAN models trained on the same amount of data.
arXiv Detail & Related papers (2020-11-25T18:47:27Z)
Background Splitting: Finding Rare Classes in a Sea of Background [55.03789745276442]
We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories. In these scenarios, almost all images belong to the background category in the dataset (>95% of the dataset is background) We demonstrate that both standard fine-tuning approaches and state-of-the-art approaches for training on imbalanced datasets do not produce accurate deep models in the presence of this extreme imbalance.
arXiv Detail & Related papers (2020-08-28T23:05:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.