Improving Resnet-9 Generalization Trained on Small Datasets
- URL: http://arxiv.org/abs/2309.03965v1
- Date: Thu, 7 Sep 2023 18:46:52 GMT
- Title: Improving Resnet-9 Generalization Trained on Small Datasets
- Authors: Omar Mohamed Awad and Habib Hajimolahoseini and Michael Lim and
Gurpreet Gosal and Walid Ahmed and Yang Liu and Gordon Deng
- Abstract summary: The challenge is to achieve the highest possible accuracy in an image classification task in less than 10 minutes.
The training is done on a small dataset of 5000 images picked randomly from the CIFAR-10 dataset.
Our experiments show that ResNet-9 can achieve 88% accuracy while trained on only a 10% subset of the CIFAR-10 dataset.
- Score: 4.977981835063451
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper presents our proposed approach that won the first prize at the
ICLR competition on Hardware Aware Efficient Training. The challenge is to
achieve the highest possible accuracy in an image classification task in less
than 10 minutes. The training is done on a small dataset of 5000 images picked
randomly from the CIFAR-10 dataset. The evaluation is performed by the
competition organizers on a secret dataset of 1000 images of the same size. Our
approach applies a series of techniques for improving the generalization of
ResNet-9, including sharpness-aware optimization, label smoothing, gradient
centralization, input patch whitening, and metalearning-based training. Our
experiments show that ResNet-9 can achieve 88% accuracy while trained on only a
10% subset of the CIFAR-10 dataset in less than 10 minutes.
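To make the listed techniques concrete, here is a minimal PyTorch sketch (not the authors' released code) of a single training step that combines label smoothing, gradient centralization, and a SAM-style two-pass update; the model, data, and the rho value are placeholders.
```python
import torch
import torch.nn.functional as F

def centralize_gradients(model):
    # Gradient centralization: subtract the per-filter mean from every
    # multi-dimensional weight gradient (conv and linear layers).
    for p in model.parameters():
        if p.grad is not None and p.grad.dim() > 1:
            p.grad -= p.grad.mean(dim=tuple(range(1, p.grad.dim())), keepdim=True)

def sam_step(model, optimizer, images, labels, rho=0.05):
    # First pass: gradients at the current weights (with label smoothing).
    F.cross_entropy(model(images), labels, label_smoothing=0.1).backward()

    # Perturb the weights toward the approximate worst case inside the rho-ball.
    with torch.no_grad():
        grads = [p.grad for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12
        eps = {}
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / grad_norm
            p.add_(e)
            eps[p] = e
    optimizer.zero_grad()

    # Second pass: gradients at the perturbed weights drive the actual update.
    loss = F.cross_entropy(model(images), labels, label_smoothing=0.1)
    loss.backward()

    # Restore the weights, centralize the gradients, and step the base optimizer.
    with torch.no_grad():
        for p, e in eps.items():
            p.sub_(e)
    centralize_gradients(model)
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```
Note that the two forward/backward passes of sharpness-aware optimization roughly double the cost of each step, which matters under a 10-minute training budget.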
Related papers
- Effective pruning of web-scale datasets based on complexity of concept
clusters [48.125618324485195]
We present a method for pruning large-scale multimodal datasets for training CLIP-style models on ImageNet.
We find that training on a smaller set of high-quality data can lead to higher performance with significantly lower training costs.
We achieve a new state-of-the-art ImageNet zero-shot accuracy and a competitive average zero-shot accuracy on 38 evaluation tasks.
arXiv Detail & Related papers (2024-01-09T14:32:24Z) - DataDAM: Efficient Dataset Distillation with Attention Matching [15.300968899043498]
Researchers have long tried to minimize training costs in deep learning while maintaining strong generalization across diverse datasets.
Emerging research on dataset distillation aims to reduce training costs by creating a small synthetic set that contains the information of a larger real dataset.
However, the synthetic data generated by previous methods are not guaranteed to distribute and discriminate as well as the original training data.
arXiv Detail & Related papers (2023-09-29T19:07:48Z) - No Data Augmentation? Alternative Regularizations for Effective Training
on Small Datasets [0.0]
We study alternative regularization strategies to push the limits of supervised learning on small image classification datasets.
In particular, we employ an agnostic heuristic to select (semi-)optimal learning rate and weight decay pairs via the norm of the model parameters.
We reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.
arXiv Detail & Related papers (2023-09-04T16:13:59Z) - Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory [66.035487142452]
We show that trajectory-matching-based methods (MTT) can scale to large-scale datasets such as ImageNet-1K.
We propose a procedure to exactly compute the unrolled gradient with constant memory complexity, which allows us to scale MTT to ImageNet-1K seamlessly with 6x reduction in memory footprint.
The resulting algorithm sets a new SOTA on ImageNet-1K: we can scale up to 50 IPC (images per class) on ImageNet-1K on a single GPU.
arXiv Detail & Related papers (2022-11-19T04:46:03Z) - To be Critical: Self-Calibrated Weakly Supervised Learning for Salient
Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can help models achieve better performance as well as generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z) - Jigsaw Clustering for Unsupervised Visual Representation Learning [68.09280490213399]
We propose a new jigsaw clustering pretext task in this paper.
Our method makes use of information from both intra- and inter-images.
It is even comparable to contrastive learning methods when only half of the training batches are used.
arXiv Detail & Related papers (2021-04-01T08:09:26Z) - Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then
Training It Toughly [114.81028176850404]
Training generative adversarial networks (GANs) with limited data generally results in deteriorated performance and collapsed models.
We decompose the data-hungry GAN training into two sequential sub-problems.
Such a coordinated framework enables us to focus on lower-complexity and more data-efficient sub-problems.
arXiv Detail & Related papers (2021-02-28T05:20:29Z) - Dataset Condensation with Differentiable Siamese Augmentation [30.571335208276246]
We focus on condensing large training sets into significantly smaller synthetic sets which can be used to train deep neural networks.
We propose Differentiable Siamese Augmentation that enables effective use of data augmentation to synthesize more informative synthetic images.
We show that, with less than 1% of the data, our method achieves 99.6%, 94.9%, 88.5%, and 71.5% relative performance on MNIST, FashionMNIST, SVHN, and CIFAR10, respectively.
arXiv Detail & Related papers (2021-02-16T16:32:21Z) - Efficient Conditional Pre-training for Transfer Learning [71.01129334495553]
We propose efficient filtering methods to select relevant subsets from the pre-training dataset.
We validate our techniques by pre-training on ImageNet in both the unsupervised and supervised settings.
We improve standard ImageNet pre-training by 1-3% by tuning available models on our subsets and by pre-training on a dataset filtered from a larger-scale dataset.
arXiv Detail & Related papers (2020-11-20T06:16:15Z) - Picking Winning Tickets Before Training by Preserving Gradient Flow [9.67608102763644]
We argue that efficient training requires preserving the gradient flow through the network.
We empirically investigate the effectiveness of the proposed method with extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and ImageNet.
arXiv Detail & Related papers (2020-02-18T05:14:47Z)