Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale
From A New Perspective
- URL: http://arxiv.org/abs/2306.13092v3
- Date: Sun, 11 Feb 2024 20:34:51 GMT
- Title: Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale
From A New Perspective
- Authors: Zeyuan Yin and Eric Xing and Zhiqiang Shen
- Abstract summary: Under 50 IPC, our approach achieves the highest 42.5% and 60.8% validation accuracy on Tiny-ImageNet and ImageNet-1K datasets.
Our approach also surpasses MTT in terms of speed by approximately 52$\times$ (ConvNet-4) and 16$\times$ (ResNet-18) faster with less memory consumption of 11.6$\times$ and 6.4$\times$ during data synthesis.
- Score: 27.650434284271363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a new dataset condensation framework termed Squeeze, Recover and
Relabel (SRe$^2$L) that decouples the bilevel optimization of model and
synthetic data during training, to handle varying scales of datasets, model
architectures and image resolutions for efficient dataset condensation. The
proposed method demonstrates flexibility across diverse dataset scales and
exhibits multiple advantages in terms of arbitrary resolutions of synthesized
images, low training cost and memory consumption with high-resolution
synthesis, and the ability to scale up to arbitrary evaluation network
architectures. Extensive experiments are conducted on Tiny-ImageNet and full
ImageNet-1K datasets. Under 50 IPC, our approach achieves the highest 42.5% and
60.8% validation accuracy on Tiny-ImageNet and ImageNet-1K, outperforming all
previous state-of-the-art methods by margins of 14.5% and 32.9%, respectively.
Our approach also surpasses MTT in terms of speed by approximately 52$\times$
(ConvNet-4) and 16$\times$ (ResNet-18) faster with less memory consumption of
11.6$\times$ and 6.4$\times$ during data synthesis. Our code and condensed
datasets of 50, 200 IPC with 4K recovery budget are available at
https://github.com/VILA-Lab/SRe2L.
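To make the "50 IPC" setting concrete, the short sketch below works out how many synthetic images a given images-per-class budget implies. The class counts and training-set sizes are commonly cited dataset figures, not values taken from the abstract, and the helper names (`condensed_size`, `summarize`) are hypothetical.

```python
# Minimal sketch: size of a condensed dataset for a given IPC budget.
# ASSUMPTION: class counts and training-set sizes below are the commonly
# cited figures for these datasets; they do not appear in the abstract.
DATASETS = {
    "Tiny-ImageNet": {"classes": 200, "train_images": 100_000},
    "ImageNet-1K": {"classes": 1000, "train_images": 1_281_167},
}


def condensed_size(num_classes: int, ipc: int) -> int:
    """Number of synthetic images when each class gets `ipc` images."""
    return num_classes * ipc


def summarize(ipc: int) -> dict:
    """Map dataset name -> (synthetic image count, fraction of the training set)."""
    summary = {}
    for name, info in DATASETS.items():
        n_synthetic = condensed_size(info["classes"], ipc)
        summary[name] = (n_synthetic, n_synthetic / info["train_images"])
    return summary


if __name__ == "__main__":
    for name, (n_synthetic, fraction) in summarize(50).items():
        print(f"{name}: {n_synthetic} synthetic images "
              f"({fraction:.1%} of the training set)")
```

Under these assumptions, 50 IPC corresponds to 10,000 synthetic images for Tiny-ImageNet (about 10% of its training set) and 50,000 for ImageNet-1K (about 3.9%).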
Related papers
- Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching [74.75248610868685]
Teddy is a Taylor-approximated dataset distillation framework designed to handle large-scale datasets.
Teddy attains state-of-the-art efficiency and performance on the Tiny-ImageNet and original-sized ImageNet-1K datasets.
arXiv Detail & Related papers (2024-10-10T03:28:46Z) - Effective pruning of web-scale datasets based on complexity of concept
clusters [48.125618324485195]
We present a method for pruning large-scale multimodal datasets for training CLIP-style models on ImageNet.
We find that training on a smaller set of high-quality data can lead to higher performance with significantly lower training costs.
We achieve a new state-of-the-art ImageNet zero-shot accuracy and a competitive average zero-shot accuracy on 38 evaluation tasks.
arXiv Detail & Related papers (2024-01-09T14:32:24Z) - Dataset Distillation via Adversarial Prediction Matching [24.487950991247764]
We propose an adversarial framework to solve the dataset distillation problem efficiently.
Our method can produce synthetic datasets just 10% the size of the original, yet achieve, on average, 94% of the test accuracy of models trained on the full original datasets.
arXiv Detail & Related papers (2023-12-14T13:19:33Z) - Dataset Distillation via Curriculum Data Synthesis in Large Data Era [26.883100340763317]
We introduce a simple yet effective global-to-local gradient refinement approach enabled by curriculum data augmentation during data synthesis.
The proposed model outperforms current state-of-the-art methods such as SRe$^2$L, TESLA, and MTT by more than 4% Top-1 accuracy on ImageNet-1K/21K and, for the first time, reduces the gap to its full-data training counterparts to less than 15% absolute.
arXiv Detail & Related papers (2023-11-30T18:59:56Z) - Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching [24.45182507244476]
Generalized Various Backbone and Statistical Matching (G-VBSM) is the first algorithm to obtain strong performance across both small-scale and large-scale datasets.
G-VBSM achieves a performance of 38.7% on CIFAR-100 with 128-width ConvNet, 47.6% on Tiny-ImageNet with ResNet18, and 31.4% on the full 224x224 ImageNet-1k with ResNet18.
arXiv Detail & Related papers (2023-11-29T06:25:59Z) - DataDAM: Efficient Dataset Distillation with Attention Matching [15.300968899043498]
Researchers have long tried to minimize training costs in deep learning by maintaining strong generalization across diverse datasets.
Emerging research on dataset distillation aims to reduce training costs by creating a small synthetic set that contains the information of a larger real dataset.
However, the synthetic data generated by previous methods are not guaranteed to distribute and discriminate as well as the original training data.
arXiv Detail & Related papers (2023-09-29T19:07:48Z) - μSplit: efficient image decomposition for microscopy data [50.794670705085835]
μSplit is a dedicated approach for trained image decomposition in the context of fluorescence microscopy images.
We introduce lateral contextualization (LC), a novel meta-architecture that enables the memory-efficient incorporation of large image context.
We apply μSplit to five decomposition tasks: one on a synthetic dataset and four others derived from real microscopy data.
arXiv Detail & Related papers (2022-11-23T11:26:24Z) - Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory [66.035487142452]
We show that trajectory-matching-based methods (MTT) can scale to large-scale datasets such as ImageNet-1K.
We propose a procedure to exactly compute the unrolled gradient with constant memory complexity, which allows us to scale MTT to ImageNet-1K seamlessly with 6x reduction in memory footprint.
The resulting algorithm sets a new SOTA on ImageNet-1K: we can scale up to 50 IPC (Images Per Class) on ImageNet-1K on a single GPU.
arXiv Detail & Related papers (2022-11-19T04:46:03Z) - Focal Modulation Networks [105.93086472906765]
Self-attention (SA) is completely replaced by a focal modulation network (FocalNet).
FocalNets with tiny and base sizes achieve 82.3% and 83.9% top-1 accuracy on ImageNet-1K.
FocalNets exhibit remarkable superiority when transferred to downstream tasks.
arXiv Detail & Related papers (2022-03-22T17:54:50Z) - Post-training deep neural network pruning via layer-wise calibration [70.65691136625514]
We propose a data-free extension of the approach for computer vision models based on automatically-generated synthetic fractal images.
When using real data, we are able to get a ResNet50 model on ImageNet with 65% sparsity rate in 8-bit precision in a post-training setting.
arXiv Detail & Related papers (2021-04-30T14:20:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.