Dataset Distillation via Adversarial Prediction Matching
- URL: http://arxiv.org/abs/2312.08912v1
- Date: Thu, 14 Dec 2023 13:19:33 GMT
- Title: Dataset Distillation via Adversarial Prediction Matching
- Authors: Mingyang Chen, Bo Huang, Junda Lu, Bing Li, Yi Wang, Minhao Cheng, Wei Wang
- Abstract summary: We propose an adversarial framework to solve the dataset distillation problem efficiently.
Our method can produce synthetic datasets just 10% the size of the original, yet achieve, on average, 94% of the test accuracy of models trained on the full original datasets.
- Score: 24.487950991247764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dataset distillation is the technique of synthesizing smaller, condensed
datasets from large original datasets while retaining the information necessary to
preserve model performance. In this paper, we approach the dataset distillation problem
from a novel perspective: we regard minimizing the prediction discrepancy on
the real data distribution between models, which are respectively trained on
the large original dataset and on the small distilled dataset, as a conduit for
condensing information from the raw data into the distilled version. An
adversarial framework is proposed to solve the problem efficiently. In contrast
to existing distillation methods involving nested optimization or long-range
gradient unrolling, our approach hinges on single-level optimization. This
ensures the memory efficiency of our method and provides a flexible tradeoff
between time and memory budgets, allowing us to distil ImageNet-1K using a
minimum of only 6.5GB of GPU memory. Under the optimal tradeoff strategy, it
requires only 2.5$\times$ less memory and 5$\times$ less runtime compared to
the state-of-the-art. Empirically, our method can produce synthetic datasets
just 10% the size of the original, yet achieve, on average, 94% of the test
accuracy of models trained on the full original datasets including ImageNet-1K,
significantly surpassing the state of the art. Additionally, extensive tests reveal
that our distilled datasets excel in cross-architecture generalization
capabilities.
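To make the abstract's core idea concrete, below is a heavily simplified, illustrative sketch of prediction matching for dataset distillation. It is not the paper's algorithm: it assumes a PyTorch setup with a `teacher` already trained on the full dataset, a `student` trained only on learnable synthetic images, KL divergence as the discrepancy measure, and a one-step differentiable student update standing in for the paper's single-level adversarial scheme (which differs). All names and hyperparameters are illustrative.

```python
# Illustrative sketch of prediction matching for dataset distillation.
# NOT the paper's algorithm: the one-step differentiable student update below
# is a simplification standing in for the paper's single-level adversarial scheme.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

class SmallNet(nn.Module):
    def __init__(self, in_dim=3 * 32 * 32, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, 128),
                                 nn.ReLU(), nn.Linear(128, num_classes))
    def forward(self, x):
        return self.net(x)

teacher = SmallNet().eval()      # assumed pre-trained on the full original dataset
student = SmallNet()             # trained only on the synthetic set
syn_x = nn.Parameter(torch.randn(100, 3, 32, 32) * 0.1)   # 10 images per class
syn_y = torch.arange(10).repeat_interleave(10)
syn_opt = torch.optim.Adam([syn_x], lr=0.05)
stu_opt = torch.optim.SGD(student.parameters(), lr=0.01, momentum=0.9)
inner_lr = 0.01

def distill_step(real_x):
    # (1) Keep the student fitted to the current synthetic set.
    stu_opt.zero_grad()
    F.cross_entropy(student(syn_x.detach()), syn_y).backward()
    stu_opt.step()

    # (2) Update the synthetic images so that a student refined on them agrees
    #     with the teacher's predictions on real data (one differentiable step).
    params = {k: v.detach().clone().requires_grad_(True)
              for k, v in student.named_parameters()}
    inner_loss = F.cross_entropy(functional_call(student, params, (syn_x,)), syn_y)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    refined = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}

    with torch.no_grad():
        t_logits = teacher(real_x)
    s_logits = functional_call(student, refined, (real_x,))
    discrepancy = F.kl_div(F.log_softmax(s_logits, dim=1),
                           F.softmax(t_logits, dim=1), reduction="batchmean")
    syn_opt.zero_grad()
    discrepancy.backward()
    syn_opt.step()
    return discrepancy.item()

# Example usage with a random batch standing in for real data:
print(distill_step(torch.randn(64, 3, 32, 32)))
```

The alternating structure above mirrors the abstract's description of an adversarial interplay between a student fitted to the synthetic data and a teacher fitted to the real data; the actual paper avoids the differentiable inner step shown here in favor of its own single-level formulation.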
Related papers
- Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching [74.75248610868685]
Teddy is a Taylor-approximated dataset distillation framework designed to handle large-scale datasets.
Teddy attains state-of-the-art efficiency and performance on the Tiny-ImageNet and original-sized ImageNet-1K datasets.
arXiv Detail & Related papers (2024-10-10T03:28:46Z) - Prioritize Alignment in Dataset Distillation [27.71563788300818]
Existing methods use the agent model to extract information from the target dataset and embed it into the distilled dataset.
We find that existing methods introduce misaligned information in both information extraction and embedding stages.
We propose Prioritize Alignment in Dataset Distillation (PAD), which aligns information from these two perspectives.
arXiv Detail & Related papers (2024-08-06T17:07:28Z) - Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
The development of deep learning models is enabled by the availability of large-scale datasets.
Dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset.
We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z) - DataDAM: Efficient Dataset Distillation with Attention Matching [15.300968899043498]
Researchers have long tried to minimize training costs in deep learning by maintaining strong generalization across diverse datasets.
Emerging research on dataset distillation aims to reduce training costs by creating a small synthetic set that contains the information of a larger real dataset.
However, the synthetic data generated by previous methods is not guaranteed to match the distribution of the original training data or to be as discriminative.
arXiv Detail & Related papers (2023-09-29T19:07:48Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with far fewer computational resources (a generic sketch of distribution matching appears after this list).
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation [96.92250565207017]
We study the data efficiency and selection for the dataset distillation task.
By re-formulating the dynamics of distillation, we provide insight into the inherent redundancy in the real dataset.
We find the most contributing samples based on their causal effects on the distillation.
arXiv Detail & Related papers (2023-05-28T06:53:41Z) - Generalizing Dataset Distillation via Deep Generative Prior [75.9031209877651]
We propose to distill an entire dataset's knowledge into a few synthetic images.
The idea is to synthesize a small number of synthetic data points that, when given to a learning algorithm as training data, result in a model approximating one trained on the original data.
We present a new optimization algorithm that distills a large number of images into a few intermediate feature vectors in the generative model's latent space.
arXiv Detail & Related papers (2023-05-02T17:59:31Z) - Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that the weights trained on synthetic data are robust against accumulated-error perturbations when regularized towards a flat trajectory.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z) - Dataset Distillation using Neural Feature Regression [32.53291298089172]
We develop an algorithm for dataset distillation using neural Feature Regression with Pooling (FRePo).
FRePo achieves state-of-the-art performance with an order of magnitude less memory requirement and two orders of magnitude faster training than previous methods.
We show that high-quality distilled data can greatly improve various downstream applications, such as continual learning and membership inference defense.
arXiv Detail & Related papers (2022-06-01T19:02:06Z)
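Several of the related papers above, most directly "Improved Distribution Matching for Dataset Condensation," condense data by matching feature distributions rather than predictions. The following is a minimal, generic sketch of class-wise distribution matching, assuming a PyTorch setup with a randomly initialized embedding network; all names are illustrative and the code is not taken from any of the listed papers.

```python
# Generic sketch of distribution matching for dataset condensation: synthetic
# and real examples of the same class should produce similar mean feature
# embeddings under an embedding network. All names are hypothetical.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU())

# Learnable synthetic set: 10 images per class for a 10-class task.
syn_x = nn.Parameter(torch.randn(100, 3, 32, 32) * 0.1)
syn_y = torch.arange(10).repeat_interleave(10)
opt = torch.optim.SGD([syn_x], lr=1.0)

def dm_step(real_x, real_y):
    """One distribution-matching update of the synthetic images."""
    loss = 0.0
    with torch.no_grad():
        real_feat = embed(real_x)
    syn_feat = embed(syn_x)
    for c in range(10):
        r_mask, s_mask = real_y == c, syn_y == c
        if r_mask.any():
            # Match per-class mean embeddings of real and synthetic data.
            loss = loss + ((real_feat[r_mask].mean(0) -
                            syn_feat[s_mask].mean(0)) ** 2).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return float(loss)

# Example usage with a random batch standing in for real data:
print(dm_step(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))))
```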