Dataset Distillation via Adversarial Prediction Matching
- URL: http://arxiv.org/abs/2312.08912v1
- Date: Thu, 14 Dec 2023 13:19:33 GMT
- Title: Dataset Distillation via Adversarial Prediction Matching
- Authors: Mingyang Chen, Bo Huang, Junda Lu, Bing Li, Yi Wang, Minhao Cheng, Wei Wang
- Abstract summary: We propose an adversarial framework to solve the dataset distillation problem efficiently.
Our method can produce synthetic datasets just 10% the size of the original, yet achieve, on average, 94% of the test accuracy of models trained on the full original datasets.
- Score: 24.487950991247764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dataset distillation is the technique of synthesizing smaller, condensed
datasets from large original datasets while retaining the information necessary to
preserve their training effect. In this paper, we approach the dataset distillation problem
from a novel perspective: we regard minimizing the prediction discrepancy on
the real data distribution between models, which are respectively trained on
the large original dataset and on the small distilled dataset, as a conduit for
condensing information from the raw data into the distilled version. An
adversarial framework is proposed to solve the problem efficiently. In contrast
to existing distillation methods involving nested optimization or long-range
gradient unrolling, our approach hinges on single-level optimization. This
ensures the memory efficiency of our method and provides a flexible tradeoff
between time and memory budgets, allowing us to distil ImageNet-1K using a
minimum of only 6.5GB of GPU memory. Under the optimal tradeoff strategy, it
requires 2.5$\times$ less memory and 5$\times$ less runtime compared to
the state-of-the-art. Empirically, our method can produce synthetic datasets
just 10% the size of the original, yet achieve, on average, 94% of the test
accuracy of models trained on the full original datasets, including ImageNet-1K,
significantly surpassing the state of the art. Additionally, extensive tests reveal
that our distilled datasets excel in cross-architecture generalization
capabilities.
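To make the stated objective concrete, one way to write the prediction-discrepancy goal described in the abstract is the following (the notation is ours, not taken from the paper): with $\theta_{\mathcal{T}}$ a model trained on the original dataset $\mathcal{T}$ and $\theta_{\mathcal{S}}$ a model trained on the distilled dataset $\mathcal{S}$, the distilled data are sought as

$$\mathcal{S}^{*} \;=\; \arg\min_{\mathcal{S}} \; \mathbb{E}_{x \sim \mathcal{D}} \Big[ D\big( f_{\theta_{\mathcal{T}}}(x),\, f_{\theta_{\mathcal{S}}}(x) \big) \Big], \qquad \theta_{\mathcal{T}} = \mathrm{train}(\mathcal{T}), \;\; \theta_{\mathcal{S}} = \mathrm{train}(\mathcal{S}),$$

where $\mathcal{D}$ is the real data distribution and $D$ is a prediction-discrepancy measure (e.g. a KL divergence). The sketch below is a minimal, illustrative single-level min-max loop in this spirit: the student is trained to match the teacher's predictions on the synthetic images, while the synthetic images are pushed adversarially toward points where the two models still disagree. The model classes, optimizers, loss weighting, and hyper-parameters are assumptions for illustration, not the authors' implementation; `syn_labels` is assumed to hold integer class indices.

```python
# Illustrative sketch only: a simplified single-level min-max loop in the spirit
# of adversarial prediction matching. Not the authors' implementation.
import torch
import torch.nn.functional as F

def kl_discrepancy(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened teacher and student predictions."""
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

def distill(teacher, student, syn_images, syn_labels,
            steps=1000, lr_student=0.01, lr_images=0.1):
    """Alternating single-level updates of the student and the synthetic images."""
    teacher.eval()
    for p in teacher.parameters():               # teacher is pre-trained on the full data and frozen
        p.requires_grad_(False)
    syn_images = syn_images.clone().requires_grad_(True)
    opt_student = torch.optim.SGD(student.parameters(), lr=lr_student, momentum=0.9)
    opt_images = torch.optim.Adam([syn_images], lr=lr_images)

    for _ in range(steps):
        # (1) Student step: match the teacher's predictions on the synthetic set,
        #     plus an ordinary classification loss on the synthetic labels.
        s_logits = student(syn_images)
        with torch.no_grad():
            t_logits = teacher(syn_images)
        loss_student = kl_discrepancy(s_logits, t_logits) \
                       + F.cross_entropy(s_logits, syn_labels)
        opt_student.zero_grad()
        loss_student.backward()
        opt_student.step()

        # (2) Image step: move the synthetic images adversarially toward points
        #     where the current student still disagrees with the teacher, so that
        #     training on them closes the remaining gap. No trajectory unrolling.
        s_logits = student(syn_images)
        t_logits = teacher(syn_images)
        loss_images = -kl_discrepancy(s_logits, t_logits)
        opt_images.zero_grad()
        loss_images.backward()
        opt_images.step()

    return syn_images.detach()
```

Because neither update unrolls the other's optimization trajectory, memory scales with the synthetic batch rather than with the number of inner training steps, which is consistent with the single-level, memory-efficient framing in the abstract.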
Related papers
- Dataset Distillation as Pushforward Optimal Quantization [1.039189397779466]
We propose a simple extension of the state-of-the-art data distillation method D4M, achieving better performance on the ImageNet-1K dataset with trivial additional computation.
We demonstrate that when equipped with an encoder-decoder structure, the empirically successful disentangled methods can be reformulated as an optimal quantization problem.
In particular, we link existing disentangled dataset distillation methods to the classical optimal quantization and Wasserstein barycenter problems, demonstrating consistency of distilled datasets for diffusion-based generative priors.
arXiv Detail & Related papers (2025-01-13T20:41:52Z) - Dataset Distillation via Committee Voting [21.018818924580877]
We introduce $\bf C$ommittee $\bf V$oting for $\bf D$ataset $\bf D$istillation (CV-DD).
CV-DD is a novel approach that leverages the collective wisdom of multiple models or experts to create high-quality distilled datasets.
arXiv Detail & Related papers (2025-01-13T18:59:48Z) - Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching [74.75248610868685]
Teddy is a Taylor-approximated dataset distillation framework designed to handle large-scale datasets.
Teddy attains state-of-the-art efficiency and performance on the Tiny-ImageNet and original-sized ImageNet-1K datasets.
arXiv Detail & Related papers (2024-10-10T03:28:46Z) - Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
The development of deep learning models is enabled by the availability of large-scale datasets.
Dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset.
We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching; see the illustrative sketch after this list.
Our simple yet effective method outperforms most previous optimization-oriented methods with far fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation [96.92250565207017]
We study the data efficiency and selection for the dataset distillation task.
By re-formulating the dynamics of distillation, we provide insight into the inherent redundancy in the real dataset.
We identify the samples that contribute most, based on their causal effects on the distillation.
arXiv Detail & Related papers (2023-05-28T06:53:41Z) - Generalizing Dataset Distillation via Deep Generative Prior [75.9031209877651]
We propose to distill an entire dataset's knowledge into a few synthetic images.
The idea is to synthesize a small number of synthetic data points that, when given to a learning algorithm as training data, result in a model approximating one trained on the original data.
We present a new optimization algorithm that distills a large number of images into a few intermediate feature vectors in the generative model's latent space.
arXiv Detail & Related papers (2023-05-02T17:59:31Z) - Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that, with regularization toward a flat trajectory, the weights trained on synthetic data are robust against accumulated-error perturbations.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z) - Dataset Distillation using Neural Feature Regression [32.53291298089172]
We develop an algorithm for dataset distillation using neural Feature Regression with Pooling (FRePo).
FRePo achieves state-of-the-art performance with an order of magnitude less memory requirement and two orders of magnitude faster training than previous methods.
We show that high-quality distilled data can greatly improve various downstream applications, such as continual learning and membership inference defense.
arXiv Detail & Related papers (2022-06-01T19:02:06Z)
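As referenced in the Improved Distribution Matching entry above, the following is a minimal, illustrative sketch of the classical distribution-matching objective for dataset condensation: synthetic images are optimized so that their feature statistics under an embedding network match those of real images. The embedding network, the use of mean features, and the hyper-parameters are assumptions for illustration; the paper above improves on this plain baseline.

```python
# Illustrative sketch of plain distribution matching for dataset condensation.
# The embedding network and hyper-parameters are assumptions, not the paper's method.
import torch

def distribution_matching_loss(embed_net, real_images, syn_images):
    """Squared distance between mean embeddings of a real and a synthetic batch."""
    with torch.no_grad():
        real_mean = embed_net(real_images).mean(dim=0)  # real statistics are fixed targets
    syn_mean = embed_net(syn_images).mean(dim=0)
    return ((real_mean - syn_mean) ** 2).sum()

def condense_step(embed_net, real_images, syn_images, opt_images):
    """One update of the synthetic images toward the real feature statistics."""
    loss = distribution_matching_loss(embed_net, real_images, syn_images)
    opt_images.zero_grad()
    loss.backward()
    opt_images.step()
    return loss.item()
```

In practice this loss is typically computed per class with embedding networks re-sampled from random initializations at each step, so the synthetic set is not tailored to any single feature extractor.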