Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning
- URL: http://arxiv.org/abs/2411.12115v1
- Date: Mon, 18 Nov 2024 22:51:44 GMT
- Title: Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning
- Authors: Brian B. Moser, Federico Raue, Tobias C. Nauen, Stanislav Frolov, Andreas Dengel
- Abstract summary: "Prune First, Distill After" framework prunes datasets via loss-based sampling prior to distillation.
Our proposed framework significantly boosts distilled quality, achieving an accuracy increase of up to 5.2 percentage points.
- Score: 8.69908615905782
- License:
- Abstract: Dataset distillation has gained significant interest in recent years, yet existing approaches typically distill from the entire dataset, potentially including non-beneficial samples. We introduce a novel "Prune First, Distill After" framework that systematically prunes datasets via loss-based sampling prior to distillation. By leveraging pruning before classical distillation techniques and generative priors, we create a representative core-set that leads to enhanced generalization for unseen architectures - a significant challenge of current distillation methods. More specifically, our proposed framework significantly boosts distilled quality, achieving up to a 5.2 percentage points accuracy increase even with substantial dataset pruning, i.e., removing 80% of the original dataset prior to distillation. Overall, our experimental results highlight the advantages of our easy-sample prioritization and cross-architecture robustness, paving the way for more effective and high-quality dataset distillation.
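The loss-value-based pruning step can be illustrated with a short sketch. The snippet below is a minimal illustration rather than the authors' implementation: it assumes a pretrained classifier, a labeled training set, and an 80% pruning ratio, and it keeps the lowest-loss (easiest) samples before any distillation method is applied.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

def prune_by_loss(model, dataset, keep_fraction=0.2, device="cuda"):
    """Score every sample by its cross-entropy loss under a pretrained model
    and keep only the lowest-loss (easiest) fraction as the core-set."""
    model.eval().to(device)
    losses = []
    loader = DataLoader(dataset, batch_size=256, shuffle=False)
    with torch.no_grad():
        for images, labels in loader:
            logits = model(images.to(device))
            # per-sample losses, not averaged over the batch
            losses.append(F.cross_entropy(logits, labels.to(device), reduction="none").cpu())
    losses = torch.cat(losses)
    keep = losses.argsort()[: int(keep_fraction * len(dataset))]  # easiest samples first
    return Subset(dataset, keep.tolist())

# core_set = prune_by_loss(pretrained_model, train_set, keep_fraction=0.2)
# ...then run any standard distillation method on core_set instead of the full dataset.
```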
Related papers
- Label-Augmented Dataset Distillation [13.449340904911725]
We introduce Label-Augmented Dataset Distillation (LADD) to enhance dataset distillation with label augmentations.
LADD sub-samples each synthetic image, generating additional dense labels to capture rich semantics (see the sketch below).
Combined with three high-performance dataset distillation algorithms, LADD achieves remarkable gains, improving accuracy by an average of 14.9%.
arXiv Detail & Related papers (2024-09-24T16:54:22Z)
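A rough sketch of the label-augmentation idea above, with several illustrative assumptions (the crop size, stride, and the pretrained teacher that supplies the dense soft labels are not taken from the paper):

```python
import torch
import torch.nn.functional as F

def dense_labels_for_crops(teacher, synthetic_images, crop_size=16, stride=8):
    """Illustrative only: sub-sample each synthetic image into overlapping crops
    and let a pretrained teacher produce a soft label for every crop."""
    teacher.eval()
    crops, soft_labels = [], []
    with torch.no_grad():
        for img in synthetic_images:  # img: (C, H, W)
            patches = img.unfold(1, crop_size, stride).unfold(2, crop_size, stride)
            patches = patches.permute(1, 2, 0, 3, 4).reshape(-1, img.size(0), crop_size, crop_size)
            # resize crops back to the teacher's expected input resolution
            patches = F.interpolate(patches, size=img.shape[-2:], mode="bilinear", align_corners=False)
            soft_labels.append(F.softmax(teacher(patches), dim=1))
            crops.append(patches)
    return torch.cat(crops), torch.cat(soft_labels)
```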
- Mitigating Bias in Dataset Distillation [62.79454960378792]
We study the impact of bias inside the original dataset on the performance of dataset distillation.
We introduce a simple yet highly effective approach based on a sample reweighting scheme that utilizes kernel density estimation (see the sketch below).
arXiv Detail & Related papers (2024-06-06T18:52:28Z)
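The reweighting scheme can be sketched as follows. This is a hypothetical illustration that assumes features from a pretrained encoder and sets each weight inversely proportional to the estimated density, so that over-represented (biased) regions are down-weighted:

```python
from scipy.stats import gaussian_kde

def kde_sample_weights(features, eps=1e-8):
    """Illustrative reweighting: estimate each sample's density in feature space
    and down-weight samples from densely populated (over-represented) regions."""
    kde = gaussian_kde(features.T)        # gaussian_kde expects (n_dims, n_samples)
    density = kde(features.T)             # density estimate per sample
    weights = 1.0 / (density + eps)       # rarer samples receive larger weights
    return weights / weights.mean()       # normalize to mean weight 1

# weights = kde_sample_weights(encoder_features)  # encoder_features: (n_samples, n_dims)
# ...use the weights when sampling or weighting real images during distillation.
```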
- Practical Dataset Distillation Based on Deep Support Vectors [27.16222034423108]
In this paper, we focus on dataset distillation in practical scenarios with access to only a fraction of the entire dataset.
We introduce a novel distillation method that augments the conventional process by incorporating general model knowledge via the addition of a Deep KKT (DKKT) loss.
In practical settings, our approach showed improved performance compared to the baseline distribution matching distillation method on the CIFAR-10 dataset.
arXiv Detail & Related papers (2024-05-01T06:41:27Z)
- DD-RobustBench: An Adversarial Robustness Benchmark for Dataset Distillation [25.754877176280708]
We introduce the most extensive benchmark to date for evaluating the adversarial robustness of distilled datasets in a unified way.
Our benchmark significantly expands upon prior efforts by incorporating the latest advancements such as TESLA and SRe2L.
We also find that incorporating distilled data into training batches alongside the original dataset can improve robustness (see the evaluation sketch below).
arXiv Detail & Related papers (2024-03-20T06:00:53Z)
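How such a robustness evaluation might look in practice, as a hedged sketch: a network trained on the distilled data is tested under a simple FGSM attack (the attack choice and epsilon here are illustrative, not necessarily those used by the benchmark):

```python
import torch
import torch.nn.functional as F

def fgsm_accuracy(model, test_loader, epsilon=8 / 255, device="cuda"):
    """Accuracy of a model (e.g., trained on a distilled dataset) under FGSM."""
    model.eval().to(device)
    correct, total = 0, 0
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        images.requires_grad_(True)
        loss = F.cross_entropy(model(images), labels)
        grad = torch.autograd.grad(loss, images)[0]
        adv = (images + epsilon * grad.sign()).clamp(0, 1).detach()  # FGSM step
        with torch.no_grad():
            correct += (model(adv).argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
    return correct / total
```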
- Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
The development of deep learning models is enabled by the availability of large-scale datasets.
Dataset distillation aims to synthesize a compact dataset that retains the essential information of the large original dataset.
We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z)
- Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality [78.6359306550245]
We argue that using just one synthetic subset for distillation will not yield optimal generalization performance.
Progressive dataset distillation (PDD) synthesizes multiple small sets of synthetic images, each conditioned on the previous sets, and trains the model on the cumulative union of these subsets (see the sketch below).
Our experiments show that PDD can effectively improve the performance of existing dataset distillation methods by up to 4.3%.
arXiv Detail & Related papers (2023-10-10T20:04:44Z)
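The progressive scheme described above can be sketched in pseudocode-style Python; `distill` below is a placeholder for any base distillation method, not an actual API:

```python
def progressive_distillation(real_dataset, distill, num_stages=5, images_per_stage=10):
    """Sketch of progressive distillation: synthesize several small synthetic
    subsets, each conditioned on the union of the previously distilled ones."""
    synthetic_union = []
    for stage in range(num_stages):
        # `distill` stands in for any existing method; it receives the previously
        # synthesized images so the new subset can complement them.
        new_subset = distill(real_dataset,
                             condition_on=list(synthetic_union),
                             budget=images_per_stage)
        synthetic_union.extend(new_subset)
        # a model trained at this stage uses the cumulative union of all subsets so far
    return synthetic_union
```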
- Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation [96.92250565207017]
We study the data efficiency and selection for the dataset distillation task.
By re-formulating the dynamics of distillation, we provide insight into the inherent redundancy in the real dataset.
We identify the samples that contribute the most, based on their causal effects on the distillation.
arXiv Detail & Related papers (2023-05-28T06:53:41Z)
- Generalizing Dataset Distillation via Deep Generative Prior [75.9031209877651]
We propose to distill an entire dataset's knowledge into a few synthetic images.
The idea is to synthesize a small number of data points that, when given to a learning algorithm as training data, yield a model approximating one trained on the original data.
We present a new optimization algorithm that distills a large number of images into a few intermediate feature vectors in the generative model's latent space (see the sketch below).
arXiv Detail & Related papers (2023-05-02T17:59:31Z)
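A hedged sketch of distilling into a generative model's latent space: latent codes rather than pixels are optimized, and the synthetic images are obtained by decoding them through a frozen generator `G`; `distillation_loss` is a placeholder for whichever matching objective is plugged in:

```python
import torch

def distill_into_latents(G, distillation_loss, num_latents=10, latent_dim=512,
                         steps=1000, lr=0.01, device="cuda"):
    """Sketch: optimize a few latent vectors of a frozen generator so that the
    decoded images serve as the distilled dataset."""
    G.eval().to(device)
    latents = torch.randn(num_latents, latent_dim, device=device, requires_grad=True)
    optimizer = torch.optim.Adam([latents], lr=lr)
    for _ in range(steps):
        synthetic_images = G(latents)               # decode latents into images
        loss = distillation_loss(synthetic_images)  # placeholder matching objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return G(latents).cpu(), latents.detach().cpu()
```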
- Explicit and Implicit Knowledge Distillation via Unlabeled Data [5.702176304876537]
We propose an efficient unlabeled-sample selection method to replace computationally expensive generators.
We also propose a class-dropping mechanism to suppress the label noise caused by data domain shifts.
Experimental results show that our method converges quickly and obtains higher accuracy than other state-of-the-art methods.
arXiv Detail & Related papers (2023-02-17T09:10:41Z)
- Dataset Distillation Using Parameter Pruning [53.79746115426363]
The proposed method can synthesize more robust distilled datasets and improve distillation performance by pruning difficult-to-match parameters during the distillation process.
Experimental results on two benchmark datasets show the superiority of the proposed method.
arXiv Detail & Related papers (2022-09-29T07:58:32Z)