Related papers: Dataset Distillation Using Parameter Pruning

Dataset Distillation Using Parameter Pruning

URL: http://arxiv.org/abs/2209.14609v6
Date: Mon, 21 Aug 2023 03:15:35 GMT
Title: Dataset Distillation Using Parameter Pruning
Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama
Abstract summary: The proposed method can synthesize more robust distilled datasets and improve distillation performance by pruning difficult-to-match parameters during the distillation process. Experimental results on two benchmark datasets show the superiority of the proposed method.
Score: 53.79746115426363
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this study, we propose a novel dataset distillation method based on parameter pruning. The proposed method can synthesize more robust distilled datasets and improve distillation performance by pruning difficult-to-match parameters during the distillation process. Experimental results on two benchmark datasets show the superiority of the proposed method.

Related papers

DD-Ranking: Rethinking the Evaluation of Dataset Distillation [223.28392857127733]
We propose DD-Ranking, a unified evaluation framework, along with new general evaluation metrics to uncover the true performance improvements achieved by different methods.<n>By refocusing on the actual information enhancement of distilled datasets, DD-Ranking provides a more comprehensive and fair evaluation standard for future research advancements.
arXiv Detail & Related papers (2025-05-19T16:19:50Z)
Robust Dataset Distillation by Matching Adversarial Trajectories [21.52323435014135]
We introduce the task of robust dataset distillation", a novel paradigm that embeds adversarial robustness into synthetic datasets during the distillation process. We propose Matching Adversarial Trajectories (MAT), a method that integrates adversarial training into trajectory-based dataset distillation. MAT incorporates adversarial samples during trajectory generation to obtain robust training trajectories, which are then used to guide the distillation process.
arXiv Detail & Related papers (2025-03-15T10:02:38Z)
Generative Dataset Distillation Based on Self-knowledge Distillation [49.20086587208214]
We present a novel generative dataset distillation method that can improve the accuracy of aligning prediction logits. Our approach integrates self-knowledge distillation to achieve more precise distribution matching between the synthetic and original data. Our method outperforms existing state-of-the-art methods, resulting in superior distillation performance.
arXiv Detail & Related papers (2025-01-08T00:43:31Z)
Inference-Time Diffusion Model Distillation [59.350789627086456]
We introduce Distillation++, a novel inference-time distillation framework. Inspired by recent advances in conditional sampling, our approach recasts student model sampling as a proximal optimization problem. We integrate distillation optimization during reverse sampling, which can be viewed as teacher guidance.
arXiv Detail & Related papers (2024-12-12T02:07:17Z)
Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning [8.69908615905782]
"Prune First, Distill After" framework prunes datasets via loss-based sampling prior to distillation. Our proposed framework significantly boosts distilled quality, achieving up to a 5.2 percentage points accuracy increase.
arXiv Detail & Related papers (2024-11-18T22:51:44Z)
Hierarchical Features Matter: A Deep Exploration of GAN Priors for Improved Dataset Distillation [51.44054828384487]
We propose a novel parameterization method dubbed Hierarchical Generative Latent Distillation (H-GLaD) This method systematically explores hierarchical layers within the generative adversarial networks (GANs) In addition, we introduce a novel class-relevant feature distance metric to alleviate the computational burden associated with synthetic dataset evaluation.
arXiv Detail & Related papers (2024-06-09T09:15:54Z)
Exploring the potential of prototype-based soft-labels data distillation for imbalanced data classification [0.0]
Main goal is to push further the performance of prototype-based soft-labels distillation in terms of classification accuracy. Experimental studies trace the capability of the method to distill the data, but also the opportunity to act as an augmentation method.
arXiv Detail & Related papers (2024-03-25T19:15:19Z)
Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
Development of deep learning models is enabled by the availability of large-scale datasets. dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset. We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z)
Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation [96.92250565207017]
We study the data efficiency and selection for the dataset distillation task. By re-formulating the dynamics of distillation, we provide insight into the inherent redundancy in the real dataset. We find the most contributing samples based on their causal effects on the distillation.
arXiv Detail & Related papers (2023-05-28T06:53:41Z)
Explicit and Implicit Knowledge Distillation via Unlabeled Data [5.702176304876537]
We propose an efficient unlabeled sample selection method to replace high computational generators. We also propose a class-dropping mechanism to suppress the label noise caused by the data domain shifts. Experimental results show that our method can quickly converge and obtain higher accuracy than other state-of-the-art methods.
arXiv Detail & Related papers (2023-02-17T09:10:41Z)
Dataset Distillation by Matching Training Trajectories [75.9031209877651]
We propose a new formulation that optimize our distilled data to guide networks to a similar state as those trained on real data. Given a network, we train it for several iterations on our distilled data and optimize the distilled data with respect to the distance between the synthetically trained parameters and the parameters trained on real data. Our method handily outperforms existing methods and also allows us to distill higher-resolution visual data.
arXiv Detail & Related papers (2022-03-22T17:58:59Z)
New Properties of the Data Distillation Method When Working With Tabular Data [77.34726150561087]
Data distillation is the problem of reducing the volume oftraining data while keeping only the necessary information. We show that the model trained on distilled samples can outperform the model trained on the original dataset.
arXiv Detail & Related papers (2020-10-19T20:27:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.