Towards Principled Dataset Distillation: A Spectral Distribution Perspective
- URL: http://arxiv.org/abs/2603.01698v1
- Date: Mon, 02 Mar 2026 10:26:49 GMT
- Title: Towards Principled Dataset Distillation: A Spectral Distribution Perspective
- Authors: Ruixi Wu, Shaobo Wang, Jiahuan Chen, Zhiyuan Liu, Yicun Yang, Zhaorun Chen, Zekai Li, Kaixin Li, Xinming Wang, Hongzhu Yi, Kai Wang, Linfeng Zhang
- Abstract summary: We propose Class-Aware Spectral Distribution Matching (CSDM), which reformulates distribution alignment via the spectrum of a well-behaved kernel function. On CIFAR-10-LT, with 10 images per class, CSDM achieves a 14.0% improvement over state-of-the-art DD methods, with only a 5.7% performance drop when the number of images in tail classes decreases from 500 to 25.
- Score: 29.986767000752753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dataset distillation (DD) aims to compress large-scale datasets into compact synthetic counterparts for efficient model training. However, existing DD methods exhibit substantial performance degradation on long-tailed datasets. We identify two fundamental challenges: heuristic design choices for the distribution discrepancy measure and uniform treatment of imbalanced classes. To address these limitations, we propose Class-Aware Spectral Distribution Matching (CSDM), which reformulates distribution alignment via the spectrum of a well-behaved kernel function. This technique maps the original samples into frequency space, resulting in the Spectral Distribution Distance (SDD). To mitigate class imbalance, we exploit the unified form of SDD to perform amplitude-phase decomposition, which adaptively prioritizes realism in tail classes. On CIFAR-10-LT, with 10 images per class, CSDM achieves a 14.0% improvement over state-of-the-art DD methods, with only a 5.7% performance drop when the number of images in tail classes decreases from 500 to 25, demonstrating strong stability on long-tailed data.
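The abstract's description (kernel spectrum, frequency-space mapping, amplitude-phase decomposition) is consistent with characteristic-function matching, where Bochner's theorem lets one sample frequencies from a kernel's spectral density. Below is a minimal PyTorch sketch under that reading; the function names, the Gaussian frequency sampling, and the amplitude/phase weights are illustrative assumptions, not the authors' implementation.

```python
import torch

def empirical_cf(feats, freqs):
    # Empirical characteristic function E[exp(i w^T x)] evaluated at
    # frequencies sampled from the kernel's spectral density.
    proj = feats @ freqs.t()                          # (n, m) projections w^T x
    return torch.cos(proj).mean(0), torch.sin(proj).mean(0)

def spectral_distribution_distance(real, syn, freqs, amp_w=1.0, phase_w=1.0):
    re_r, im_r = empirical_cf(real, freqs)
    re_s, im_s = empirical_cf(syn, freqs)
    amp_r = (re_r**2 + im_r**2 + 1e-8).sqrt()         # amplitude term (diversity)
    amp_s = (re_s**2 + im_s**2 + 1e-8).sqrt()
    dphi = torch.atan2(im_r, re_r) - torch.atan2(im_s, re_s)
    dphi = torch.atan2(torch.sin(dphi), torch.cos(dphi))  # wrap to (-pi, pi]
    return amp_w * (amp_r - amp_s).pow(2).mean() + phase_w * dphi.pow(2).mean()

# Class-aware use: raise phase_w (realism) for tail classes, per the abstract.
torch.manual_seed(0)
freqs = torch.randn(256, 64)                          # RBF-kernel spectrum (Bochner)
real, syn = torch.randn(500, 64), torch.randn(10, 64, requires_grad=True)
loss = spectral_distribution_distance(real, syn, freqs, phase_w=2.0)
loss.backward()                                       # gradients reach the synthetic set
```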
Related papers
- CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation [71.52209438343928]
Core Distribution Alignment (CoDA) is a framework that enables effective Dataset Distillation (DD) using only an off-the-shelf text-to-image model. Our key idea is to first identify the "intrinsic core distribution" of the target dataset using a robust density-based discovery mechanism. By doing so, CoDA effectively bridges the gap between general-purpose generative priors and target semantics.
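The summary does not spell out CoDA's discovery mechanism. As a hedged stand-in, a common density-based core-set heuristic scores samples by k-NN density in a feature space and keeps the densest fraction:

```python
import torch

def knn_density_core(feats, k=20, keep_frac=0.8):
    """Score each sample by its k-NN density and keep the densest fraction.
    A generic density-based heuristic, not CoDA's actual mechanism.
    feats: (n, d) image features, e.g. from a pretrained encoder."""
    dists = torch.cdist(feats, feats)                       # (n, n) pairwise distances
    knn_d = dists.topk(k + 1, largest=False).values[:, 1:]  # drop self-distance
    density = 1.0 / (knn_d.mean(dim=1) + 1e-8)              # inverse mean k-NN distance
    n_keep = int(keep_frac * len(feats))
    return density.topk(n_keep).indices                     # indices of the "core"
```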
arXiv Detail & Related papers (2025-12-03T14:45:57Z)
- TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution [22.720901808326122]
We propose Trajectory Guided Dataset Distillation (TGDD), which reformulates distribution matching as a dynamic alignment process. At each training stage, TGDD captures evolving semantics by aligning the feature distributions of the synthetic and original datasets. Experiments on ten datasets demonstrate that TGDD achieves state-of-the-art performance, notably a 5.0% accuracy gain on high-resolution benchmarks.
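TGDD's exact loss is not given here; a generic stage-wise moment-matching sketch conveys the idea of aligning real and synthetic feature statistics at each snapshot of a training trajectory (all names are assumptions):

```python
import torch

def stagewise_alignment_loss(trajectory, real_imgs, syn_imgs):
    """Align real/synthetic feature statistics at each stage of a training
    trajectory (a list of feature extractors saved at different epochs).
    A generic moment-matching sketch, not TGDD's exact objective."""
    loss = 0.0
    for feature_net in trajectory:            # one snapshot per training stage
        fr = feature_net(real_imgs)           # (n_r, d) real features
        fs = feature_net(syn_imgs)            # (n_s, d) synthetic features
        loss = loss + (fr.mean(0) - fs.mean(0)).pow(2).sum()  # first moment
        loss = loss + (fr.std(0) - fs.std(0)).pow(2).sum()    # second moment
    return loss / len(trajectory)
```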
arXiv Detail & Related papers (2025-12-02T07:00:07Z)
- Rethinking Long-tailed Dataset Distillation: A Uni-Level Framework with Unbiased Recovery and Relabeling [105.8570596633629]
We rethink long-tailed dataset distillation by revisiting the limitations of trajectory-based methods. We adopt the statistical alignment perspective to jointly model bias and restore fair supervision. Our approach improves top-1 accuracy by 15.6% on CIFAR-100-LT and 11.8% on Tiny-ImageNet-LT.
arXiv Detail & Related papers (2025-11-24T07:57:01Z)
- Rectifying Soft-Label Entangled Bias in Long-Tailed Dataset Distillation [39.47633542394261]
We emphasize the critical role of soft labels in long-tailed dataset distillation. We derive an imbalance-aware generalization bound for models trained on distilled datasets. We then identify two primary sources of soft-label bias, originating from the distillation model and the distilled images, and propose ADSA, an Adaptive Soft-label Alignment module that calibrates these entangled biases.
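ADSA's calibration is not detailed in the summary. As a hedged illustration of debiasing teacher soft labels on long-tailed data, standard logit adjustment subtracts log class priors before the softmax; ADSA itself is adaptive and more involved:

```python
import torch

def prior_adjusted_soft_labels(teacher_logits, class_counts, tau=1.0):
    """Debias teacher soft labels by subtracting log class priors
    (plain logit adjustment, shown only as a stand-in for ADSA).
    teacher_logits: (batch, num_classes); class_counts: (num_classes,)."""
    log_prior = torch.log(class_counts.float() / class_counts.sum())
    adjusted = teacher_logits - tau * log_prior   # broadcast over the batch
    return torch.softmax(adjusted, dim=-1)
```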
arXiv Detail & Related papers (2025-11-22T04:37:27Z)
- Hyperbolic Dataset Distillation [44.63243875072762]
We propose a novel hyperbolic dataset distillation method. We find that pruning in hyperbolic space requires only 20% of the distilled core set to retain model performance. This is the first work to incorporate hyperbolic space into the dataset distillation process.
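For reference, the geodesic distance on the Poincaré ball, the standard model of hyperbolic space that such a method would match or prune with; this is background geometry, not the paper's full objective:

```python
import torch

def poincare_distance(u, v, eps=1e-6):
    """Geodesic distance on the Poincare ball (curvature -1).
    u, v: (..., d) points with norm < 1."""
    sq = (u - v).pow(2).sum(-1)
    nu = u.pow(2).sum(-1).clamp(max=1 - eps)
    nv = v.pow(2).sum(-1).clamp(max=1 - eps)
    x = 1 + 2 * sq / ((1 - nu) * (1 - nv))
    return torch.acosh(x.clamp(min=1 + eps))   # distances blow up near the boundary
```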
arXiv Detail & Related papers (2025-05-30T14:14:00Z)
- Dataset Distillation as Pushforward Optimal Quantization [2.5892916589735457]
We propose a synthetic training set that achieves performance similar to training on real data, with orders-of-magnitude lower computational requirements. In particular, we link existing disentangled dataset distillation methods to the classical optimal quantization and Wasserstein barycenter problems. We achieve better performance and inter-model generalization on the ImageNet-1K dataset with trivial additional computation, and SOTA performance in higher image-per-class settings.
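The classical optimal quantization problem the paper links to is solved (locally) by Lloyd's algorithm: the k centroids form the discrete measure closest in Wasserstein-2 to the data. A minimal sketch of that classical baseline:

```python
import torch

def lloyd_quantization(data, k=10, iters=50):
    """Lloyd's algorithm: the classical optimal quantizer minimizing the
    Wasserstein-2 distance between the data and a k-point discrete measure.
    data: (n, d). Shown to illustrate the connection, not the paper's method."""
    centers = data[torch.randperm(len(data))[:k]].clone()
    for _ in range(iters):
        assign = torch.cdist(data, centers).argmin(dim=1)  # nearest center
        for j in range(k):
            mask = assign == j
            if mask.any():
                centers[j] = data[mask].mean(dim=0)        # centroid update
    return centers
```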
arXiv Detail & Related papers (2025-01-13T20:41:52Z) - Class-Balancing Diffusion Models [57.38599989220613]
Class-Balancing Diffusion Models (CBDM) are trained with a distribution adjustment regularizer to counter class imbalance.
We benchmark generation results on the CIFAR100/CIFAR100LT datasets and show strong performance on the downstream recognition task.
arXiv Detail & Related papers (2023-04-30T20:00:14Z) - Improving GANs for Long-Tailed Data through Group Spectral
Regularization [51.58250647277375]
We propose a novel group Spectral Regularizer (gSR) that prevents spectral explosion, alleviating mode collapse.
We find that gSR effectively combines with existing augmentation and regularization techniques, leading to state-of-the-art image generation performance on long-tailed data.
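gSR's grouping scheme is not reproduced here; the core ingredient, estimating and penalizing a parameter matrix's top singular value via power iteration, can be sketched as follows (a generic stand-in for the grouped regularizer):

```python
import torch
import torch.nn.functional as F

def spectral_penalty(weight, n_iter=5):
    # Approximate the largest singular value of `weight` by power iteration
    # and return it as a regularization term added to the GAN loss.
    w = weight.reshape(weight.shape[0], -1)
    v = torch.randn(w.shape[1], device=w.device)
    for _ in range(n_iter):
        u = F.normalize(w @ v, dim=0)
        v = F.normalize(w.t() @ u, dim=0)
    return u.detach() @ w @ v.detach()   # approx. sigma_max; gradients flow through w
```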
arXiv Detail & Related papers (2022-08-21T17:51:05Z)
- Scale-Equivalent Distillation for Semi-Supervised Object Detection [57.59525453301374]
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, where a teacher model generates hard pseudo-labels on unlabeled data as supervisory signals.
We analyze the challenges these methods face through empirical experiments.
We introduce a novel approach, Scale-Equivalent Distillation (SED), which is a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
arXiv Detail & Related papers (2022-03-23T07:33:37Z)
- Fine-grained Data Distribution Alignment for Post-Training Quantization [100.82928284439271]
We propose a fine-grained data distribution alignment (FDDA) method to boost the performance of post-training quantization.
Our method shows state-of-the-art performance on ImageNet, especially when the first and last layers are quantized to low bit-width.
arXiv Detail & Related papers (2021-09-09T11:45:52Z)