TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution
- URL: http://arxiv.org/abs/2512.02469v1
- Date: Tue, 02 Dec 2025 07:00:07 GMT
- Title: TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution
- Authors: Fengli Ran, Xiao Pu, Bo Liu, Xiuli Bi, Bin Xiao
- Abstract summary: We propose Trajectory Guided Dataset Distillation (TGDD), which reformulates distribution matching as a dynamic alignment process. At each training stage, TGDD captures evolving semantics by aligning the feature distributions of the synthetic and original datasets. Experiments on ten datasets demonstrate that TGDD achieves state-of-the-art performance, notably a 5.0% accuracy gain on high-resolution benchmarks.
- Score: 22.720901808326122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dataset distillation compresses large datasets into compact synthetic ones to reduce storage and computational costs. Among various approaches, distribution matching (DM)-based methods have attracted attention for their high efficiency. However, they often overlook the evolution of feature representations during training, which limits the expressiveness of synthetic data and weakens downstream performance. To address this issue, we propose Trajectory Guided Dataset Distillation (TGDD), which reformulates distribution matching as a dynamic alignment process along the model's training trajectory. At each training stage, TGDD captures evolving semantics by aligning the feature distributions of the synthetic and original datasets. Meanwhile, it introduces a distribution-constraint regularization to reduce class overlap. This design helps synthetic data preserve both semantic diversity and representativeness, improving performance in downstream tasks. Without additional optimization overhead, TGDD achieves a favorable balance between performance and efficiency. Experiments on ten datasets demonstrate that TGDD achieves state-of-the-art performance, notably a 5.0% accuracy gain on high-resolution benchmarks.
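No code accompanies the abstract; the PyTorch sketch below is one reading of the described mechanism: per-class mean-feature alignment against the real data at several checkpoints saved along a training trajectory, plus a margin term discouraging class overlap. All names (`tgdd_step`, `margin`, the hinge form) are assumptions, not the authors' implementation.

```python
# Illustrative sketch of trajectory-guided distribution matching (assumed
# reading of the abstract, not the authors' released code).
import torch
import torch.nn.functional as F

def alignment_loss(encoder, real_x, syn_x):
    """Mean-feature alignment between real and synthetic batches of one class."""
    with torch.no_grad():
        mu_real = encoder(real_x).mean(dim=0)    # real statistics are fixed targets
    mu_syn = encoder(syn_x).mean(dim=0)
    return F.mse_loss(mu_syn, mu_real)

def separation_loss(encoder, syn_by_class, margin=1.0):
    """Hinge penalty keeping synthetic class means at least `margin` apart."""
    means = torch.stack([encoder(x).mean(dim=0) for x in syn_by_class])
    dists = torch.cdist(means, means)                        # pairwise distances
    dists = dists + torch.eye(len(means), device=dists.device) * margin
    return F.relu(margin - dists).mean()                     # diagonal contributes 0

def tgdd_step(checkpoints, real_by_class, syn_by_class, opt, lam=0.1):
    """One update of the synthetic images, summed over frozen trajectory snapshots."""
    opt.zero_grad()
    loss = 0.0
    for encoder in checkpoints:                              # snapshots along training
        for real_x, syn_x in zip(real_by_class, syn_by_class):
            loss = loss + alignment_loss(encoder, real_x, syn_x)
        loss = loss + lam * separation_loss(encoder, syn_by_class)
    loss.backward()
    opt.step()                                               # opt holds only syn images
    return loss.item()
```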
Related papers
- From Fewer Samples to Fewer Bits: Reframing Dataset Distillation as Joint Optimization of Precision and Compactness [6.073185086959359]
We propose a unified framework that jointly optimizes dataset compactness and precision under fixed bit budgets. QuADD integrates a differentiable quantization module within the distillation loop, enabling end-to-end co-optimization of synthetic samples and quantization parameters. Our framework supports both uniform and adaptive non-uniform quantization, where the latter learns quantization levels from data to better represent information-dense regions.
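The abstract mentions a differentiable quantization module co-optimized with the synthetic samples. A standard way to make quantization differentiable is a straight-through estimator; the sketch below uses one with a learnable scale, as an assumption about what such a module could look like:

```python
# Hypothetical straight-through quantizer (an assumption about how a
# differentiable quantization module in a distillation loop could look).
import torch
import torch.nn as nn

class STEQuantize(nn.Module):
    def __init__(self, n_bits=4):
        super().__init__()
        self.levels = 2 ** n_bits - 1
        self.scale = nn.Parameter(torch.tensor(1.0))       # learnable range

    def forward(self, x):
        x_n = torch.clamp(x / self.scale, 0.0, 1.0)        # normalize to [0, 1]
        q = torch.round(x_n * self.levels) / self.levels   # snap to the bit grid
        q = x_n + (q - x_n).detach()                       # identity gradient (STE)
        return q * self.scale

# Synthetic samples and quantizer parameters can then be optimized jointly:
syn = nn.Parameter(torch.rand(10, 3, 32, 32))
quant = STEQuantize(n_bits=4)
opt = torch.optim.Adam([syn] + list(quant.parameters()), lr=1e-2)
# A distillation loss evaluated on quant(syn) sends gradients to both.
```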
arXiv Detail & Related papers (2026-03-02T21:46:10Z) - Towards Principled Dataset Distillation: A Spectral Distribution Perspective [29.986767000752753]
We propose Class-Aware Spectral Distribution Matching (CSDM), which reformulates distribution alignment via the spectrum of a well-behaved kernel function. On CIFAR-10-LT, with 10 images per class, CSDM achieves a 14.0% improvement over state-of-the-art DD methods, with only a 5.7% performance drop when the number of images in tail classes decreases.
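The abstract gives no formula; as a rough, assumed illustration of matching a kernel's spectrum, one can compare the leading eigenvalues of RBF Gram matrices built from real and synthetic features:

```python
# Rough illustration of spectral matching (an assumption, not the paper's
# actual objective): compare top eigenvalues of normalized RBF Gram matrices.
import torch

def rbf_gram(f, gamma=1.0):
    """Normalized RBF Gram matrix of a feature batch (n, d)."""
    return torch.exp(-gamma * torch.cdist(f, f) ** 2) / f.size(0)

def spectral_loss(real_f, syn_f, k=8, gamma=1.0):
    """Compare the top-k eigenvalues; requires k <= both batch sizes."""
    ev_r = torch.linalg.eigvalsh(rbf_gram(real_f, gamma))[-k:]  # ascending order
    ev_s = torch.linalg.eigvalsh(rbf_gram(syn_f, gamma))[-k:]
    return ((ev_r - ev_s) ** 2).mean()
```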
arXiv Detail & Related papers (2026-03-02T10:26:49Z) - Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation [109.13471554184554]
We reformulate dataset distillation as an Optimal Transport (OT) distance minimization problem. OT offers a geometrically faithful framework for distribution matching. Our method consistently outperforms state-of-the-art approaches in an efficient manner.
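The abstract does not specify the OT solver; a common choice for differentiable OT is entropy regularization with Sinkhorn iterations, sketched generically below (uniform weights assumed, not the paper's specific formulation):

```python
# Generic entropy-regularized OT (Sinkhorn) between two feature sets.
import torch

def sinkhorn_distance(x, y, eps=0.05, n_iters=100):
    """Entropy-regularized OT cost between feature sets with uniform weights."""
    cost = torch.cdist(x, y) ** 2                   # pairwise squared distances
    a = torch.full((x.size(0),), 1.0 / x.size(0), device=x.device)
    b = torch.full((y.size(0),), 1.0 / y.size(0), device=y.device)
    K = torch.exp(-cost / eps)                      # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.t() @ u)                         # alternating scaling updates
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]              # transport plan
    return (plan * cost).sum()
```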
arXiv Detail & Related papers (2025-11-29T04:04:05Z) - Hierarchical Features Matter: A Deep Exploration of Progressive Parameterization Method for Dataset Distillation [44.03611131165989]
We propose a novel generative parameterization method dubbed Hierarchical generative Distillation (H-PD). The proposed H-PD achieves a significant performance improvement under various settings with equivalent time consumption. It even surpasses current generative distillation using diffusion models under extreme compression ratios (IPC=1 and IPC=10).
arXiv Detail & Related papers (2024-06-09T09:15:54Z) - Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
Development of deep learning models is enabled by the availability of large-scale datasets.
Dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset.
We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
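The abstract does not detail how importance is assigned; one hypothetical reading is learning normalized weights over per-layer matching losses, as in the short sketch below (all names are illustrative):

```python
# Hypothetical importance weighting over per-layer matching losses
# (one plausible reading of "importance-aware"; not the paper's exact scheme).
import torch
import torch.nn as nn

class ImportanceWeights(nn.Module):
    def __init__(self, n_layers):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_layers))  # learned importance

    def forward(self, layer_losses):
        w = torch.softmax(self.logits, dim=0)              # normalized weights
        return (w * torch.stack(layer_losses)).sum()
```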
arXiv Detail & Related papers (2024-01-29T03:29:39Z) - M3D: Dataset Condensation by Minimizing Maximum Mean Discrepancy [26.227927019615446]
Training state-of-the-art (SOTA) deep models often requires extensive data, resulting in substantial training and storage costs.
Dataset condensation has been developed to learn a small synthetic set that preserves essential information from the original large-scale dataset.
We present a novel DM-based method named M3D for dataset condensation by Minimizing the Maximum Mean Discrepancy.
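The MMD that M3D minimizes has a standard empirical estimator; the sketch below is the textbook biased squared-MMD with a Gaussian kernel, not the paper's released code:

```python
# Standard (biased) squared MMD estimator with a Gaussian kernel.
import torch

def mmd2(x, y, gamma=1.0):
    def k(a, b):
        return torch.exp(-gamma * torch.cdist(a, b) ** 2)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
```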
arXiv Detail & Related papers (2023-12-26T07:45:32Z) - Dataset Distillation via the Wasserstein Metric [34.06251608504682]
We introduce WMDD (Wasserstein Metric-based Dataset Distillation), a straightforward yet powerful method that employs the Wasserstein metric to enhance distribution matching. Our experiments demonstrate WMDD's effectiveness and adaptability, highlighting its potential for advancing machine learning applications at scale.
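The abstract only states that the Wasserstein metric is used; one inexpensive variant (an assumption here, not necessarily WMDD's formulation) is the closed-form squared 2-Wasserstein distance between Gaussians fitted to the real and synthetic feature batches:

```python
# Closed-form squared 2-Wasserstein distance between fitted Gaussians
# (a cheap proxy; assumed for illustration, not WMDD's exact objective).
import torch

def _sqrtm_psd(m):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    m = (m + m.t()) / 2                              # enforce symmetry numerically
    vals, vecs = torch.linalg.eigh(m)
    return vecs @ torch.diag(vals.clamp(min=0).sqrt()) @ vecs.t()

def gaussian_w2(x, y):
    """W2^2 between N(mu_x, cov_x) and N(mu_y, cov_y) fitted to (n, d) batches."""
    mu_x, mu_y = x.mean(0), y.mean(0)
    cov_x, cov_y = torch.cov(x.t()), torch.cov(y.t())
    s = _sqrtm_psd(cov_y)
    cross = _sqrtm_psd(s @ cov_x @ s)
    return ((mu_x - mu_y) ** 2).sum() + torch.trace(cov_x + cov_y - 2 * cross)
```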
arXiv Detail & Related papers (2023-11-30T13:15:28Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
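For context, the core distribution-matching objective this line of work refines, matching class-mean features under randomly initialized encoders, can be written compactly; this is the generic DM baseline, not the paper's specific improvement:

```python
# Generic DM baseline: match class-mean features under random encoders.
import torch
import torch.nn.functional as F

def dm_loss(make_encoder, real_by_class, syn_by_class, n_models=5):
    loss = 0.0
    for _ in range(n_models):
        enc = make_encoder()                      # fresh randomly initialized net
        for real_x, syn_x in zip(real_by_class, syn_by_class):
            loss = loss + F.mse_loss(enc(syn_x).mean(0),
                                     enc(real_x).mean(0).detach())
    return loss / n_models
```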
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Minimizing the Accumulated Trajectory Error to Improve Dataset
Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that the weights trained on synthetic data are robust against accumulated error perturbations given the regularization toward the flat trajectory.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
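FTD builds on trajectory matching; as background, the core trajectory-matching loss (normalized parameter distance after training a student on synthetic data, as in MTT-style methods) can be sketched as follows. The flat-trajectory regularization itself is not specified in the abstract, so it is only noted in a comment; names here are illustrative.

```python
# Core trajectory-matching loss that FTD builds on: after a few student steps
# on synthetic data, compare parameters against a later expert checkpoint.
# (FTD's contribution, per the abstract, is regularizing expert training
# toward flat trajectories; that part is not shown here.)
import torch

def trajectory_loss(student_params, expert_start, expert_target):
    """Normalized squared parameter distance, lists of tensors per layer."""
    num = sum(((s - t) ** 2).sum() for s, t in zip(student_params, expert_target))
    den = sum(((s0 - t) ** 2).sum() for s0, t in zip(expert_start, expert_target))
    return num / den
```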
arXiv Detail & Related papers (2022-11-20T15:49:11Z) - Improving GANs with A Dynamic Discriminator [106.54552336711997]
We argue that a discriminator with an on-the-fly adjustment on its capacity can better accommodate such a time-varying task.
A comprehensive empirical study confirms that the proposed training strategy, termed DynamicD, improves the synthesis performance without incurring any additional cost or training objectives.
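The abstract does not say how capacity is adjusted; one hypothetical realization, sketched below, masks a fraction of discriminator channels and varies that fraction during training. `AdjustableConv` and its `width` attribute are inventions for illustration.

```python
# Hypothetical on-the-fly capacity adjustment via channel masking
# (an assumption; the actual DynamicD schedule is not in the abstract).
import torch
import torch.nn as nn

class AdjustableConv(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)
        self.width = 1.0                          # active channel fraction, set per step

    def forward(self, x):
        out = self.conv(x)
        keep = max(1, int(out.size(1) * self.width))
        mask = torch.zeros(1, out.size(1), 1, 1, device=out.device)
        mask[:, :keep] = 1.0
        return out * mask                         # inactive channels zeroed out
```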
arXiv Detail & Related papers (2022-09-20T17:57:33Z) - CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
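The abstract says features are aligned "across various scales"; one plausible reading, assumed here, is matching intermediate-layer feature statistics layer by layer between real and synthetic batches:

```python
# Assumed reading of multi-scale alignment: match intermediate feature
# statistics layer by layer (not necessarily CAFE's exact formulation).
import torch
import torch.nn.functional as F

def multiscale_align(layers, real_x, syn_x):
    loss, r, s = 0.0, real_x, syn_x
    for layer in layers:                           # e.g. successive conv blocks
        r, s = layer(r), layer(s)
        loss = loss + F.mse_loss(s.mean(dim=0), r.mean(dim=0).detach())
    return loss
```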
arXiv Detail & Related papers (2022-03-03T05:58:49Z)