Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation
- URL: http://arxiv.org/abs/2512.00308v1
- Date: Sat, 29 Nov 2025 04:04:05 GMT
- Title: Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation
- Authors: Xiao Cui, Yulei Qin, Wengang Zhou, Hongsheng Li, Houqiang Li
- Abstract summary: We reformulate dataset distillation as an Optimal Transport (OT) distance minimization problem. OT offers a geometrically faithful framework for distribution matching. Our method consistently outperforms state-of-the-art approaches in an efficient manner.
- Score: 109.13471554184554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dataset distillation seeks to synthesize a compact distilled dataset, enabling models trained on it to achieve performance comparable to models trained on the full dataset. Recent methods for large-scale datasets focus on matching global distributional statistics (e.g., mean and variance), but overlook critical instance-level characteristics and intra-class variations, leading to suboptimal generalization. We address this limitation by reformulating dataset distillation as an Optimal Transport (OT) distance minimization problem, enabling fine-grained alignment at both global and instance levels throughout the pipeline. OT offers a geometrically faithful framework for distribution matching. It effectively preserves local modes, intra-class patterns, and fine-grained variations that characterize the geometry of complex, high-dimensional distributions. Our method comprises three components tailored for preserving distributional geometry: (1) OT-guided diffusion sampling, which aligns latent distributions of real and distilled images; (2) label-image-aligned soft relabeling, which adapts label distributions based on the complexity of distilled image distributions; and (3) OT-based logit matching, which aligns the output of student models with soft-label distributions. Extensive experiments across diverse architectures and large-scale datasets demonstrate that our method consistently outperforms state-of-the-art approaches in an efficient manner, achieving at least 4% accuracy improvement under IPC=10 settings for each architecture on ImageNet-1K.
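To make the core objective concrete, below is a minimal sketch of an entropy-regularized OT (Sinkhorn) distance between real and distilled latent features. The function name, regularization strength `epsilon`, and iteration count are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch: entropy-regularized OT (Sinkhorn) between two sets of
# latents. Illustrative only; epsilon and n_iters are assumed defaults.
import torch

def sinkhorn_ot_distance(x_real, x_syn, epsilon=0.05, n_iters=100):
    """Approximate OT distance between (n, d) real and (m, d) distilled latents."""
    n, m = x_real.size(0), x_syn.size(0)
    # Pairwise squared-Euclidean transport cost.
    cost = torch.cdist(x_real, x_syn, p=2) ** 2

    # Uniform marginals: every sample carries equal mass.
    mu = torch.full((n,), 1.0 / n, device=x_real.device)
    nu = torch.full((m,), 1.0 / m, device=x_syn.device)

    # Sinkhorn iterations in log space for numerical stability.
    log_k = -cost / epsilon
    log_u = torch.zeros(n, device=x_real.device)
    log_v = torch.zeros(m, device=x_syn.device)
    for _ in range(n_iters):
        log_u = torch.log(mu) - torch.logsumexp(log_k + log_v[None, :], dim=1)
        log_v = torch.log(nu) - torch.logsumexp(log_k + log_u[:, None], dim=0)

    # Recover the transport plan and the resulting OT cost.
    plan = torch.exp(log_u[:, None] + log_k + log_v[None, :])
    return torch.sum(plan * cost)
```

In an OT-guided sampling loop, the gradient of this differentiable loss with respect to the distilled latents could steer diffusion updates toward the real distribution; the POT library's `ot.sinkhorn2` offers a production-grade alternative to this hand-rolled version.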
Related papers
- GeoDM: Geometry-aware Distribution Matching for Dataset Distillation [5.993128231927707]
We propose a geometry-aware distribution-matching framework, called GeoDM. To adapt to the underlying data geometry, we introduce learnable curvature and weight parameters for three kinds of geometries. Our theoretical analysis shows that geometry-aware distribution matching in a product space yields a smaller generalization error bound than its Euclidean counterpart.
arXiv Detail & Related papers (2025-12-09T07:31:57Z)
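As a rough illustration of learnable curvature and geometry weights, the sketch below combines Euclidean, spherical, and Poincare-ball distances between batch means into one product-space objective. Every name and formula choice here is an assumption in the spirit of the abstract, not GeoDM's actual code.

```python
# Speculative sketch of a product-geometry distance with learnable
# curvatures and mixture weights. Not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProductGeometryDistance(nn.Module):
    def __init__(self):
        super().__init__()
        self.raw_curv = nn.Parameter(torch.zeros(2))     # spherical, hyperbolic
        self.raw_weights = nn.Parameter(torch.zeros(3))  # softmax -> simplex

    def forward(self, x, y):
        # x, y: (d,) mean feature vectors of real and synthetic batches.
        w = torch.softmax(self.raw_weights, dim=0)
        c_sph, c_hyp = F.softplus(self.raw_curv) + 1e-4  # keep curvatures > 0

        d_euc = (x - y).norm()

        # Spherical geodesic between directions, scaled by curvature radius.
        cos = F.cosine_similarity(x, y, dim=0).clamp(-1 + 1e-6, 1 - 1e-6)
        d_sph = torch.acos(cos) / c_sph.sqrt()

        # Poincare-ball distance after mapping vectors into the ball.
        def to_ball(v):
            n = v.norm().clamp_min(1e-9)
            return v * torch.tanh(c_hyp.sqrt() * n) / (c_hyp.sqrt() * n)
        u, v = to_ball(x), to_ball(y)
        den = (1 - c_hyp * u.norm() ** 2) * (1 - c_hyp * v.norm() ** 2)
        arg = 1 + 2 * c_hyp * (u - v).norm() ** 2 / den.clamp_min(1e-9)
        d_hyp = torch.acosh(arg) / c_hyp.sqrt()

        return w[0] * d_euc + w[1] * d_sph + w[2] * d_hyp
```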
- Dataset Distillation with Probabilistic Latent Features [9.318549327568695]
A compact set of synthetic data can effectively replace the original dataset in downstream classification tasks. We propose a novel approach that models the joint distribution of latent features. Our method achieves state-of-the-art cross-architecture performance across a range of backbone architectures.
arXiv Detail & Related papers (2025-05-10T13:53:49Z)
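A hedged sketch of the latent-feature idea: fit a multivariate Gaussian to encoder features of one class and sample synthetic latents from it. The paper presumably uses a richer joint model; this is only an illustrative baseline.

```python
# Illustrative baseline: Gaussian model over latent features per class.
import torch

def fit_and_sample_latents(latents, num_samples):
    """latents: (n, d) encoder features of real images for one class."""
    mean = latents.mean(dim=0)
    centered = latents - mean
    # Regularized empirical covariance so the Cholesky factor exists.
    cov = centered.T @ centered / (latents.size(0) - 1)
    cov = cov + 1e-4 * torch.eye(latents.size(1))
    dist = torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)
    # Synthetic latents, to be decoded into distilled samples downstream.
    return dist.sample((num_samples,))
```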
- Dataset Distillation of 3D Point Clouds via Distribution Matching [10.294595110092407]
We propose a distribution matching-based distillation framework for 3D point clouds. To address the semantic misalignment caused by unordered indexing of points, we introduce a Semantically Aligned Distribution Matching loss. To address rotation variation, we jointly learn the optimal rotation angles while updating the synthetic dataset to better align with the original feature distribution.
arXiv Detail & Related papers (2025-03-28T05:15:22Z)
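An illustrative sketch (not the paper's code) of jointly learning a rotation for synthetic point clouds while matching feature distributions; rotation is restricted to the z-axis and the aligned loss is replaced by simple mean matching for brevity.

```python
# Sketch: learnable z-axis rotation + mean-feature matching for point clouds.
import torch

def rotate_z(points, theta):
    """points: (n, 3); theta: scalar tensor (e.g., an nn.Parameter)."""
    c, s = torch.cos(theta), torch.sin(theta)
    zero, one = torch.zeros_like(c), torch.ones_like(c)
    rot = torch.stack([
        torch.stack([c, -s, zero]),
        torch.stack([s, c, zero]),
        torch.stack([zero, zero, one]),
    ])
    return points @ rot.T

def matching_loss(real_feats, syn_points, theta, encoder):
    # Rotate synthetic points with the learnable angle, then compare mean
    # embeddings (a stand-in for the paper's semantically aligned loss).
    syn_feats = encoder(rotate_z(syn_points, theta))
    return (real_feats.mean(0) - syn_feats.mean(0)).norm() ** 2
```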
- Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization [34.53986517177061]
We propose a novel framework, complementary to existing diffusion-based distillation methods, that leverages diffusion models for selection rather than generation. Our method starts by predicting the noise generated by the diffusion model based on input images and text prompts, then calculates the corresponding loss for each pair. This streamlined framework enables a single-step distillation process, and extensive experiments demonstrate that our approach outperforms state-of-the-art methods across various metrics.
arXiv Detail & Related papers (2024-12-13T08:34:46Z)
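The selection mechanism described above lends itself to a short sketch: inject noise, let the diffusion model predict it conditioned on the text prompt, and use the per-image prediction error as a ranking score. `unet`, `text_emb`, and the scheduler coefficient are assumed stand-ins rather than a specific library's API.

```python
# Sketch: diffusion noise-prediction loss as a per-image selection score.
import torch

@torch.no_grad()
def selection_scores(unet, images, text_emb, timestep, alpha_bar):
    """Return one denoising-loss score per image."""
    a = torch.as_tensor(alpha_bar)
    noise = torch.randn_like(images)
    # Standard DDPM forward process: x_t = sqrt(a)*x0 + sqrt(1-a)*eps.
    noisy = a.sqrt() * images + (1 - a).sqrt() * noise
    pred = unet(noisy, timestep, text_emb)
    # Per-sample MSE between injected and predicted noise.
    return ((pred - noise) ** 2).flatten(1).mean(dim=1)

# Usage sketch: keep the k images whose scores best represent the class,
# e.g. the lowest-loss ones:
#   scores = selection_scores(unet, images, text_emb, t, alpha_bar_t)
#   keep = images[scores.argsort()[:k]]
```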
- DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation [23.02066055996762]
We propose a new Diversity-driven EarlyLate Training (DELT) scheme to enhance the diversity of images in batch-to-global matching. Our approach is conceptually simple yet effective: it partitions predefined IPC samples into smaller subtasks and employs local optimizations. Our approach outperforms the previous state of the art by 2~5% on average across different datasets and IPCs (images per class).
arXiv Detail & Related papers (2024-11-29T18:59:46Z)
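A minimal sketch of one plausible reading of the early-late partition: each subset of a class's IPC budget begins optimization at a staggered iteration so the synthetic images diversify. The exact schedule below is a guess, not DELT's published scheme.

```python
# Illustrative guess at an early-late partition schedule over the IPC budget.
def early_late_schedule(ipc, num_subsets, total_iters):
    """Assign each subset of synthetic images its own start iteration."""
    per_subset = ipc // num_subsets
    tasks = []
    for i in range(num_subsets):
        tasks.append({
            "sample_indices": list(range(i * per_subset, (i + 1) * per_subset)),
            "start_iter": i * total_iters // num_subsets,  # later subsets start later
            "end_iter": total_iters,
        })
    return tasks
```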
- Distribution-Aware Data Expansion with Diffusion Models [55.979857976023695]
We propose DistDiff, a training-free data expansion framework based on a distribution-aware diffusion model.
DistDiff consistently enhances accuracy across a diverse range of datasets compared to models trained solely on original data.
arXiv Detail & Related papers (2024-03-11T14:07:53Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods while using far fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
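The distribution-matching objective this line of work builds on can be stated in a few lines: match class-wise mean features of real and synthetic batches under a feature extractor. The sketch below simplifies details such as encoder sampling and augmentation.

```python
# Minimal sketch of the class-wise distribution-matching loss used in
# dataset condensation. Simplified relative to the published methods.
import torch

def dm_loss(encoder, real_by_class, syn_by_class):
    """Both args: dict class_id -> (n_c, C, H, W) image batches."""
    loss = 0.0
    for c in real_by_class:
        real_mu = encoder(real_by_class[c]).mean(dim=0)
        syn_mu = encoder(syn_by_class[c]).mean(dim=0)
        # Penalize the gap between class-wise mean embeddings.
        loss = loss + ((real_mu - syn_mu) ** 2).sum()
    return loss
```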
- MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training [58.07391711548269]
We present the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training.
arXiv Detail & Related papers (2023-03-23T17:59:02Z)
- Neighborhood Gradient Clustering: An Efficient Decentralized Learning Method for Non-IID Data Distributions [5.340730281227837]
The current state-of-the-art decentralized algorithms mostly assume the data distributions to be Independent and Identically Distributed (IID).
We propose Neighborhood Gradient Clustering (NGC), a novel decentralized learning algorithm that modifies the local gradients of each agent using self- and cross-gradient information.
arXiv Detail & Related papers (2022-09-28T19:28:54Z)
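A hedged sketch of the self/cross-gradient idea: each agent mixes its own gradient with an average of gradients received from its neighbors before stepping. The mixing weight `alpha` and the update form are illustrative assumptions, not NGC's exact rule.

```python
# Illustrative decentralized update mixing self- and cross-gradients.
import torch

def ngc_update(params, self_grad, neighbor_grads, lr=0.01, alpha=0.5):
    """params / self_grad: lists of tensors; neighbor_grads: list of such lists."""
    with torch.no_grad():
        for i, p in enumerate(params):
            # Average the i-th gradient tensor across all neighbors.
            cross = torch.stack([g[i] for g in neighbor_grads]).mean(dim=0)
            mixed = alpha * self_grad[i] + (1 - alpha) * cross
            p -= lr * mixed
```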
- CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
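CAFE's layer-wise alignment can be illustrated with a short sketch that matches feature statistics at every stage of a network rather than only at the output; `stages` is an assumed list of sub-modules, not CAFE's actual architecture.

```python
# Sketch: multi-scale feature alignment between real and synthetic batches.
import torch

def multiscale_alignment_loss(stages, real_x, syn_x):
    """stages: ordered list of network sub-modules (assumed structure)."""
    loss = 0.0
    r, s = real_x, syn_x
    for stage in stages:
        r, s = stage(r), stage(s)
        # Align mean feature maps at this scale before moving deeper.
        loss = loss + ((r.mean(dim=0) - s.mean(dim=0)) ** 2).mean()
    return loss
```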