From Fewer Samples to Fewer Bits: Reframing Dataset Distillation as Joint Optimization of Precision and Compactness
- URL: http://arxiv.org/abs/2603.02411v1
- Date: Mon, 02 Mar 2026 21:46:10 GMT
- Title: From Fewer Samples to Fewer Bits: Reframing Dataset Distillation as Joint Optimization of Precision and Compactness
- Authors: My H. Dinh, Aditya Sant, Akshay Malhotra, Keya Patani, Shahab Hamidi-Rad,
- Abstract summary: We propose a unified framework that jointly optimizes dataset compactness and precision under fixed bit budgets. QuADD integrates a differentiable quantization module within the distillation loop, enabling end-to-end co-optimization of synthetic samples and quantization parameters. Our framework supports both uniform and adaptive non-uniform quantization, where the latter learns quantization levels from the data to better represent information-dense regions.
- Score: 6.073185086959359
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Dataset Distillation (DD) compresses large datasets into compact synthetic ones that maintain training performance. However, current methods mainly target sample reduction, with limited consideration of data precision and its impact on efficiency. We propose Quantization-aware Dataset Distillation (QuADD), a unified framework that jointly optimizes dataset compactness and precision under fixed bit budgets. QuADD integrates a differentiable quantization module within the distillation loop, enabling end-to-end co-optimization of synthetic samples and quantization parameters. Guided by a rate-distortion perspective, we empirically analyze how bit allocation between sample count and precision influences learning performance. Our framework supports both uniform and adaptive non-uniform quantization, where the latter learns quantization levels from the data to better represent information-dense regions. Experiments on image classification and 3GPP beam management tasks show that QuADD surpasses existing DD and post-quantized baselines in accuracy per bit, establishing a new standard for information-efficient dataset distillation.
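Below is a minimal sketch, assuming a PyTorch-style setup, of the mechanism the abstract describes: synthetic images stored at low precision, a quantizer made differentiable with a straight-through estimator, and (for the non-uniform variant) quantization levels learned jointly with the synthetic pixels. The `LearnedQuantizer` class, the 16-level setting, and the cross-entropy placeholder objective are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class LearnedQuantizer(nn.Module):
    """Non-uniform quantizer: snaps each pixel to the nearest learned level."""

    def __init__(self, num_levels: int = 16):
        super().__init__()
        # Levels start uniform in [0, 1] and are trained with the synthetic data.
        self.levels = nn.Parameter(torch.linspace(0.0, 1.0, num_levels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Distance from every pixel to every level; pick the nearest level.
        dist = (x.unsqueeze(-1) - self.levels).abs()      # (..., num_levels)
        hard = self.levels[dist.argmin(dim=-1)]           # quantized values
        # Straight-through estimator: the forward pass outputs the quantized
        # values; gradients w.r.t. the pixels pass through as identity, while
        # the selected levels still receive gradients via the indexing above.
        return hard + (x - x.detach())


# Toy synthetic set: one image per class, plus a learned quantizer (4 bits/pixel).
num_classes, ipc = 10, 1
syn_images = nn.Parameter(torch.rand(num_classes * ipc, 3, 32, 32))
syn_labels = torch.arange(num_classes).repeat_interleave(ipc)
quantizer = LearnedQuantizer(num_levels=16)

# Placeholder "distillation" objective: a fixed random linear probe.
probe = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))
opt = torch.optim.Adam([syn_images, *quantizer.parameters()], lr=1e-2)

for step in range(100):
    opt.zero_grad()
    # The quantizer sits inside the optimization loop, so the pixels and the
    # quantization levels are co-optimized end to end under one objective.
    q_images = quantizer(syn_images.clamp(0.0, 1.0))
    loss = nn.functional.cross_entropy(probe(q_images), syn_labels)
    loss.backward()
    opt.step()
```

In a full pipeline the placeholder objective would be replaced by an existing distillation loss (gradient, trajectory, or distribution matching) computed against the real dataset, and the number of levels (16 here, i.e. 4 bits per pixel) would be chosen to respect the overall bit budget.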
Related papers
- Utility Boundary of Dataset Distillation: Scaling and Configuration-Coverage Laws [6.172966466468818]
It is unclear under what conditions distilled data can retain the effectiveness of full datasets when the training configuration changes. We propose a unified theoretical framework, termed configuration--configuration-error analysis, which reformulates major DD approaches under a common generalization-error perspective. Our analysis reveals that various matching methods are interchangeable surrogates that reduce the same generalization error, clarifying why they can all achieve dataset distillation.
arXiv Detail & Related papers (2025-12-05T15:37:38Z) - Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation [109.13471554184554]
We reformulate dataset distillation as an Optimal Transport (OT) distance minimization problem. OT offers a geometrically faithful framework for distribution matching. Our method consistently outperforms state-of-the-art approaches in an efficient manner.
arXiv Detail & Related papers (2025-11-29T04:04:05Z) - Rectified Decoupled Dataset Distillation: A Closer Look for Fair and Comprehensive Evaluation [36.444254126901065]
We propose Rectified Decoupled Dataset Distillation (RD³) to generate compact synthetic datasets. RD³ provides a foundation for fair and reproducible comparisons in future dataset distillation research.
arXiv Detail & Related papers (2025-09-24T03:47:04Z) - Dataset Distillation as Data Compression: A Rate-Utility Perspective [31.050187201929557]
We propose a joint rate-utility optimization method for dataset distillation. We parameterize synthetic samples as optimizable latent codes decoded by extremely lightweight networks. We estimate the Shannon entropy of quantized latents as the rate measure and plug in any existing distillation loss as the utility measure, trading them off via a Lagrange multiplier (a generic sketch of this rate-utility trade-off appears after this list).
arXiv Detail & Related papers (2025-07-23T05:40:52Z) - DD-Ranking: Rethinking the Evaluation of Dataset Distillation [314.9621366437238]
We propose DD-Ranking, a unified evaluation framework, along with new general evaluation metrics to uncover the true performance improvements achieved by different methods. By refocusing on the actual information enhancement of distilled datasets, DD-Ranking provides a more comprehensive and fair evaluation standard for future research advancements.
arXiv Detail & Related papers (2025-05-19T16:19:50Z) - Dataset Distillation as Pushforward Optimal Quantization [2.5892916589735457]
We propose a synthetic training set that achieves similar performance to training on real data, with orders of magnitude lower computational requirements. In particular, we link existing disentangled dataset distillation methods to the classical optimal quantization and Wasserstein barycenter problems. We achieve better performance and inter-model generalization on the ImageNet-1K dataset with trivial additional computation, and SOTA performance in higher image-per-class settings.
arXiv Detail & Related papers (2025-01-13T20:41:52Z) - Generative Dataset Distillation Based on Self-knowledge Distillation [49.20086587208214]
We present a novel generative dataset distillation method that can improve the accuracy of aligning prediction logits. Our approach integrates self-knowledge distillation to achieve more precise distribution matching between the synthetic and original data. Our method outperforms existing state-of-the-art methods, resulting in superior distillation performance.
arXiv Detail & Related papers (2025-01-08T00:43:31Z) - Adaptive Dataset Quantization [2.0105434963031463]
We introduce a versatile framework for dataset compression, namely Adaptive Dataset Quantization (ADQ). We propose a novel adaptive sampling strategy based on the evaluation of generated bins' representativeness, diversity, and importance scores. Our method not only exhibits superior generalization capability across different architectures, but also attains state-of-the-art results, surpassing DQ by an average of 3% on various datasets.
arXiv Detail & Related papers (2024-12-22T07:08:29Z) - Not All Samples Should Be Utilized Equally: Towards Understanding and Improving Dataset Distillation [57.6797306341115]
We take an initial step towards understanding various matching-based DD methods from the perspective of sample difficulty. We then extend the neural scaling laws of data pruning to DD to theoretically explain these matching-based methods. We introduce the Sample Difficulty Correction (SDC) approach, designed to predominantly generate easier samples to achieve higher dataset quality.
arXiv Detail & Related papers (2024-08-22T15:20:32Z) - Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
Development of deep learning models is enabled by the availability of large-scale datasets.
Dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset.
We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z) - CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
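As referenced in the rate-utility entry above ("Dataset Distillation as Data Compression"), here is a minimal, self-contained sketch of that trade-off: synthetic data held as latent codes, a crude soft-histogram entropy estimate serving as the rate term, and a placeholder utility term, combined via a Lagrange multiplier. The `soft_entropy_bits` helper, the tiny decoder, and the linear probe are illustrative assumptions, not components of that paper or of QuADD.

```python
import torch
import torch.nn as nn


def soft_entropy_bits(latents: torch.Tensor, num_bins: int = 64) -> torch.Tensor:
    """Crude differentiable estimate of the entropy (in bits) of binned latents."""
    centers = torch.linspace(latents.min().item(), latents.max().item(),
                             num_bins, device=latents.device)
    # Softly assign every latent value to histogram bins, then average the
    # assignments to obtain a differentiable probability per bin.
    weights = torch.softmax(-(latents.reshape(-1, 1) - centers).pow(2) / 0.01, dim=1)
    probs = weights.mean(dim=0).clamp_min(1e-9)
    return -(probs * probs.log2()).sum()


# Synthetic data stored as latent codes, decoded by a very small network.
latent_codes = nn.Parameter(torch.randn(10, 64))
decoder = nn.Sequential(nn.Linear(64, 3 * 32 * 32), nn.Sigmoid())
labels = torch.arange(10)
probe = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder utility model

lam = 0.05  # Lagrange multiplier trading utility (task loss) against rate (bits)
opt = torch.optim.Adam([latent_codes, *decoder.parameters()], lr=1e-2)

for step in range(100):
    opt.zero_grad()
    images = decoder(latent_codes).reshape(-1, 3, 32, 32)
    utility = nn.functional.cross_entropy(probe(images), labels)  # any distillation loss fits here
    rate = soft_entropy_bits(latent_codes)                        # bits spent on the codes
    (utility + lam * rate).backward()
    opt.step()
```

Sweeping the multiplier `lam` traces out a rate-utility curve; in QuADD's setting the analogous knob is the split of a fixed bit budget between the number of synthetic samples and the precision of each one.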