Fugu-MT Paper Translation (Abstract): Post Training Quantization for Efficient Dataset Condensation

Paper overview: Post Training Quantization for Efficient Dataset Condensation

arxiv url: http://arxiv.org/abs/2603.13346v1
Date: Sat, 07 Mar 2026 09:47:24 GMT
Status: Translation complete
Last updated in system: 2026-03-17 18:28:57.787722
Title: Post Training Quantization for Efficient Dataset Condensation
Title (reference translation): Post-Training Quantization for Efficient Dataset Condensation
Authors: Linh-Tam Tran, Sung-Ho Bae
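Abstract summary: This paper proposes a novel patch-based post-training quantization approach that achieves localized quantization with minimal loss of information. The method is a plug-and-play framework that can be applied to synthetic images generated by various dataset condensation (DC) methods.
Reference score (independently computed attention score): 7.9166691228067565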
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dataset Condensation (DC) distills knowledge from large datasets into smaller ones, accelerating training and reducing storage requirements. However, despite notable progress, prior methods have largely overlooked the potential of quantization for further reducing storage costs. In this paper, we take the first step to explore post-training quantization in dataset condensation, demonstrating its effectiveness in reducing storage size while maintaining representation quality without requiring expensive training cost. However, we find that at extremely low bit-widths (e.g., 2-bit), conventional quantization leads to substantial degradation in representation quality, negatively impacting the networks trained on these data. To address this, we propose a novel patch-based post-training quantization approach that ensures localized quantization with minimal loss of information. To reduce the overhead of quantization parameters, especially for small patch sizes, we employ quantization-aware clustering to identify similar patches and subsequently aggregate them for efficient quantization. Furthermore, we introduce a refinement module to align the distribution between original images and their dequantized counterparts, compensating for quantization errors. Our method is a plug-and-play framework that can be applied to synthetic images generated by various DC methods. Extensive experiments across diverse benchmarks including CIFAR-10/100, Tiny ImageNet, and ImageNet subsets demonstrate that our method consistently outperforms prior works under the same storage constraints. Notably, our method nearly doubles the test accuracy of existing methods at extreme compression regimes (e.g., 26.0% → 54.1% for DM at IPC=1), while operating directly on 2-bit images without additional distillation.
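Abstract (reference translation): Dataset Condensation (DC) distills knowledge from a large dataset into a smaller one, which speeds up training and lowers storage requirements. Despite notable progress, earlier methods have largely overlooked quantization as a way to cut storage costs further. This paper takes a first step toward post-training quantization in dataset condensation and shows that it reduces storage size while preserving representation quality, without expensive training. At extremely low bit-widths (e.g., 2-bit), however, conventional quantization substantially degrades representation quality and harms the networks trained on the resulting data. To address this, the paper proposes a novel patch-based post-training quantization approach that achieves localized quantization with minimal loss of information. To reduce the overhead of quantization parameters, especially for small patch sizes, quantization-aware clustering identifies similar patches and aggregates them for efficient quantization. In addition, a refinement module aligns the distribution of the original images with that of their dequantized counterparts to compensate for quantization errors. The method is a plug-and-play framework that can be applied to synthetic images generated by various DC methods. Extensive experiments on diverse benchmarks, including CIFAR-10/100, Tiny ImageNet, and ImageNet subsets, show that the method consistently outperforms prior work under the same storage constraints. Notably, it nearly doubles the test accuracy of existing methods in extreme compression regimes (e.g., 26.0% → 54.1% for DM at IPC=1) while operating directly on 2-bit images without additional distillation.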
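The abstract gives no implementation details, so the following is only a minimal sketch of the general idea behind patch-wise quantization, not the authors' method. It assumes plain uniform min-max quantization, a single-channel image stored as nested lists, and two 32-bit parameters (offset and scale) per patch. The names `quantize`, `patch_quantize`, `storage_bits`, and `make_demo_image` are hypothetical. It omits the quantization-aware clustering and the refinement module.

```python
from math import ceil
from typing import List, Tuple

Image = List[List[float]]  # single-channel image with values in [0, 1]


def quantize(values: List[float], bits: int) -> Tuple[List[int], float, float]:
    """Uniform min-max quantization; returns integer codes plus (offset, scale)."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0  # flat patch: avoid dividing by zero
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, lo, scale


def patch_quantize(image: Image, patch: int, bits: int = 2) -> Image:
    """Quantize and dequantize each patch x patch tile with its own (offset, scale)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r0 in range(0, h, patch):
        for c0 in range(0, w, patch):
            coords = [(r, c)
                      for r in range(r0, min(r0 + patch, h))
                      for c in range(c0, min(c0 + patch, w))]
            codes, lo, scale = quantize([image[r][c] for r, c in coords], bits)
            for (r, c), code in zip(coords, codes):
                out[r][c] = lo + code * scale  # dequantized value
    return out


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two images of equal shape."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def storage_bits(h: int, w: int, patch: int, bits: int = 2, param_bits: int = 32) -> int:
    """Pixel codes plus two parameters per patch (param_bits is an assumption)."""
    n_patches = ceil(h / patch) * ceil(w / patch)
    return h * w * bits + n_patches * 2 * param_bits


def make_demo_image(size: int = 8) -> Image:
    """Dark left half and bright right half, each with small local variation."""
    return [[(0.0 if c < size // 2 else 0.9) + 0.1 * ((r * size + c) % 7) / 6
             for c in range(size)] for r in range(size)]


if __name__ == "__main__":
    img = make_demo_image()
    for patch in (8, 4, 2):  # patch=8 covers the whole image, i.e. global quantization
        err = mse(img, patch_quantize(img, patch))
        print(f"patch={patch}: mse={err:.6f}, storage={storage_bits(8, 8, patch)} bits")
```

On this toy 8x8 image, 4x4 patches give a lower reconstruction error than one global range. Under the assumed 32-bit parameters, the storage estimate grows from 192 bits for one patch to 384 bits for 4x4 patches and 1152 bits for 2x2 patches. That growth is the parameter overhead the abstract says its clustering step is meant to reduce.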

Related papers