Fugu-MT paper translation (abstract): Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling

Paper abstract: Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling

arxiv url: http://arxiv.org/abs/2509.01624v1
Date: Mon, 01 Sep 2025 17:09:22 GMT
Status: translation complete
Last updated in system: 2025-09-04 15:17:03.793306
Title: Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling
Authors: Natalia Frumkin, Diana Marculescu
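Abstract summary: We introduce Q-Sched, a new paradigm for post-training quantization that modifies the diffusion model scheduler rather than the model weights. Q-Sched achieves full-precision accuracy with a 4x reduction in model size. A large-scale user study with more than 80,000 annotations confirms Q-Sched's effectiveness on both FLUX.1[schnell] and SDXL-Turbo.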
Reference score (independently computed attention metric): 17.912877295127355
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Text-to-image diffusion models are computationally intensive, often requiring dozens of forward passes through large transformer backbones. For instance, Stable Diffusion XL generates high-quality images with 50 evaluations of a 2.6B-parameter model, an expensive process even for a single batch. Few-step diffusion models reduce this cost to 2-8 denoising steps but still depend on large, uncompressed U-Net or diffusion transformer backbones, which are often too costly for full-precision inference without datacenter GPUs. These requirements also limit existing post-training quantization methods that rely on full-precision calibration. We introduce Q-Sched, a new paradigm for post-training quantization that modifies the diffusion model scheduler rather than model weights. By adjusting the few-step sampling trajectory, Q-Sched achieves full-precision accuracy with a 4x reduction in model size. To learn quantization-aware pre-conditioning coefficients, we propose the JAQ loss, which combines text-image compatibility with an image quality metric for fine-grained optimization. JAQ is reference-free and requires only a handful of calibration prompts, avoiding full-precision inference during calibration. Q-Sched delivers substantial gains: a 15.5% FID improvement over the FP16 4-step Latent Consistency Model and a 16.6% improvement over the FP16 8-step Phased Consistency Model, showing that quantization and few-step distillation are complementary for high-fidelity generation. A large-scale user study with more than 80,000 annotations further confirms Q-Sched's effectiveness on both FLUX.1[schnell] and SDXL-Turbo.
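To make the "4x reduction in model size" concrete, here is a minimal back-of-the-envelope sketch. It uses the 2.6B parameter count quoted in the abstract and assumes an FP16 baseline quantized to 4-bit weights; the 4-bit width is an assumption chosen to match the stated 4x ratio, not a figure given in the abstract. The helper `weight_memory_gb` is hypothetical and ignores activations, quantization scales, and other overhead.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring any overhead."""
    return num_params * bits_per_weight / 8 / 1e9


# Parameter count quoted in the abstract for Stable Diffusion XL.
params = 2.6e9

# FP16 baseline: 16 bits per weight.
fp16_gb = weight_memory_gb(params, 16)

# Assumed 4-bit weights (illustrative; the abstract only states a 4x reduction).
w4_gb = weight_memory_gb(params, 4)

reduction = fp16_gb / w4_gb

print(f"FP16 weights:  {fp16_gb:.1f} GB")
print(f"4-bit weights: {w4_gb:.1f} GB")
print(f"Reduction:     {reduction:.1f}x")
```

Under these assumptions the weights shrink from roughly 5.2 GB to roughly 1.3 GB, which is the kind of saving that makes inference plausible outside datacenter GPUs.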

Related papers