Fugu-MT Paper Translation (Abstract): WaDi: Weight Direction-aware Distillation for One-step Image Synthesis

Paper summary: WaDi: Weight Direction-aware Distillation for One-step Image Synthesis

arxiv url: http://arxiv.org/abs/2603.08258v1
Date: Mon, 09 Mar 2026 11:27:28 GMT
Status: Translation complete
System update date: 2026-03-10 15:13:15.838401
Title: WaDi: Weight Direction-aware Distillation for One-step Image Synthesis
Title (reference translation): WaDi: Weight Direction-aware Distillation for One-step Image Synthesis
Authors: Lei Wang, Yang Cheng, Senmao Li, Ge Wu, Yaxing Wang, Jian Yang
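Abstract summary: Recent works accelerate inference by distilling multi-step diffusion into one-step generators. We analyze the U-Net/DiT weight changes between one-step students and their multi-step teachers. We propose the Low-rank Rotation of weight Direction (LoRaD), a parameter-efficient adapter tailored to one-step diffusion distillation.
Reference score (independently computed attention score): 25.65170574291749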
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite the impressive performance of diffusion models such as Stable Diffusion (SD) in image generation, their slow inference limits practical deployment. Recent works accelerate inference by distilling multi-step diffusion into one-step generators. To better understand the distillation mechanism, we analyze U-Net/DiT weight changes between one-step students and their multi-step teacher counterparts. Our analysis reveals that changes in weight direction significantly exceed those in weight norm, highlighting weight direction as the key factor during distillation. Motivated by this insight, we propose the Low-rank Rotation of weight Direction (LoRaD), a parameter-efficient adapter tailored to one-step diffusion distillation. LoRaD is designed to model these structured directional changes using learnable low-rank rotation matrices. We further integrate LoRaD into Variational Score Distillation (VSD), resulting in Weight Direction-aware Distillation (WaDi), a novel one-step distillation framework. WaDi achieves state-of-the-art FID scores on COCO 2014 and COCO 2017 while using only approximately 10% of the trainable parameters of the U-Net/DiT. Furthermore, the distilled one-step model demonstrates strong versatility and scalability, generalizing well to various downstream tasks such as controllable generation, relation inversion, and high-resolution synthesis.
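Abstract (reference translation): Despite the remarkable performance of diffusion models such as Stable Diffusion (SD) in image generation, their slow inference limits practical deployment. Recent works accelerate inference by distilling multi-step diffusion into one-step generators. To better understand the distillation mechanism, we analyze the U-Net/DiT weight changes between one-step students and their multi-step teachers. The analysis shows that changes in weight direction are significantly larger than changes in weight norm, highlighting weight direction as the key factor in distillation. Based on this finding, we propose LoRaD (Low-rank Rotation of weight Direction), a parameter-efficient adapter suited to one-step diffusion distillation. LoRaD is designed to model these structured directional changes using learnable low-rank rotation matrices. We further integrate LoRaD into VSD, yielding Weight Direction-aware Distillation (WaDi), a new one-step distillation framework. WaDi achieves state-of-the-art FID scores on COCO 2014 and COCO 2017 while using only about 10% of the trainable parameters of the U-Net/DiT. In addition, the distilled one-step model shows strong versatility and scalability, generalizing to various downstream tasks such as controllable generation, relation inversion, and high-resolution synthesis.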
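
The abstract does not say how weight direction and weight norm are measured, nor how LoRaD parameterizes its rotations. As a rough illustration only, the sketch below assumes one simple reading: the norm change is the relative change in Euclidean length, and the direction change is the angle between the old and new weight vectors. It also shows one possible rotation that differs from the identity by a rank-2 term (a rotation inside a single plane). All function names and toy vectors are hypothetical and are not taken from the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def norm_and_direction_change(w_ref, w_new):
    """Return (relative norm change, angle in degrees) between two weight vectors."""
    n_ref, n_new = norm(w_ref), norm(w_new)
    rel_norm_change = abs(n_new - n_ref) / n_ref
    cos = dot(w_ref, w_new) / (n_ref * n_new)
    cos = max(-1.0, min(1.0, cos))  # guard against rounding just outside [-1, 1]
    return rel_norm_change, math.degrees(math.acos(cos))

def rotate_in_plane(w, u, v, theta):
    """Rotate w by theta radians in the plane spanned by orthonormal u and v.

    Equivalent to applying
        R = I + sin(theta) (v u^T - u v^T) + (cos(theta) - 1) (u u^T + v v^T),
    which differs from the identity by a rank-2 term.
    """
    a, b = dot(u, w), dot(v, w)  # coordinates of w inside the plane
    a_rot = a * math.cos(theta) - b * math.sin(theta)
    b_rot = a * math.sin(theta) + b * math.cos(theta)
    # Replace only the in-plane component; the rest of w is untouched.
    return [wi + (a_rot - a) * ui + (b_rot - b) * vi
            for wi, ui, vi in zip(w, u, v)]

# Hypothetical toy "teacher" weight vector and an orthonormal pair u, v.
w_teacher = [2.0, 0.0, 0.0, 0.0]
u = [1.0, 0.0, 0.0, 0.0]
v = [0.0, 1.0, 0.0, 0.0]

# Case 1: pure rotation changes the direction but leaves the norm alone.
w_rotated = rotate_in_plane(w_teacher, u, v, math.radians(30.0))
rel_rot, ang_rot = norm_and_direction_change(w_teacher, w_rotated)

# Case 2: pure rescaling changes the norm but leaves the direction alone.
w_scaled = [1.1 * x for x in w_teacher]
rel_scl, ang_scl = norm_and_direction_change(w_teacher, w_scaled)

print("rotation:", rel_rot, ang_rot)
print("scaling: ", rel_scl, ang_scl)
```

The two toy cases separate the two effects the abstract contrasts: a rotation moves the direction while the length stays fixed, and a rescaling does the opposite. An adapter built from rotations can therefore only express the first kind of change, which is the kind the abstract reports as dominant.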

Related papers