Fugu-MT Paper Translation (Abstract): Towards Accurate Binarization of Diffusion Model

Paper summary: Towards Accurate Binarization of Diffusion Model

arxiv url: http://arxiv.org/abs/2404.05662v2
Date: Sat, 25 May 2024 00:34:17 GMT
Status: Translation complete
Last updated in system: 2024-05-29 06:07:03.640415
Title: Towards Accurate Binarization of Diffusion Model
Title (reference translation): Towards Accurate Binarization of Diffusion Model
Authors: Xingyu Zheng, Haotong Qin, Xudong Ma, Mingyuan Zhang, Haojie Hao, Jiakai Wang, Zixiang Zhao, Jinyang Guo, Xianglong Liu
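Abstract summary: This paper proposes BinaryDM, a novel quantization-aware training approach for DMs. With 1.1-bit weights and 4-bit activations (W1.1A4), BinaryDM achieves an FID as low as 7.11 and saves the performance from collapse. As the first binarization method for diffusion models, W1.1A4 BinaryDM achieves savings of 9.3 times in OPs and 24.8 times in model size.
Reference score (independently computed attention score): 39.83092545597569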
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the advancement of diffusion models (DMs) and the substantially increased computational requirements, quantization emerges as a practical solution to obtain compact and efficient low-bit DMs. However, the highly discrete representation leads to severe accuracy degradation, hindering the quantization of diffusion models to ultra-low bit-widths. This paper proposes a novel quantization-aware training approach for DMs, namely BinaryDM. The proposed method pushes DMs' weights toward accurate and efficient binarization, considering the representation and computation properties. From the representation perspective, we present a Learnable Multi-basis Binarizer (LMB) to recover the representations generated by the binarized DM. The LMB enhances detailed information through the flexible combination of dual binary bases and is applied only to parameter-sparse locations of DM architectures, so the added burden stays minor. From the optimization perspective, a Low-rank Representation Mimicking (LRM) is applied to assist the optimization of binarized DMs. The LRM mimics the representations of full-precision DMs in low-rank space, alleviating the direction ambiguity of the optimization process caused by fine-grained alignment. Moreover, a quick progressive warm-up is applied to BinaryDM, avoiding convergence difficulties through layer-wise progressive quantization at the beginning of training. Comprehensive experiments demonstrate that BinaryDM achieves significant accuracy and efficiency gains compared to SOTA quantization methods of DMs under ultra-low bit-widths. With 1.1-bit weights and 4-bit activations (W1.1A4), BinaryDM achieves an FID as low as 7.11 and saves the performance from collapse (baseline FID 39.69). As the first binarization method for diffusion models, W1.1A4 BinaryDM achieves savings of 9.3 times in OPs and 24.8 times in model size, showcasing its substantial potential for edge deployment.
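Abstract (reference translation): With the progress of diffusion models (DMs) and the large increase in computational requirements, quantization emerges as a practical solution for obtaining compact and efficient low-bit DMs. However, the highly discrete representation causes accuracy degradation and hinders the quantization of diffusion models to ultra-low bit-widths. This paper proposes BinaryDM, a novel quantization-aware training approach for DMs. The proposed method binarizes the weights of DMs accurately and efficiently, taking the representation and computation properties into account. From the representation perspective, it proposes a Learnable Multi-basis Binarizer (LMB) that recovers the representations generated by the binarized DM. The LMB enhances detailed information by flexibly combining two binary bases while being applied to parameter-sparse locations of DM architectures. From the optimization perspective, Low-rank Representation Mimicking (LRM) is applied to assist the optimization of binarized DMs. The LRM mimics the representations of full-precision DMs in low-rank space, reducing the direction ambiguity of the optimization process caused by fine-grained alignment. In addition, a quick progressive warm-up is applied to BinaryDM, avoiding convergence difficulties through layer-wise progressive quantization at the beginning of training. Compared with SOTA quantization methods for DMs at ultra-low bit-widths, BinaryDM is shown to achieve significant accuracy and efficiency gains. With 1.1-bit weights and 4-bit activations (W1.1A4), BinaryDM reaches an FID as low as 7.11 and saves the performance from collapse (baseline FID 39.69). As the first binarization method for diffusion models, W1.1A4 BinaryDM achieves savings of 9.3 times in OPs and 24.8 times in model size, showing its potential for edge deployment.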
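
The abstract says the LMB combines dual binary bases but does not give the formula. The following is a minimal sketch of one common way to build a two-basis binarizer, assuming a residual construction: the second basis binarizes what the first one missed. All function names are hypothetical, and the scaling factors here are set in closed form to mean absolute values, whereas a learnable binarizer would train them.

```python
# Illustrative sketch only: a residual two-basis binarizer. The abstract does
# not give the LMB formula, so this construction and every name are assumptions.

def sign(x):
    """Binarize a scalar to {-1, +1}; zero maps to +1."""
    return 1.0 if x >= 0 else -1.0

def mean_abs(values):
    """Mean absolute value, used here as a fixed scaling factor."""
    return sum(abs(v) for v in values) / len(values)

def single_basis(weights):
    """One-basis binarization: w ~ a1 * sign(w)."""
    a1 = mean_abs(weights)
    return [a1 * sign(w) for w in weights]

def dual_basis(weights):
    """Two-basis binarization: w ~ a1 * sign(w) + a2 * sign(residual)."""
    first = single_basis(weights)
    residual = [w - f for w, f in zip(weights, first)]
    a2 = mean_abs(residual)
    return [f + a2 * sign(r) for f, r in zip(first, residual)]

def squared_error(weights, approx):
    """Sum of squared differences between weights and their approximation."""
    return sum((w - a) ** 2 for w, a in zip(weights, approx))

weights = [0.9, -0.1, 0.4, -0.7, 0.05, -0.3]
err_one = squared_error(weights, single_basis(weights))
err_two = squared_error(weights, dual_basis(weights))
print("one basis error:", err_one)
print("two bases error:", err_two)
```

With one basis each weight can take only two values; with two bases it can take up to four, which is why the reconstruction error drops. The extra cost is a second binary tensor and a second scalar per binarized layer.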
