Fugu-MT paper translation (abstract): Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation

Paper overview: Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation

arxiv url: http://arxiv.org/abs/2510.08994v1
Date: Fri, 10 Oct 2025 04:30:45 GMT
Status: Translation complete
Last updated in system: 2025-10-14 00:38:48.11116
Title: Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation
Authors: Yao Teng, Fuyun Wang, Xian Liu, Zhekai Chen, Han Shi, Yu Wang, Zhenguo Li, Weiyang Liu, Difan Zou, Xihui Liu
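Abstract summary: Speculative Jacobi-Denoising Decoding (SJD2) is a framework that incorporates the denoising process into Jacobi iterations to enable parallel token generation in autoregressive models. The method introduces a next-clean-token prediction paradigm that lets pre-trained autoregressive models accept noise-perturbed token embeddings.
Reference score (independently computed attention measure): 110.28291466364784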
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As a new paradigm of visual content generation, autoregressive text-to-image models suffer from slow inference due to their sequential token-by-token decoding process, often requiring thousands of model forward passes to generate a single image. To address this inefficiency, we propose Speculative Jacobi-Denoising Decoding (SJD2), a framework that incorporates the denoising process into Jacobi iterations to enable parallel token generation in autoregressive models. Our method introduces a next-clean-token prediction paradigm that enables the pre-trained autoregressive models to accept noise-perturbed token embeddings and predict the next clean tokens through low-cost fine-tuning. This denoising paradigm guides the model towards more stable Jacobi trajectories. During inference, our method initializes token sequences with Gaussian noise and performs iterative next-clean-token-prediction in the embedding space. We employ a probabilistic criterion to verify and accept multiple tokens in parallel, and refine the unaccepted tokens for the next iteration with the denoising trajectory. Experiments show that our method can accelerate generation by reducing model forward passes while maintaining the visual quality of generated images.
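The parallel verify-and-accept loop described in the abstract can be illustrated with a toy sketch. The code below is not the authors' implementation: `forward`, `jacobi_speculative_decode`, the vocabulary size, and the window size are all hypothetical, the "model" is a small random Markov softmax, and the denoising of Gaussian-initialized embeddings is omitted entirely (drafts start as uniformly random tokens instead). The accept/resample rule is the one used in standard speculative sampling, which may differ from the paper's exact criterion.

```python
import numpy as np

VOCAB = 8  # toy vocabulary size (assumption for illustration)
_W = np.random.default_rng(0).normal(size=(VOCAB, VOCAB))  # toy model weights


def forward(prev_tokens):
    """Toy causal model: row i is p(token_i | prev_tokens[i]) via a softmax."""
    logits = _W[np.asarray(prev_tokens)]
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def jacobi_speculative_decode(start_token, n_tokens, window=4, seed=0):
    """Decode n_tokens, verifying a window of draft tokens per forward pass."""
    rng = np.random.default_rng(seed)
    accepted = []
    draft = rng.integers(VOCAB, size=window)    # random initial guesses
    q = np.full((window, VOCAB), 1.0 / VOCAB)   # distribution each draft was drawn from
    passes = 0
    while len(accepted) < n_tokens:
        last = accepted[-1] if accepted else start_token
        # One parallel pass: every window position is conditioned on the
        # (possibly stale) draft token to its left.
        p = forward(np.concatenate(([last], draft[:-1])))
        passes += 1
        # Verify drafts left to right; stop at the first rejection.
        k = 0
        while k < window and rng.random() < min(1.0, p[k, draft[k]] / q[k, draft[k]]):
            k += 1
        accepted.extend(int(t) for t in draft[:k])
        new_draft, new_q = [], []
        if k < window:
            # Resample the rejected position from the residual distribution.
            resid = np.maximum(p[k] - q[k], 0.0)
            accepted.append(int(rng.choice(VOCAB, p=resid / resid.sum())))
            # Refine the remaining positions from this pass's predictions.
            for j in range(k + 1, window):
                new_draft.append(int(rng.choice(VOCAB, p=p[j])))
                new_q.append(p[j])
        # Pad the window with fresh random drafts.
        while len(new_draft) < window:
            new_draft.append(int(rng.integers(VOCAB)))
            new_q.append(np.full(VOCAB, 1.0 / VOCAB))
        draft, q = np.array(new_draft), np.array(new_q)
    return accepted[:n_tokens], passes


tokens, passes = jacobi_speculative_decode(start_token=0, n_tokens=32)
print(len(tokens), "tokens in", passes, "forward passes")
```

Each pass commits at least one token and at most `window` tokens, so the number of forward passes never exceeds the number of generated tokens and drops below it whenever several drafts are accepted together. That is the source of the speedup the abstract refers to.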
