Fugu-MT Paper Translation (Abstract): Leveraging Pretrained Language Models as Energy Functions for Glauber Dynamics Text Diffusion

Paper overview: Leveraging Pretrained Language Models as Energy Functions for Glauber Dynamics Text Diffusion

arxiv url: http://arxiv.org/abs/2605.04291v1
Date: Tue, 05 May 2026 20:51:03 GMT
Status: Translation complete
Last updated in system: 2026-05-07 18:41:07.542603
Title: Leveraging Pretrained Language Models as Energy Functions for Glauber Dynamics Text Diffusion
Authors: Tarun Kathuria, Sachin Kumar
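Abstract summary: We present a discrete diffusion-based language model using Glauber dynamics from statistical physics. Our main insight is that, instead of training a discrete state space diffusion model using Glauber dynamics with a uniform transition kernel as the forward process, one can set up an "energy function" based on pretrained causal/masked language models.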
Reference score (independently computed attention metric): 4.990506857330584
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present a discrete diffusion-based language model using Glauber dynamics from statistical physics. Our main insight is that instead of trying to train a discrete state space diffusion model using Glauber dynamics with a uniform transition kernel as the forward process, one can set up an "energy function" based on pretrained causal/masked language models. When viewed as the stationary distribution, this energy function allows us to significantly improve the quality of the generated text. Incorporating UL2 as the pretrained model into our diffusion pipeline, we outperform prior diffusion-based LMs and perform competitively with autoregressive models of comparable size. Furthermore, our models are competitive with or outperform prior diffusion models and GPT-2-style autoregressive models on zero-shot commonsense reasoning tasks as well as planning and search tasks like Sudoku and Zebra puzzles.
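For readers unfamiliar with Glauber dynamics, the following is a minimal sketch of the general idea only, not the authors' implementation. It repeatedly picks one position at random and resamples that token from its conditional distribution, which is proportional to exp(-E/T) under an energy function E. Everything named here (`toy_energy`, `VOCAB`, `conditional_probs`, `glauber_step`, `sample`, and the `temperature` parameter) is hypothetical and chosen for illustration; in particular, `toy_energy` is a hand-written stand-in that merely rewards repeated adjacent tokens, whereas the abstract describes deriving the energy from a pretrained causal/masked language model.

```python
import math
import random

# Hypothetical toy vocabulary; a real model would use a tokenizer vocabulary.
VOCAB = ["a", "b", "c"]


def toy_energy(seq):
    """Illustrative stand-in for an LM-derived energy (assumption, not the paper's).

    Each pair of equal adjacent tokens lowers the energy by 1.
    """
    return -sum(1.0 for left, right in zip(seq, seq[1:]) if left == right)


def conditional_probs(seq, pos, energy, vocab, temperature=1.0):
    """Return P(x_pos = v | all other tokens), proportional to exp(-E / T)."""
    energies = []
    for token in vocab:
        candidate = seq[:pos] + [token] + seq[pos + 1:]
        energies.append(energy(candidate))
    low = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - low) / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def glauber_step(seq, energy, vocab, rng, temperature=1.0):
    """One Glauber update: pick a uniformly random site and resample it."""
    pos = rng.randrange(len(seq))
    probs = conditional_probs(seq, pos, energy, vocab, temperature)
    new_token = rng.choices(vocab, weights=probs, k=1)[0]
    return seq[:pos] + [new_token] + seq[pos + 1:]


def sample(length, steps, energy=toy_energy, vocab=VOCAB, seed=0, temperature=1.0):
    """Start from uniformly random tokens and run `steps` Glauber updates."""
    rng = random.Random(seed)
    seq = [rng.choice(vocab) for _ in range(length)]
    for _ in range(steps):
        seq = glauber_step(seq, energy, vocab, rng, temperature)
    return seq


if __name__ == "__main__":
    print(sample(length=8, steps=200))
```

Because each update resamples one site from its exact conditional, the chain leaves the distribution proportional to exp(-E/T) invariant. That is the sense in which an energy function fixes the stationary distribution of the dynamics.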

Related papers