Fugu-MT Paper Translation (Abstract): Understanding Parallel Samplers in Masked Diffusion via Random Walks on Graphs

Paper abstract: Understanding Parallel Samplers in Masked Diffusion via Random Walks on Graphs

arxiv url: http://arxiv.org/abs/2606.22976v1
Date: Mon, 22 Jun 2026 07:56:50 GMT
Status: Translation complete
Last updated in system: 2026-06-25 03:07:55.583077
Title: Understanding Parallel Samplers in Masked Diffusion via Random Walks on Graphs
Title (reference translation): Understanding Parallel Sampling in Masked Diffusion via Random Walks on Graphs
Authors: Vansh Bansal, Cho Cholyeon, Syamantak Kumar, Sujay Sanghavi, Purnamrita Sarkar
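Abstract summary: We use random walks on graphs as a verifiable sandbox to study different parallel sampling strategies in masked diffusion models. We develop a new bisection sampler for random walks, which takes a number of steps logarithmic in the sequence length and is provably exact under perfect training. Initial experiments on a pretrained OpenWebText MDM show that bisection-style samplers improve speed-quality tradeoffs even for language generation.
Reference score (independently computed attention score): 21.909503091671297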
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we propose using random walks on graphs as a verifiable sandbox to study different parallel sampling strategies in masked diffusion models (MDMs). We train an MDM on random walk samples from a fixed graph. The graph or the transition kernel is never shown to the model explicitly and plays the role of latent structure in the sequences, albeit one that is controllable and can be used for quantitative evaluation. Thus, this framework enjoys a Sudoku-like validity check: verifying that an output is a valid walk and estimating the Markov kernel from the walks to measure distribution fidelity. Using simple graphs, we theoretically prove that parallel unmasking via widely used scores like lowest entropy is not uniformly better than a random parallel sampler; the performance critically depends on the structure of the underlying graph. We develop a new bisection sampler for random walks, which takes logarithmic steps in the sequence length and is provably exact under perfect training. Experiments on various graph walk tasks show that different parallel samplers are better for different graphs even in practice. Our initial experiments on a pretrained OpenWebText MDM show that the bisection-style samplers improve speed-quality tradeoffs even for language generation. Together, these results position graph random walks as a mechanistic benchmark for diagnosing and designing parallel samplers for masked diffusion models.
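Abstract (reference translation): In this paper, we use random walks on graphs as a verifiable sandbox to study different parallel sampling strategies in masked diffusion models (MDMs). We train an MDM on random walk samples from a fixed graph. The graph and the transition kernel are never shown to the model explicitly; they play the role of latent structure in the sequences, but one that is controllable and can be used for quantitative evaluation. The framework therefore allows us to verify that an output is a valid walk and to estimate the Markov kernel from the walks in order to measure distribution fidelity. Using simple graphs, we theoretically prove that parallel unmasking via widely used scores such as lowest entropy is not uniformly better than a random parallel sampler. We develop a new bisection sampler for random walks, which takes a number of steps logarithmic in the sequence length and is provably exact under perfect training. Experiments on various graph walk tasks show that different parallel samplers are better for different graphs in practice as well. Initial experiments on a pretrained OpenWebText MDM show that bisection-style samplers improve speed-quality tradeoffs even for language generation. Together, these results position graph random walks as a mechanistic benchmark for diagnosing and designing parallel samplers for masked diffusion models.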
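
The validity check described in the abstract (is an output a valid walk, and how close is the Markov kernel estimated from the outputs to the true one) can be illustrated with a small sketch. This is not the authors' code. The 4-cycle graph, the function names, and the use of mean row-wise total variation distance as the fidelity measure are all assumptions made for illustration, and walks drawn from the true random walk stand in for sequences that would in practice come from an MDM sampler.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy graph (a 4-cycle), stored as adjacency sets.
GRAPH = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}


def true_kernel(graph):
    """Simple random walk kernel: uniform over the neighbours of each node."""
    return {u: {v: 1.0 / len(nbrs) for v in nbrs} for u, nbrs in graph.items()}


def sample_walk(graph, length, rng):
    """Draw a walk from the true kernel (stand-in for MDM output)."""
    walk = [rng.choice(sorted(graph))]
    while len(walk) < length:
        walk.append(rng.choice(sorted(graph[walk[-1]])))
    return walk


def is_valid_walk(walk, graph):
    """True if every consecutive pair of nodes is an edge of the graph."""
    return all(v in graph.get(u, ()) for u, v in zip(walk, walk[1:]))


def estimate_kernel(walks):
    """Empirical transition probabilities from observed consecutive pairs."""
    counts = defaultdict(Counter)
    for walk in walks:
        for u, v in zip(walk, walk[1:]):
            counts[u][v] += 1
    return {
        u: {v: c / sum(row.values()) for v, c in row.items()}
        for u, row in counts.items()
    }


def mean_row_tv(p_hat, p_true):
    """Average total variation distance between matching kernel rows."""
    dists = []
    for u, true_row in p_true.items():
        est_row = p_hat.get(u, {})
        support = set(true_row) | set(est_row)
        dists.append(
            0.5 * sum(abs(est_row.get(v, 0.0) - true_row.get(v, 0.0)) for v in support)
        )
    return sum(dists) / len(dists)


rng = random.Random(0)
walks = [sample_walk(GRAPH, 16, rng) for _ in range(200)]
validity_rate = sum(is_valid_walk(w, GRAPH) for w in walks) / len(walks)
tv = mean_row_tv(estimate_kernel(walks), true_kernel(GRAPH))
```

Here `validity_rate` measures how often a generated sequence respects the graph's edges, and `tv` measures how far the empirical transition kernel is from the true one. Replacing `walks` with sampler outputs would give the two kinds of scores the abstract refers to, under the assumptions stated above.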

Related papers