Fugu-MT Paper Translation (Abstract): Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models

Paper summary: Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models

arxiv url: http://arxiv.org/abs/2604.00375v1
Date: Wed, 01 Apr 2026 02:01:30 GMT
Status: Translation complete
Last updated in system: 2026-04-02 16:44:31.787624
Title: Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models
Title (reference translation): Local Confidence, Global Stagnation: The Quality-Exploration Dilemma of Diffusion Language Models
Authors: Liancheng Fang, Aiwei Liu, Henry Peng Zou, Yankai Chen, Enze Ma, Leyi Pan, Chunyu Miao, Wei-Chieh Huang, Xue Liu, Philip S. Yu
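Abstract summary: We show that low-confidence remasking improves a proxy for quality while constraining the entropy of the induced sequence distribution. We develop a simple Independent Metropolis-Hastings sampler that, during decoding, approximately targets the optimal distribution balancing quality and exploration.
Reference score (independently computed attention measure): 52.61023005303122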
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion large language models (dLLMs) theoretically permit token decoding in arbitrary order, a flexibility that could enable richer exploration of reasoning paths than autoregressive (AR) LLMs. In practice, however, random-order decoding often hurts generation quality. To mitigate this, low-confidence remasking improves single-sample quality (e.g., Pass@$1$) by prioritizing confident tokens, but it also suppresses exploration and limits multi-sample gains (e.g., Pass@$k$), creating a fundamental quality-exploration dilemma. In this paper, we provide a unified explanation of this dilemma. We show that low-confidence remasking improves a myopic proxy for quality while provably constraining the entropy of the induced sequence distribution. To overcome this limitation, we characterize the optimal distribution that explicitly balances quality and exploration, and develop a simple Independent Metropolis-Hastings sampler that approximately targets this distribution during decoding. Experiments across a range of reasoning benchmarks including MATH500, AIME24/25, HumanEval, and MBPP show that our approach yields a better exploration-quality tradeoff than both random and low-confidence remasking.
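Abstract (reference translation): Diffusion large language models (dLLMs) theoretically permit decoding tokens in arbitrary order. In practice, however, random-order decoding often hurts generation quality. To mitigate this, low-confidence remasking prioritizes confident tokens, which improves single-sample quality (e.g., Pass@$1$) but suppresses exploration and limits multi-sample gains (e.g., Pass@$k$), creating a fundamental quality-exploration dilemma. This paper gives a unified explanation of the dilemma. It shows that low-confidence remasking improves a myopic proxy for quality while provably constraining the entropy of the induced sequence distribution. To overcome this limitation, the authors characterize the optimal distribution that explicitly balances quality and exploration, and develop a simple Independent Metropolis-Hastings sampler that approximately targets this distribution during decoding. Experiments on a range of reasoning benchmarks, including MATH500, AIME24/25, HumanEval, and MBPP, show that the approach yields a better exploration-quality tradeoff than both random and low-confidence remasking.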
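
The abstract names an Independent Metropolis-Hastings sampler but gives no details of it. As a rough illustration of the generic technique only, the following Python sketch runs independence Metropolis-Hastings on a three-state toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the states, the peaked `PROPOSAL` distribution (a stand-in for a confident decoder), and the flattened target `PROPOSAL[x] ** 0.5` (a stand-in for a distribution that trades some quality for more entropy).

```python
import random

# Hypothetical toy setup (not from the paper): three candidate outputs.
STATES = ["a", "b", "c"]
# Peaked proposal, standing in for a confident, low-entropy decoder.
PROPOSAL = {"a": 0.7, "b": 0.2, "c": 0.1}


def proposal_prob(x):
    """Probability of x under the proposal, which ignores the current state."""
    return PROPOSAL[x]


def proposal_sample(rng):
    """Draw one state from the proposal."""
    return rng.choices(STATES, weights=[PROPOSAL[s] for s in STATES])[0]


def target_weight(x):
    """Unnormalized toy target: a flattened (higher-entropy) proposal."""
    return PROPOSAL[x] ** 0.5


def independent_mh(target_weight, proposal_sample, proposal_prob, n_steps, seed=0):
    """Independence Metropolis-Hastings over a discrete state space.

    A candidate y is drawn without looking at the current state x and is
    accepted with probability min(1, w(y) / w(x)), where
    w(z) = target_weight(z) / proposal_prob(z) is the importance weight.
    """
    rng = random.Random(seed)
    x = proposal_sample(rng)
    samples = []
    for _ in range(n_steps):
        y = proposal_sample(rng)
        w_x = target_weight(x) / proposal_prob(x)
        w_y = target_weight(y) / proposal_prob(y)
        # Accept when u * w_x < w_y with u ~ Uniform[0, 1); this equals the
        # min(1, w_y / w_x) rule and avoids dividing by a zero weight.
        if rng.random() * w_x < w_y:
            x = y
        samples.append(x)
    return samples
```

Calling `independent_mh(target_weight, proposal_sample, proposal_prob, 20000)` returns a chain whose state frequencies should approach the normalized target rather than the proposal. Because the candidate never depends on the current state, the acceptance ratio reduces to a ratio of importance weights, which is what distinguishes this variant from random-walk Metropolis-Hastings.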


Related papers