Fugu-MT Paper Translation (Summary): Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion

Paper summary: Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion

arxiv url: http://arxiv.org/abs/2605.23346v1
Date: Fri, 22 May 2026 08:06:52 GMT
Status: Translation complete
System update date: 2026-05-25 17:29:20.25609
Title: Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion
Title (reference translation): Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion
Authors: Jaihoon Kim, Taehoon Yoon, Prin Phunyaphibarn, Seungjun Kim, Morteza Mardani, Minhyuk Sung
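Abstract summary: This paper introduces Contrastive Distribution Matching (CDM), a novel framework that amortizes the cost of SMC inference by learning a parameterized twist function from positive and negative samples. In practice, evaluating the learned twist function incurs less than 5% additional computational overhead compared to a single forward pass of the base model. We validate the effectiveness and versatility of the approach across a diverse range of applications, including toxic text generation, regulatory DNA sequence design, protein designability, and diffusion large language model alignment.
Reference score (independently computed attention score): 33.94204857658877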
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Discrete diffusion models have emerged as powerful frameworks for generating structured categorical data. However, efficiently sampling from reward-tilted distributions remains a fundamental challenge. While Twisted Sequential Monte Carlo (SMC) offers asymptotic exactness for this task, estimating the optimal twist function in discrete state spaces necessitates costly Monte Carlo approximations, resulting in a severe computational bottleneck at inference. To overcome this limitation, we introduce Contrastive Distribution Matching (CDM), a novel framework that amortizes the cost of SMC inference by learning a parameterized twist function via positive and negative samples. For efficient training, we reformulate the gradient estimator to leverage the closed-form forward kernels of discrete diffusion models. In practice, evaluating our learned twist function incurs less than 5% additional computational overhead compared to a single forward pass of the base model. Through extensive empirical evaluations, we demonstrate that CDM consistently outperforms existing baselines under matched wall-clock time. We validate the effectiveness and versatility of our approach across a diverse range of applications, including toxic text generation, regulatory DNA sequence design, protein designability, and diffusion large language model alignment.
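Abstract (reference translation): Discrete diffusion models have emerged as a powerful framework for generating structured categorical data. However, efficient sampling from reward-tilted distributions remains a fundamental challenge. Twisted Sequential Monte Carlo (SMC) offers asymptotic exactness for this task, but estimating the optimal twist function in discrete state spaces requires costly Monte Carlo approximations. To overcome this limitation, we introduce Contrastive Distribution Matching (CDM), a novel framework that amortizes the cost of SMC inference by learning a parameterized twist function from positive and negative samples. For efficient training, we reformulate the gradient estimator to exploit the closed-form forward kernels of discrete diffusion models. In practice, evaluating the learned twist function incurs less than 5% additional computational overhead compared to a single forward pass of the base model. Experiments show that CDM consistently outperforms existing baselines under matched wall-clock time. We validate the effectiveness and versatility of the approach across a diverse range of applications, including toxic text generation, regulatory DNA sequence design, protein designability, and diffusion large language model alignment.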
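
To make the idea of twisted SMC for a reward-tilted distribution concrete, here is a minimal toy sketch. It is not the paper's CDM method and not a discrete diffusion model: the base model is assumed to emit i.i.d. tokens, the twist is hand-written rather than learned, and the names `twisted_smc` and `log_twist_increment` are hypothetical. The target is proportional to p(x) * exp(r(x)), where r(x) counts how often token 0 appears.

```python
import numpy as np


def twisted_smc(base_probs, log_twist_increment, seq_len, num_particles, rng):
    """Toy twisted SMC: propose from the base model, reweight by the twist, resample."""
    vocab = len(base_probs)
    particles = np.zeros((num_particles, 0), dtype=np.int64)
    for _ in range(seq_len):
        # Propose the next token from the base model (i.i.d. tokens: an assumption).
        new_tokens = rng.choice(vocab, size=num_particles, p=base_probs)
        particles = np.concatenate([particles, new_tokens[:, None]], axis=1)
        # Incremental log-weight: log twist(x_{1:t}) - log twist(x_{1:t-1}).
        log_w = log_twist_increment(particles)
        w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
        w /= w.sum()
        # Multinomial resampling in proportion to the normalized weights.
        idx = rng.choice(num_particles, size=num_particles, p=w)
        particles = particles[idx]
    return particles


def count_zero_increment(particles):
    # Hand-written twist increment for the reward r(x) = number of tokens equal to 0.
    return (particles[:, -1] == 0).astype(float)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = np.array([0.2, 0.3, 0.5])
    samples = twisted_smc(base, count_zero_increment, seq_len=4,
                          num_particles=20000, rng=rng)
    # For this toy target the tilted distribution factorizes per token,
    # so the exact expected reward is available in closed form.
    q0 = base[0] * np.e / (base[0] * np.e + 1.0 - base[0])
    print("SMC estimate of mean reward:", (samples == 0).sum(axis=1).mean())
    print("Exact mean reward:", 4 * q0)
```

In this toy case the twist is exact, so the incremental weights are cheap to evaluate. In realistic settings the optimal twist is unknown and, as the abstract notes, estimating it by Monte Carlo at inference time is the expensive part that a learned twist function aims to amortize.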

Related papers