Fugu-MT Paper Translation (Summary): CANDI: Hybrid Discrete-Continuous Diffusion Models

Paper summary: CANDI: Hybrid Discrete-Continuous Diffusion Models

arxiv url: http://arxiv.org/abs/2510.22510v2
Date: Tue, 28 Oct 2025 19:55:41 GMT
Status: Translation complete
Last updated in system: 2025-10-30 13:34:45.437042
Title: CANDI: Hybrid Discrete-Continuous Diffusion Models
Title (reference translation): CANDI: Discrete-Continuous Diffusion Models
Authors: Patrick Pynadath, Jiaxin Shi, Ruqi Zhang
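Abstract summary: The paper shows that noise corrupts discrete data through two mechanisms: discrete identity corruption and continuous rank degradation. It proposes CANDI, a hybrid framework that decouples discrete and continuous corruption. This unlocks the benefits of continuous diffusion for discrete spaces.
Reference score (independently computed attention score): 36.61898210733147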
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While continuous diffusion has shown remarkable success in continuous domains such as image generation, its direct application to discrete data has underperformed compared to purely discrete formulations. This gap is counterintuitive, given that continuous diffusion learns score functions that enable joint evolution across multiple positions. To understand this gap, we introduce token identifiability as an analytical framework for understanding how Gaussian noise corrupts discrete data through two mechanisms: discrete identity corruption and continuous rank degradation. We reveal that these mechanisms scale differently with vocabulary size, creating a temporal dissonance: at noise levels where discrete corruption preserves enough structure for conditional learning, continuous denoising is trivial; at noise levels where continuous denoising is meaningful, discrete corruption destroys nearly all conditional structure. To solve this, we propose CANDI (Continuous ANd DIscrete diffusion), a hybrid framework that decouples discrete and continuous corruption, enabling simultaneous learning of both conditional structure and continuous geometry. We empirically validate the temporal dissonance phenomenon and demonstrate that CANDI successfully avoids it. This unlocks the benefits of continuous diffusion for discrete spaces: on controlled generation, CANDI enables classifier-based guidance with off-the-shelf classifiers through simple gradient addition; on text generation, CANDI outperforms masked diffusion at low NFE, demonstrating the value of learning continuous gradients for discrete spaces. We include the code on the project page available here: https://patrickpynadath1.github.io/candi-lander
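Abstract (reference translation): Continuous diffusion has been remarkably successful in continuous domains such as image generation, but its direct application to discrete data has underperformed purely discrete formulations. This gap is counterintuitive, given that continuous diffusion learns score functions that enable joint evolution across multiple positions. To understand this gap, we introduce token identifiability as an analytical framework for understanding how Gaussian noise corrupts discrete data through two mechanisms: discrete identity corruption and continuous rank degradation. These mechanisms scale differently with vocabulary size, creating a temporal dissonance: at noise levels where discrete corruption preserves enough structure for conditional learning, continuous denoising is trivial; at noise levels where continuous denoising is meaningful, discrete corruption destroys nearly all conditional structure. To solve this, we propose CANDI (Continuous ANd DIscrete diffusion), a hybrid framework that decouples discrete and continuous corruption and enables simultaneous learning of both conditional structure and continuous geometry. We empirically validate the temporal dissonance phenomenon and demonstrate that CANDI avoids it. This unlocks the benefits of continuous diffusion for discrete spaces: on controlled generation, CANDI enables classifier-based guidance with off-the-shelf classifiers through simple gradient addition; on text generation, CANDI outperforms masked diffusion at low NFE, demonstrating the value of learning continuous gradients for discrete spaces. The code is linked from the project page: https://patrickpynadath1.github.io/candi-lander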
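The two corruption mechanisms named in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' code. It assumes tokens are one-hot vectors with independent Gaussian noise of standard deviation `sigma` added to every coordinate. It also assumes two working definitions chosen for this illustration: "identity corruption" is the probability that the argmax of the noisy vector is no longer the true token, and "rank degradation" is the mean fraction of incorrect tokens that score above the true token. The function name `identifiability_stats` and all parameters are hypothetical.

```python
import numpy as np


def identifiability_stats(vocab_size, sigma, n_samples=1000, seed=0):
    """Estimate two corruption measures for a noisy one-hot token.

    Assumed definitions (for illustration only):
      identity corruption = P(argmax of the noisy vector != true token)
      rank degradation    = mean fraction of wrong tokens scoring above the true one
    """
    rng = np.random.default_rng(seed)
    # By symmetry the true token can be placed at index 0.
    noisy = rng.normal(0.0, sigma, size=(n_samples, vocab_size))
    noisy[:, 0] += 1.0  # add the one-hot signal to the true coordinate

    # Discrete view: is the true token still the most likely one?
    identity_corruption = float(np.mean(np.argmax(noisy, axis=1) != 0))

    # Continuous view: how far down the ranking has the true token fallen?
    above_true = noisy[:, 1:] > noisy[:, [0]]
    rank_degradation = float(np.mean(above_true))
    return identity_corruption, rank_degradation


if __name__ == "__main__":
    for vocab_size in (10, 100, 2000):
        rho, rank = identifiability_stats(vocab_size, sigma=0.3)
        print(f"V={vocab_size:5d}  identity corruption={rho:.3f}  "
              f"rank degradation={rank:.4f}")
```

Under these assumptions, at a fixed noise level the identity-corruption estimate grows with vocabulary size while the rank-degradation estimate stays roughly constant, which is one way to picture the mismatch in scaling that the abstract calls temporal dissonance.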


Related papers