Fugu-MT Paper Translation (Abstract): $Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models

Paper abstract: $Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models

arxiv url: http://arxiv.org/abs/2604.23536v1
Date: Sun, 26 Apr 2026 05:16:54 GMT
Status: Translation complete
Last updated in system: 2026-04-28 17:12:07.41648
Title: $Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models
Title (reference translation): $Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models
Authors: Haosen Li, Wenshuo Chen, Shaofeng Liang, Lei Wang, Kaishen Yuan, Yutao Yue
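Abstract summary: Implicit Z-Sampling proves that intermediate states can be algebraically annihilated via operator dualities. $Z^2$-Sampling couples implicit algebraic collapse with a dynamically cached Temporal Semantic Surrogate.
Reference score (independently computed attention score): 6.21141073537668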
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Diffusion models have achieved unprecedented success in text-aligned generation, largely driven by Classifier-Free Guidance (CFG). However, standard CFG operates strictly on instantaneous gradients, omitting the intrinsic curvature of the data manifold. Recent methods like Zigzag-sampling (Z-Sampling) explicitly traverse multi-step forward-backward trajectories to probe this curvature, significantly improving semantic alignment. Yet, these explicit traversals triple the Neural Function Evaluation (NFE) cost and introduce unconstrained truncation errors from off-manifold evaluations, causing cumulative drift from the true marginal distribution. In this paper, we theoretically demonstrate that the explicit zigzag sequence is topologically reducible. We propose Implicit Z-Sampling, rigorously proving that intermediate states can be algebraically annihilated via operator dualities, physically eliminating off-manifold approximation errors. To push sampling efficiency to its theoretical lower bound, we introduce $Z^2$-Sampling (Zero-cost Zigzag Sampling). Exploiting the Probability Flow ODE's temporal coherence, $Z^2$-Sampling couples implicit algebraic collapse with a dynamically cached Temporal Semantic Surrogate. This restores the standard 2-NFE baseline without sacrificing semantic exploration. We formally prove via Backward Error Analysis that this discrete collapse inherently synthesizes a directional derivative curvature penalty. Finally, extensive evaluations demonstrate that $Z^2$-Sampling structurally shatters the performance-efficiency Pareto frontier. We validate its universal applicability across diverse architectures (U-Nets, DiTs) and modalities (image/video), establishing seamless orthogonality with advanced alignment frameworks (AYS, Diffusion-DPO).
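Abstract (reference translation): Diffusion models have achieved unprecedented success in text-aligned generation, driven mainly by Classifier-Free Guidance (CFG). However, standard CFG operates strictly on instantaneous gradients and omits the intrinsic curvature of the data manifold. Recent methods such as Zigzag-sampling (Z-Sampling) explicitly traverse multi-step forward-backward trajectories to probe this curvature, which significantly improves semantic alignment. These explicit traversals, however, triple the Neural Function Evaluation (NFE) cost and introduce unconstrained truncation errors from off-manifold evaluations, causing cumulative drift from the true marginal distribution. In this paper, we show theoretically that the explicit zigzag sequence is topologically reducible. We propose Implicit Z-Sampling, rigorously proving that intermediate states can be algebraically annihilated via operator dualities, which physically eliminates off-manifold approximation errors. To push sampling efficiency to its theoretical lower bound, we introduce $Z^2$-Sampling (Zero-cost Zigzag Sampling). Exploiting the temporal coherence of the Probability Flow ODE, $Z^2$-Sampling couples implicit algebraic collapse with a dynamically cached Temporal Semantic Surrogate. This restores the standard 2-NFE baseline without sacrificing semantic exploration. We formally prove via Backward Error Analysis that this discrete collapse inherently synthesizes a directional derivative curvature penalty. Finally, we show that $Z^2$-Sampling structurally shatters the performance-efficiency Pareto frontier. We validate its universal applicability across diverse architectures (U-Nets, DiTs) and modalities (image/video), and establish seamless orthogonality with advanced alignment frameworks (AYS, Diffusion-DPO).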
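The cost figures in the abstract can be made concrete with a small sketch. It shows the standard CFG combination of an unconditional and a conditional noise prediction, and the NFE arithmetic implied by "triple the NFE cost" versus "the standard 2-NFE baseline". It does not implement Z-Sampling, Implicit Z-Sampling, or $Z^2$-Sampling, whose update rules the abstract does not give. The function names, the toy vectors, and the 50-step schedule are assumptions chosen for illustration only.

```python
# Illustrative sketch only: names and numbers below are assumptions,
# not taken from the paper.

def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: start from the unconditional prediction
    and extrapolate toward the conditional one by guidance_scale."""
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def total_nfe(num_steps, nfe_per_step):
    """Total network evaluations for a sampling run."""
    return num_steps * nfe_per_step


# Standard CFG needs one conditional and one unconditional evaluation per step.
CFG_NFE_PER_STEP = 2
# The abstract states that explicit zigzag traversal triples this cost.
EXPLICIT_ZIGZAG_NFE_PER_STEP = 3 * CFG_NFE_PER_STEP

NUM_STEPS = 50  # assumed schedule length, for illustration

if __name__ == "__main__":
    guided = cfg_combine([0.0, 1.0], [1.0, 1.0], guidance_scale=2.0)
    print("guided prediction:", guided)
    print("CFG baseline NFE:", total_nfe(NUM_STEPS, CFG_NFE_PER_STEP))
    print("explicit zigzag NFE:", total_nfe(NUM_STEPS, EXPLICIT_ZIGZAG_NFE_PER_STEP))
```

Under these assumptions, a method that returns to 2 NFE per step would match the CFG baseline count for the same schedule, which is the efficiency claim the abstract makes for $Z^2$-Sampling.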

Related papers