Fugu-MT Paper Translation (Abstract): CausNVS: Autoregressive Multi-view Diffusion for Flexible 3D Novel View Synthesis

Paper overview: CausNVS: Autoregressive Multi-view Diffusion for Flexible 3D Novel View Synthesis

arxiv url: http://arxiv.org/abs/2509.06579v1
Date: Mon, 08 Sep 2025 11:49:51 GMT
Status: Translation complete
Last updated in system: 2025-09-09 14:07:04.100426
Title: CausNVS: Autoregressive Multi-view Diffusion for Flexible 3D Novel View Synthesis
Authors: Xin Kong, Daniel Watson, Yannick Strümpler, Michael Niemeyer, Federico Tombari
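Abstract summary: CausNVS is a multi-view diffusion model in an autoregressive setting. It supports arbitrary input-output view configurations and generates views sequentially. It achieves consistently strong visual quality across diverse settings.
Reference score (independently computed attention level): 48.43677384182078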
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multi-view diffusion models have shown promise in 3D novel view synthesis, but most existing methods adopt a non-autoregressive formulation. This limits their applicability in world modeling, as they only support a fixed number of views and suffer from slow inference due to denoising all frames simultaneously. To address these limitations, we propose CausNVS, a multi-view diffusion model in an autoregressive setting, which supports arbitrary input-output view configurations and generates views sequentially. We train CausNVS with causal masking and per-frame noise, using pairwise-relative camera pose encodings (CaPE) for precise camera control. At inference time, we combine a spatially-aware sliding-window with key-value caching and noise conditioning augmentation to mitigate drift. Our experiments demonstrate that CausNVS supports a broad range of camera trajectories, enables flexible autoregressive novel view synthesis, and achieves consistently strong visual quality across diverse settings. Project page: https://kxhit.github.io/CausNVS.html.
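
The abstract mentions causal masking and a sliding window at inference time. As a rough illustration only, the following sketch builds a frame-level causal attention mask in which tokens of one view may attend to tokens of the same or earlier views, optionally limited to the most recent views. The function name `frame_causal_mask`, its parameters, and the recency-based window are assumptions made for this example; the paper describes a spatially-aware window, which this sketch does not implement.

```python
from typing import List, Optional


def frame_causal_mask(num_frames: int, tokens_per_frame: int,
                      window: Optional[int] = None) -> List[List[bool]]:
    """Return mask[q][k] == True when query token q may attend to key token k.

    Hypothetical helper: tokens are laid out frame by frame, and a token may
    attend to every token of its own frame and of earlier frames. If `window`
    is given, only the `window` most recent frames (including the current one)
    are visible. This recency rule is a stand-in, not the paper's
    spatially-aware selection.
    """
    total = num_frames * tokens_per_frame
    mask: List[List[bool]] = []
    for q in range(total):
        q_frame = q // tokens_per_frame
        row: List[bool] = []
        for k in range(total):
            k_frame = k // tokens_per_frame
            allowed = k_frame <= q_frame  # causal: no attention to future frames
            if window is not None:
                allowed = allowed and (q_frame - k_frame) < window
            row.append(allowed)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Three views with two tokens each, keeping only the two most recent views.
    for row in frame_causal_mask(3, 2, window=2):
        print("".join("x" if allowed else "." for allowed in row))
```

In a setup like this, the keys and values of already generated views would not change when a new view is appended, which is the property that makes key-value caching possible in autoregressive generation.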


Related papers