Fugu-MT paper translation (abstract): VDNeRF: Vision-only Dynamic Neural Radiance Field for Urban Scenes

Paper overview: VDNeRF: Vision-only Dynamic Neural Radiance Field for Urban Scenes

arxiv url: http://arxiv.org/abs/2511.06408v1
Date: Sun, 09 Nov 2025 14:45:08 GMT
Status: Translation complete
Last updated in system: 2025-11-11 21:18:44.929401
Title: VDNeRF: Vision-only Dynamic Neural Radiance Field for Urban Scenes
Title (reference translation): VDNeRF: A Vision-only Dynamic Neural Radiance Field for Urban Landscapes
Authors: Zhengyu Zou, Jingfeng Li, Hao Li, Xiaolei Hou, Jinwen Hu, Jingkun Chen, Lechao Cheng, Dingwen Zhang
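Abstract summary: Vision-only Dynamic NeRF (VDNeRF) is a method that recovers camera trajectories and learns spatiotemporal representations of dynamic urban scenes. VDNeRF surpasses state-of-the-art NeRF-based pose-free methods in both camera pose estimation and dynamic novel view synthesis.
Reference score (independently computed attention level): 41.59812880106718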
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Neural Radiance Fields (NeRFs) implicitly model continuous three-dimensional scenes using a set of images with known camera poses, enabling the rendering of photorealistic novel views. However, existing NeRF-based methods encounter challenges in applications such as autonomous driving and robotic perception, primarily due to the difficulty of capturing accurate camera poses and limitations in handling large-scale dynamic environments. To address these issues, we propose Vision-only Dynamic NeRF (VDNeRF), a method that accurately recovers camera trajectories and learns spatiotemporal representations for dynamic urban scenes without requiring additional camera pose information or expensive sensor data. VDNeRF employs two separate NeRF models to jointly reconstruct the scene. The static NeRF model optimizes camera poses and static background, while the dynamic NeRF model incorporates the 3D scene flow to ensure accurate and consistent reconstruction of dynamic objects. To address the ambiguity between camera motion and independent object motion, we design an effective and powerful training framework to achieve robust camera pose estimation and self-supervised decomposition of static and dynamic elements in a scene. Extensive evaluations on mainstream urban driving datasets demonstrate that VDNeRF surpasses state-of-the-art NeRF-based pose-free methods in both camera pose estimation and dynamic novel view synthesis.
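Abstract (reference translation): Neural Radiance Fields (NeRFs) implicitly model continuous three-dimensional scenes from a set of images with known camera poses, enabling photorealistic novel-view rendering. However, existing NeRF-based methods face challenges in applications such as autonomous driving and robotic perception, because accurate camera poses are difficult to capture and large-scale dynamic environments are hard to handle. To address these problems, we propose Vision-only Dynamic NeRF (VDNeRF), a method that accurately recovers camera trajectories and learns spatiotemporal representations of dynamic urban scenes without requiring camera pose information or expensive sensor data. VDNeRF uses two separate NeRF models to jointly reconstruct the scene. The static NeRF model optimizes the camera poses and the static background, while the dynamic NeRF model incorporates 3D scene flow to ensure accurate and consistent reconstruction of dynamic objects. To address the ambiguity between camera motion and independent object motion, we design an effective and powerful training framework that achieves robust camera pose estimation and self-supervised decomposition of static and dynamic elements. We show that VDNeRF surpasses state-of-the-art NeRF-based pose-free methods in both camera pose estimation and dynamic novel view synthesis.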
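
Illustrative note: the abstract says that a static and a dynamic NeRF jointly reconstruct the scene, but it does not say how their outputs are combined. The sketch below assumes one composition commonly used in dynamic NeRF work: densities are summed per sample, colors are blended in proportion to density, and the result is integrated with standard volume rendering. The function name `composite_ray` and its input layout are hypothetical, and this may differ from what VDNeRF actually does.

```python
import math


def composite_ray(static_samples, dynamic_samples, deltas):
    """Composite one ray through a static and a dynamic radiance field.

    ASSUMPTION: densities add and colors blend by density share. The
    abstract does not confirm that VDNeRF uses this rule.

    static_samples, dynamic_samples: lists of (sigma, (r, g, b)) tuples,
        one per sample along the ray, ordered near to far.
    deltas: distance covered by each sample.
    Returns (rgb, dynamic_weight), where dynamic_weight is the part of the
    accumulated opacity that comes from the dynamic field.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    rgb = [0.0, 0.0, 0.0]
    dynamic_weight = 0.0
    for (sig_s, col_s), (sig_d, col_d), delta in zip(
        static_samples, dynamic_samples, deltas
    ):
        sigma = sig_s + sig_d
        if sigma <= 0.0:
            continue  # empty space contributes nothing
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        weight = transmittance * alpha
        frac_d = sig_d / sigma  # density share of the dynamic field
        for k in range(3):
            blended = (1.0 - frac_d) * col_s[k] + frac_d * col_d[k]
            rgb[k] += weight * blended
        dynamic_weight += weight * frac_d
        transmittance *= 1.0 - alpha
    return rgb, dynamic_weight
```

Under this assumption, the per-ray `dynamic_weight` gives a soft static/dynamic mask without mask labels, which is one way to read the self-supervised decomposition mentioned in the abstract.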

Related papers