Fugu-MT Paper Translation (Abstract): DuoMo: Dual Motion Diffusion for World-Space Human Reconstruction

Paper summary: DuoMo: Dual Motion Diffusion for World-Space Human Reconstruction

arxiv url: http://arxiv.org/abs/2603.03265v1
Date: Tue, 03 Mar 2026 18:54:17 GMT
Status: Translation complete
Last updated in system: 2026-03-04 21:38:10.938377
Title: DuoMo: Dual Motion Diffusion for World-Space Human Reconstruction
Authors: Yufu Wang, Evonne Ng, Soyong Shin, Rawal Khirodkar, Yuan Dong, Zhaoen Su, Jinhyung Park, Kris Kitani, Alexander Richard, Fabian Prada, Michael Zollhofer
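Abstract summary: DuoMo is a generative method that recovers human motion in world-space coordinates from unconstrained videos with noisy or incomplete observations. It addresses the problem by factorizing motion learning into two diffusion models. Together, the two models can reconstruct motion across diverse scenes and trajectories, even from noisy or incomplete observations.
Reference score (independently computed attention score): 73.7305982336243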
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present DuoMo, a generative method that recovers human motion in world-space coordinates from unconstrained videos with noisy or incomplete observations. Reconstructing such motion requires solving a fundamental trade-off: generalizing from diverse and noisy video inputs while maintaining global motion consistency. Our approach addresses this problem by factorizing motion learning into two diffusion models. The camera-space model first estimates motion from videos in camera coordinates. The world-space model then lifts this initial estimate into world coordinates and refines it to be globally consistent. Together, the two models can reconstruct motion across diverse scenes and trajectories, even from highly noisy or incomplete observations. Moreover, our formulation is general, generating the motion of mesh vertices directly and bypassing parametric models. DuoMo achieves state-of-the-art performance. On EMDB, our method obtains a 16% reduction in world-space reconstruction error while maintaining low foot skating. On RICH, it obtains a 30% reduction in world-space error. Project page: https://yufu-wang.github.io/duomo/
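
Illustration (not from the paper): the abstract says the world-space model lifts a camera-space estimate into world coordinates. The purely geometric part of such a change of coordinates can be sketched as a rigid transform, X_world = R * X_cam + t, applied to each mesh vertex. This is only a minimal sketch under stated assumptions: the function name camera_to_world and the inputs R and t (a camera-to-world rotation and translation) are hypothetical, and the abstract does not say how DuoMo obtains or uses camera poses. DuoMo's world-space model is a learned diffusion model that also refines the motion, which this snippet does not attempt.

```python
# Hypothetical sketch: rigidly map camera-space vertices into world space.
# R (3x3 rotation) and t (3-vector) are ASSUMED camera-to-world pose inputs;
# neither the names nor the availability of such a pose come from the abstract.

def camera_to_world(vertices_cam, R, t):
    """Apply X_world = R @ X_cam + t to each 3D vertex (plain Python lists)."""
    vertices_world = []
    for x, y, z in vertices_cam:
        vertices_world.append([
            R[0][0] * x + R[0][1] * y + R[0][2] * z + t[0],
            R[1][0] * x + R[1][1] * y + R[1][2] * z + t[1],
            R[2][0] * x + R[2][1] * y + R[2][2] * z + t[2],
        ])
    return vertices_world


# Example pose: camera frame rotated 90 degrees about the world z-axis,
# with the camera origin placed at (1, 0, 0) in world coordinates.
R = [[0.0, -1.0, 0.0],
     [1.0,  0.0, 0.0],
     [0.0,  0.0, 1.0]]
t = [1.0, 0.0, 0.0]

print(camera_to_world([[1.0, 0.0, 2.0]], R, t))
```

For a video, a transform like this would be applied per frame with that frame's pose; errors in the per-frame poses would carry over into the world-space trajectory, which is one reason a refinement step toward global consistency is useful.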
