Fugu-MT Paper Translation (Abstract): Hand-4DGS: Feed-Forward 3D Gaussian Splatting for 4D Hand Reconstruction from Egocentric Videos

Paper overview: Hand-4DGS: Feed-Forward 3D Gaussian Splatting for 4D Hand Reconstruction from Egocentric Videos

arxiv url: http://arxiv.org/abs/2606.19156v1
Date: Wed, 17 Jun 2026 14:58:37 GMT
Status: Translation complete
Last updated in system: 2026-06-18 17:16:51.224899
Title: Hand-4DGS: Feed-Forward 3D Gaussian Splatting for 4D Hand Reconstruction from Egocentric Videos
Authors: Jeongmin Bae, Seoha Kim, Marc Pollefeys, Mahdi Rad, Youngjung Uh, Taein Kwon
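Abstract summary: We introduce Hand-4DGS, the first feed-forward framework for reconstructing dynamic 4D hands directly from egocentric videos. Our approach incorporates a mesh-guided representation for structural priors and temporal convolutions to model dynamic motion.
Reference score (independently computed attention score): 54.612320509347825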
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Dynamic 3D hand reconstruction from egocentric videos is essential for next-generation computing platforms such as AR/VR and AI glasses. Despite its importance, most prior works focus either on multi-view 3D hand reconstruction or on 4D human body reconstruction. Egocentric 4D hand reconstruction remains challenging due to fast head motion, rapid hand dynamics, severe occlusions, and inherent ambiguity from single-view observations. To address these challenges, we introduce Hand-4DGS, the first feed-forward framework for reconstructing dynamic 4D hands directly from egocentric videos, enabling both fast (~60 FPS) inference and strong generalization. Our approach incorporates a mesh-guided representation for structural priors and temporal convolutions to model dynamic motion. We evaluate our framework on two challenging egocentric datasets, H2O and ARCTIC, and demonstrate significant improvements over baselines. Our method benefits from the generalization capability of feed-forward networks and effective 2D image supervision through Gaussian splatting, without requiring expensive 3D hand pose ground-truth annotations.
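The abstract names temporal convolutions as the component that models dynamic motion but gives no architectural detail. As a generic illustration only, and not the authors' network, the sketch below shows what a 1D convolution along the time axis of per-frame features does. The function name `temporal_conv1d`, the single kernel shared across channels, and the edge-repeat padding are all assumptions made for this example.

```python
def temporal_conv1d(frames, kernel):
    """Filter per-frame feature vectors along the time axis.

    Hypothetical helper for illustration; not taken from the paper.
    frames: list of T frames, each a list of C floats.
    kernel: list of K weights (K odd), shared across all channels.
    Returns a list of T frames, each a list of C floats.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    num_frames = len(frames)
    out = []
    for t in range(num_frames):
        acc = [0.0] * len(frames[t])
        for k, weight in enumerate(kernel):
            # Clamp the source index so the first and last frames are
            # repeated at the sequence borders (assumed padding choice).
            src = min(max(t + k - half, 0), num_frames - 1)
            for c, value in enumerate(frames[src]):
                acc[c] += weight * value
        out.append(acc)
    return out


# Usage: a 3-frame averaging kernel applied to a one-channel sequence.
sequence = [[0.0], [0.0], [3.0], [0.0], [0.0]]
smoothed = temporal_conv1d(sequence, [1 / 3, 1 / 3, 1 / 3])
```

Each output frame mixes information from its temporal neighbours, which is the basic mechanism by which a temporal convolution can capture motion across frames. In a learned model the kernel weights would be trained rather than fixed, and as in most deep learning libraries the kernel here is applied without flipping.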

