MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer
- URL: http://arxiv.org/abs/2603.05078v1
- Date: Thu, 05 Mar 2026 11:51:07 GMT
- Title: MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer
- Authors: Juntong Fang, Zequn Chen, Weiqi Zhang, Donglin Di, Xuancheng Zhang, Chengmin Yang, Yu-Shen Liu,
- Abstract summary: MoRe is a feedforward 4D reconstruction network that efficiently recovers dynamic 3D scenes from monocular videos. Built upon a strong static reconstruction backbone, MoRe employs an attention-forcing strategy to disentangle dynamic motion from static structure. Experiments on multiple benchmarks demonstrate that MoRe achieves high-quality dynamic reconstructions with exceptional efficiency.
- Score: 45.19539316971492
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reconstructing dynamic 4D scenes remains challenging due to the presence of moving objects that corrupt camera pose estimation. Existing optimization methods alleviate this issue with additional supervision, but they are mostly computationally expensive and impractical in real-time applications. To address these limitations, we propose MoRe, a feedforward 4D reconstruction network that efficiently recovers dynamic 3D scenes from monocular videos. Built upon a strong static reconstruction backbone, MoRe employs an attention-forcing strategy to disentangle dynamic motion from static structure. To further enhance robustness, we fine-tune the model on large-scale, diverse datasets encompassing both dynamic and static scenes. Moreover, our grouped causal attention captures temporal dependencies and adapts to varying token lengths across frames, ensuring temporally coherent geometry reconstruction. Extensive experiments on multiple benchmarks demonstrate that MoRe achieves high-quality dynamic reconstructions with exceptional efficiency.
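The abstract does not spell out the grouped causal attention, but a natural reading is a block-causal mask over per-frame token groups: each token attends within its own frame and to all earlier frames, with group sizes free to differ per frame. A minimal PyTorch sketch of that reading (all names here are illustrative, not from the paper):

```python
import torch
import torch.nn.functional as F

def grouped_causal_mask(frame_ids: torch.Tensor) -> torch.Tensor:
    """frame_ids: (T,) frame index of each token.
    Returns a (T, T) bool mask, True where attention is allowed
    (a query may attend to keys from its own or any earlier frame)."""
    return frame_ids[:, None] >= frame_ids[None, :]

def grouped_causal_attention(q, k, v, frame_ids):
    # q, k, v: (T, d) single-head tensors, kept minimal for clarity
    scores = (q @ k.T) / q.shape[-1] ** 0.5
    scores = scores.masked_fill(~grouped_causal_mask(frame_ids), float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Three frames with varying token counts (2, 3, and 2 tokens):
frame_ids = torch.tensor([0, 0, 1, 1, 1, 2, 2])
q = k = v = torch.randn(7, 16)
out = grouped_causal_attention(q, k, v, frame_ids)  # (7, 16)
```

Because the mask is derived from per-token frame indices rather than a fixed sequence length, frames with differing token counts need no padding, which matches the abstract's claim of adapting to varying token lengths across frames.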
Related papers
- Gaussian Sequences with Multi-Scale Dynamics for 4D Reconstruction from Monocular Casual Videos [7.422432435797114]
Real-world dynamics exhibit a multi-scale regularity, from the object level down to the particle level. We design a multi-scale dynamics mechanism that factorizes complex motion fields. Our approach enables accurate and globally consistent 4D reconstruction from monocular casual videos.
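The abstract does not give the factorization itself, but one common shape for multi-scale motion is a coarse object-level rigid transform refined by fine per-point residuals. The sketch below is that generic pattern, not the paper's exact mechanism:

```python
import torch

def factorized_motion(points, R_obj, t_obj, residual):
    """points: (N, 3) positions; R_obj: (3, 3) object-level rotation;
    t_obj: (3,) object-level translation; residual: (N, 3) fine offsets."""
    coarse = points @ R_obj.T + t_obj  # shared, object-scale rigid motion
    return coarse + residual           # per-point, particle-scale refinement

points = torch.randn(100, 3)
R = torch.eye(3)
t = torch.tensor([0.1, 0.0, 0.0])
res = 0.01 * torch.randn(100, 3)
moved = factorized_motion(points, R, t, res)  # (100, 3)
```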
arXiv Detail & Related papers (2026-02-14T14:30:25Z)
- 4D Primitive-Mâché: Glueing Primitives for Persistent 4D Scene Reconstruction [28.50411933478524]
We present a dynamic reconstruction system that receives a casual monocular RGB video as input and outputs a persistent reconstruction of the scene. In other words, we reconstruct not only the currently visible parts of the scene but also all previously viewed parts, which enables replaying the complete reconstruction across all timesteps.
arXiv Detail & Related papers (2025-12-18T14:06:15Z)
- 4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos [52.89084603734664]
We present 4D3R, a pose-free dynamic neural rendering framework that decouples static and dynamic components through a two-stage approach. Our approach achieves up to 1.8 dB PSNR improvement over state-of-the-art methods.
arXiv Detail & Related papers (2025-11-07T13:25:50Z)
- PAGE-4D: Disentangled Pose and Geometry Estimation for 4D Perception [39.819707648812944]
PAGE-4D is a feedforward model that extends VGGT to dynamic scenes without post-processing. It disentangles static and dynamic information by predicting a dynamics-aware mask. Experiments show that PAGE-4D consistently outperforms the original VGGT in dynamic scenarios.
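How the dynamics-aware mask feeds back into the network is not specified here; one plausible use, sketched under our own assumptions, is to bias pose-related attention away from tokens that are likely dynamic:

```python
import torch
import torch.nn.functional as F

def pose_attention(q, k, v, dynamic_prob):
    """q, k, v: (T, d); dynamic_prob: (T,) in [0, 1], higher = more dynamic.
    Down-weights keys by the log of their static probability."""
    scores = (q @ k.T) / q.shape[-1] ** 0.5
    scores = scores + torch.log1p(-dynamic_prob.clamp(max=0.999))[None, :]
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(10, 16)
out = pose_attention(q, k, v, dynamic_prob=torch.rand(10))  # (10, 16)
```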
arXiv Detail & Related papers (2025-10-20T14:17:16Z)
- C4D: 4D Made from 3D through Dual Correspondences [77.04731692213663]
We introduce C4D, a framework that leverages temporal correspondences to extend an existing 3D reconstruction formulation to 4D. C4D captures two types of correspondences: short-term optical flow and long-term point tracking. We train a dynamic-aware point tracker that provides additional mobility information.
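A quick numeric illustration of why both correspondence types help: chaining short-term flow drifts, because each step resamples the flow field at an approximate location, which is exactly what a long-term point tracker avoids by querying a track directly. The helper below is purely illustrative:

```python
import numpy as np

def chain_flow(p0, flows):
    """p0: (x, y) start pixel; flows: list of (H, W, 2) frame-to-frame
    flow fields. Each step does a nearest-neighbor lookup, so small
    per-step errors accumulate over long chains."""
    p = np.asarray(p0, dtype=np.float64)
    H, W = flows[0].shape[:2]
    for flow in flows:
        x = int(np.clip(np.rint(p[0]), 0, W - 1))
        y = int(np.clip(np.rint(p[1]), 0, H - 1))
        p = p + flow[y, x]
    return p

H, W = 48, 64
flows = [np.random.randn(H, W, 2) * 0.5 for _ in range(5)]
p_chained = chain_flow((32.0, 24.0), flows)  # drifts as chain length grows
```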
arXiv Detail & Related papers (2025-10-16T17:59:06Z)
- D$^2$USt3R: Enhancing 3D Reconstruction for Dynamic Scenes [54.886845755635754]
This work addresses the task of 3D reconstruction in dynamic scenes, where object motions frequently degrade the quality of previous 3D pointmap regression methods. By explicitly incorporating both spatial and temporal aspects, our approach successfully encapsulates dense 3D correspondence into the proposed pointmaps.
arXiv Detail & Related papers (2025-04-08T17:59:50Z)
- Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos [110.3924779333809]
We present BTimer, the first motion-aware feed-forward model for real-time reconstruction and novel view synthesis of dynamic scenes. Our approach reconstructs the full scene in a 3D Gaussian Splatting representation at a given target ('bullet') timestamp by aggregating information from all the context frames. Given a casual monocular dynamic video, BTimer reconstructs a bullet-time scene within 150 ms while reaching state-of-the-art performance on both static and dynamic scene datasets.
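As a rough sketch of the bullet-timestamp idea, one can condition aggregated context-frame features on an embedding of the target time and decode per-point Gaussian parameters. Every module, name, and shape below is our assumption, not BTimer's architecture:

```python
import torch
import torch.nn as nn

class BulletTimeHead(nn.Module):
    def __init__(self, dim=64, gaussian_params=14):
        # 14 ~ mean (3) + scale (3) + quaternion (4) + opacity (1) + rgb (3)
        super().__init__()
        self.time_embed = nn.Linear(1, dim)
        self.decode = nn.Linear(dim, gaussian_params)

    def forward(self, context_feats, t_bullet):
        # context_feats: (F, N, dim) features from F context frames;
        # t_bullet: scalar target timestamp in [0, 1].
        t = self.time_embed(torch.tensor([[t_bullet]]))  # (1, dim)
        fused = (context_feats + t).mean(dim=0)          # naive frame aggregation
        return self.decode(fused)                        # (N, gaussian_params)

head = BulletTimeHead()
feats = torch.randn(4, 1024, 64)
gaussians = head(feats, t_bullet=0.5)  # (1024, 14)
```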
arXiv Detail & Related papers (2024-12-04T18:15:06Z)
- DRSM: Efficient Neural 4D Decomposition for Dynamic Reconstruction in Stationary Monocular Cameras [21.07910546072467]
We present a novel framework to tackle the 4D decomposition problem for dynamic scenes in monocular cameras.
Our framework utilizes decomposed static and dynamic feature planes to represent 4D scenes and emphasizes the learning of dynamic regions through dense ray casting.
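Decomposed static and dynamic feature planes are commonly realized K-Planes-style: spatial planes (xy, xz, yz) carry static structure, space-time planes (xt, yt, zt) carry motion, and the sampled features are fused by elementwise product. A hedged sketch of that pattern; the paper's exact plane set and fusion may differ:

```python
import torch
import torch.nn.functional as F

def sample_plane(plane, coords):
    """plane: (1, C, H, W) feature grid; coords: (N, 2) in [-1, 1].
    Returns (N, C) bilinearly interpolated features."""
    grid = coords.view(1, -1, 1, 2)
    return F.grid_sample(plane, grid, align_corners=True).squeeze(-1).squeeze(0).T

def query_4d(xyzt, static_planes, dynamic_planes):
    x, y, z, t = xyzt.unbind(-1)
    # static structure: spatial planes
    feat = sample_plane(static_planes["xy"], torch.stack([x, y], -1))
    feat = feat * sample_plane(static_planes["xz"], torch.stack([x, z], -1))
    feat = feat * sample_plane(static_planes["yz"], torch.stack([y, z], -1))
    # dynamic motion: space-time planes
    for name, a in (("xt", x), ("yt", y), ("zt", z)):
        feat = feat * sample_plane(dynamic_planes[name], torch.stack([a, t], -1))
    return feat

C, R = 8, 32
static_planes = {k: torch.randn(1, C, R, R) for k in ("xy", "xz", "yz")}
dynamic_planes = {k: torch.randn(1, C, R, R) for k in ("xt", "yt", "zt")}
feats = query_4d(torch.rand(16, 4) * 2 - 1, static_planes, dynamic_planes)  # (16, 8)
```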
arXiv Detail & Related papers (2024-02-01T16:38:51Z)
- Class-agnostic Reconstruction of Dynamic Objects from Videos [127.41336060616214]
We introduce REDO, a class-agnostic framework to REconstruct the Dynamic Objects from RGBD or calibrated videos.
We develop two novel modules. First, we introduce a canonical 4D implicit function which is pixel-aligned with aggregated temporal visual cues.
Second, we develop a 4D transformation module which captures object dynamics to support temporal propagation and aggregation.
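The two modules compose naturally: the 4D transformation warps a query point at time t into a canonical frame, where the pixel-aligned implicit function predicts occupancy. The toy sketch below uses placeholder MLPs and names, not REDO's actual architecture:

```python
import torch
import torch.nn as nn

class Warp4D(nn.Module):
    """Maps a point observed at time t to the canonical frame via an offset."""
    def __init__(self, dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, dim), nn.ReLU(), nn.Linear(dim, 3))

    def forward(self, xyz, t):
        xyzt = torch.cat([xyz, t.expand(len(xyz), 1)], dim=-1)
        return xyz + self.net(xyzt)

class CanonicalOcc(nn.Module):
    """Occupancy in the canonical frame, conditioned on pixel-aligned features."""
    def __init__(self, feat_dim=16, dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3 + feat_dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, xyz_canonical, pixel_feats):
        return torch.sigmoid(self.net(torch.cat([xyz_canonical, pixel_feats], -1)))

warp, occ = Warp4D(), CanonicalOcc()
pts = torch.randn(64, 3)
occupancy = occ(warp(pts, torch.tensor([0.3])), torch.randn(64, 16))  # (64, 1)
```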
arXiv Detail & Related papers (2021-12-03T18:57:47Z)