Flow4R: Unifying 4D Reconstruction and Tracking with Scene Flow
- URL: http://arxiv.org/abs/2602.14021v1
- Date: Sun, 15 Feb 2026 06:58:08 GMT
- Title: Flow4R: Unifying 4D Reconstruction and Tracking with Scene Flow
- Authors: Shenhan Qian, Ganlin Zhang, Shangzhe Wu, Daniel Cremers
- Abstract summary: Flow4R predicts a minimal per-pixel property set (3D point position, scene flow, pose weight, and confidence) from two-view inputs using a Vision Transformer. Trained jointly on static and dynamic datasets, Flow4R achieves state-of-the-art performance on 4D reconstruction and tracking tasks.
- Score: 61.297800738187355
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Reconstructing and tracking dynamic 3D scenes remains a fundamental challenge in computer vision. Existing approaches often decouple geometry from motion: multi-view reconstruction methods assume static scenes, while dynamic tracking frameworks rely on explicit camera pose estimation or separate motion models. We propose Flow4R, a unified framework that treats camera-space scene flow as the central representation linking 3D structure, object motion, and camera motion. Flow4R predicts a minimal per-pixel property set (3D point position, scene flow, pose weight, and confidence) from two-view inputs using a Vision Transformer. This flow-centric formulation allows local geometry and bidirectional motion to be inferred symmetrically with a shared decoder in a single forward pass, without requiring explicit pose regressors or bundle adjustment. Trained jointly on static and dynamic datasets, Flow4R achieves state-of-the-art performance on 4D reconstruction and tracking tasks, demonstrating the effectiveness of the flow-centric representation for spatiotemporal scene understanding.
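The abstract's design is concrete enough to sketch. Below is a minimal, hypothetical PyTorch head that maps ViT patch tokens of one view to the four per-pixel properties; the feature dimension, patch size, output activations, and all names are assumptions for illustration, not the authors' released code. In the paper's formulation the same shared decoder would run symmetrically on both views to produce bidirectional flow in a single forward pass.

```python
# Hypothetical sketch of a Flow4R-style shared per-pixel head (assumed
# shapes and activations; not the authors' implementation).
import torch
import torch.nn as nn

class PerPixelHead(nn.Module):
    """Map ViT patch tokens of one view to the per-pixel property set:
    3D point (3) + scene flow (3) + pose weight (1) + confidence (1)."""

    def __init__(self, feat_dim: int = 768, patch: int = 16):
        super().__init__()
        self.patch = patch
        # Expand each patch token into patch*patch pixels with 8 channels each.
        self.proj = nn.Linear(feat_dim, patch * patch * 8)

    def forward(self, tokens: torch.Tensor, h: int, w: int) -> dict:
        # tokens: (B, N, C) with N = (h // patch) * (w // patch)
        b = tokens.shape[0]
        p = self.patch
        x = self.proj(tokens)                                 # (B, N, p*p*8)
        x = x.view(b, h // p, w // p, p, p, 8)
        x = x.permute(0, 5, 1, 3, 2, 4).reshape(b, 8, h, w)  # unpatchify
        return {
            "points": x[:, 0:3],                 # camera-space 3D positions
            "flow": x[:, 3:6],                   # camera-space scene flow
            "pose_weight": x[:, 6:7].sigmoid(),  # soft weight for camera motion
            "confidence": x[:, 7:8].exp(),       # positive confidence
        }

# Example: a 224x224 view with 16x16 patches yields 196 tokens.
head = PerPixelHead()
props = head(torch.randn(2, 196, 768), 224, 224)
assert props["points"].shape == (2, 3, 224, 224)
```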
Related papers
- RU4D-SLAM: Reweighting Uncertainty in Gaussian Splatting SLAM for 4D Scene Reconstruction [8.13353479857245]
4D reconstruction, especially 4D Gaussian splatting, offers a promising direction for addressing these challenges. We propose a robust and efficient framework, namely Reweighting Uncertainty in Gaussian Splatting SLAM (RU4D-SLAM), for 4D scene reconstruction. Our method substantially outperforms state-of-the-art approaches in both trajectory accuracy and 4D scene reconstruction.
arXiv Detail & Related papers (2026-02-24T11:47:43Z)
- 4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos [52.89084603734664]
We present 4D3R, a pose-free dynamic neural rendering framework that decouples static and dynamic components through a two-stage approach. Our approach achieves up to 1.8dB PSNR improvement over state-of-the-art methods.
arXiv Detail & Related papers (2025-11-07T13:25:50Z)
- C4D: 4D Made from 3D through Dual Correspondences [77.04731692213663]
We introduce C4D, a framework that leverages temporal correspondences to extend existing 3D reconstruction formulations to 4D. C4D captures two types of correspondences: short-term optical flow and long-term point tracking. We train a dynamic-aware point tracker that provides additional mobility information.
arXiv Detail & Related papers (2025-10-16T17:59:06Z)
- Estimating 2D Camera Motion with Hybrid Motion Basis [45.971928868591334]
CamFlow is a novel framework that represents camera motion using hybrid motion bases. Our approach includes a hybrid probabilistic loss function based on the Laplace distribution; a generic form of such a loss is sketched after this list. Experiments show CamFlow outperforms state-of-the-art methods across diverse scenarios.
arXiv Detail & Related papers (2025-07-30T08:30:37Z)
- Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction [86.099855111676]
Traditional SLAM systems struggle with highly dynamic scenes commonly found in casual videos. This work leverages a 3D point tracker to separate the camera-induced motion from the observed motion of dynamic objects. Our framework combines the core of traditional SLAM, bundle adjustment, with a robust learning-based 3D tracker front-end.
arXiv Detail & Related papers (2025-04-20T07:29:42Z)
- St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World [106.91539872943864]
St4RTrack is a framework that simultaneously reconstructs and tracks dynamic video content in a world coordinate frame from RGB inputs. We predict both pointmaps at the same moment, in the same world, capturing both static and dynamic scene geometry. We establish a new extensive benchmark for world-frame reconstruction and tracking, demonstrating the effectiveness and efficiency of our unified, data-driven framework.
arXiv Detail & Related papers (2025-04-17T17:55:58Z)
- D$^2$USt3R: Enhancing 3D Reconstruction for Dynamic Scenes [54.886845755635754]
This work addresses the task of 3D reconstruction in dynamic scenes, where object motions frequently degrade the quality of previous 3D pointmap regression methods. By explicitly incorporating both spatial and temporal aspects, our approach successfully encapsulates 3D dense correspondence into the proposed pointmaps.
arXiv Detail & Related papers (2025-04-08T17:59:50Z)
- Easi3R: Estimating Disentangled Motion from DUSt3R Without Training [69.51086319339662]
We introduce Easi3R, a simple yet efficient training-free method for 4D reconstruction. Our approach applies attention adaptation during inference, eliminating the need for from-scratch pre-training or network fine-tuning. Our experiments on real-world dynamic videos demonstrate that our lightweight attention adaptation significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2025-03-31T17:59:58Z)
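Picking up the forward reference from the CamFlow entry above: its abstract mentions a hybrid probabilistic loss based on the Laplace distribution. The sketch below gives only the generic (non-hybrid) Laplace negative log-likelihood, written from the standard density, as one plausible building block; the paper's actual hybrid weighting is not reproduced here.

```python
# Generic Laplace negative log-likelihood (standard density; an assumed
# building block, not CamFlow's actual hybrid loss).
import torch

def laplace_nll(pred: torch.Tensor, target: torch.Tensor,
                log_b: torch.Tensor) -> torch.Tensor:
    """NLL of `target` under Laplace(pred, b) with b = exp(log_b):
    |target - pred| / b + log(2b), averaged over elements. This behaves
    like a robust L1 penalty with a learned per-element scale."""
    b = log_b.exp()
    return ((target - pred).abs() / b + (2 * b).log()).mean()
```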