4DRadar-GS: Self-Supervised Dynamic Driving Scene Reconstruction with 4D Radar
- URL: http://arxiv.org/abs/2509.12931v1
- Date: Tue, 16 Sep 2025 10:29:43 GMT
- Title: 4DRadar-GS: Self-Supervised Dynamic Driving Scene Reconstruction with 4D Radar
- Authors: Xiao Tang, Guirong Zhuo, Cong Wang, Boyuan Zheng, Minqing Huang, Lianqing Zheng, Long Chen, Shouyi Lu
- Abstract summary: We present a 4D Radar-augmented self-supervised 3D reconstruction framework tailored for dynamic driving scenes. 4DRadar-GS achieves state-of-the-art performance in dynamic driving scene 3D reconstruction.
- Score: 15.713470339586058
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D reconstruction and novel view synthesis are critical for validating autonomous driving systems and training advanced perception models. Recent self-supervised methods have gained significant attention due to their cost-effectiveness and enhanced generalization in scenarios where annotated bounding boxes are unavailable. However, existing approaches, which often rely on frequency-domain decoupling or optical flow, struggle to accurately reconstruct dynamic objects due to imprecise motion estimation and weak temporal consistency, resulting in incomplete or distorted representations of dynamic scene elements. To address these challenges, we propose 4DRadar-GS, a 4D Radar-augmented self-supervised 3D reconstruction framework tailored for dynamic driving scenes. Specifically, we first present a 4D Radar-assisted Gaussian initialization scheme that leverages 4D Radar's velocity and spatial information to segment dynamic objects and recover monocular depth scale, generating accurate Gaussian point representations. In addition, we propose a Velocity-guided PointTrack (VGPT) model, which is jointly trained with the reconstruction pipeline under scene flow supervision, to track fine-grained dynamic trajectories and construct temporally consistent representations. Evaluated on the OmniHD-Scenes dataset, 4DRadar-GS achieves state-of-the-art performance in dynamic driving scene 3D reconstruction.
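The abstract's two contributions lend themselves to a compact illustration. The sketch below is a minimal, hypothetical rendering of the radar-assisted initialization and of the VGPT tracker's scene-flow supervision signal, assuming radar points are already expressed in the camera frame with known pixel projections; every function name, shape, and the 0.5 m/s velocity threshold is an illustrative assumption, not the paper's implementation.

```python
import numpy as np

def segment_dynamic_points(radar_xyz, radar_doppler, ego_velocity, thresh=0.5):
    """Split 4D radar returns into static vs. dynamic via Doppler consistency.

    For a stationary world point, the measured radial (Doppler) velocity
    equals the negative projection of the ego velocity onto the line of
    sight, so a residual above `thresh` (m/s) flags the point as dynamic.
    """
    los = radar_xyz / np.linalg.norm(radar_xyz, axis=1, keepdims=True)
    expected_static = -(los @ ego_velocity)           # (N,) expected Doppler
    residual = np.abs(radar_doppler - expected_static)
    return residual > thresh                          # True = dynamic point

def recover_depth_scale(mono_depth, radar_xyz_cam, pixel_uv, static_mask):
    """Estimate a global scale for monocular depth from static radar returns.

    `radar_xyz_cam` holds radar points in the camera frame and `pixel_uv`
    their integer image projections (projection assumed done upstream);
    the median of per-point ratios is robust to sparse mis-associations.
    """
    metric_z = radar_xyz_cam[static_mask, 2]                      # radar depth
    predicted_z = mono_depth[pixel_uv[static_mask, 1],
                             pixel_uv[static_mask, 0]]            # network depth
    return np.median(metric_z / predicted_z)

def scene_flow_supervision(pts_t, pts_t1_pred, scene_flow):
    """L1 gap between tracked point motion and the supervising scene flow,
    standing in for how a VGPT-style tracker could be trained jointly
    with the reconstruction pipeline."""
    return np.mean(np.abs((pts_t1_pred - pts_t) - scene_flow))
```

Points that pass the Doppler test at metric scale are what the abstract describes as seeding "accurate Gaussian point representations" without annotated bounding boxes.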
Related papers
- RU4D-SLAM: Reweighting Uncertainty in Gaussian Splatting SLAM for 4D Scene Reconstruction [8.13353479857245]
4D reconstruction, especially 4D Gaussian splatting, offers a promising direction for addressing these challenges. We propose a robust and efficient framework, Reweighting Uncertainty in Gaussian Splatting SLAM (RU4D-SLAM), for 4D scene reconstruction. Our method substantially outperforms state-of-the-art approaches in both trajectory accuracy and 4D scene reconstruction.
arXiv Detail & Related papers (2026-02-24T11:47:43Z)
- Flow4R: Unifying 4D Reconstruction and Tracking with Scene Flow [61.297800738187355]
Flow4R predicts a minimal set of per-pixel properties (3D point position, scene flow, pose weight, and confidence) from two-view inputs using a Vision Transformer. Trained jointly on static and dynamic datasets, Flow4R achieves state-of-the-art performance on 4D reconstruction and tracking tasks.
arXiv Detail & Related papers (2026-02-15T06:58:08Z)
- Flux4D: Flow-based Unsupervised 4D Reconstruction [30.764886648248222]
Reconstructing large-scale dynamic scenes from visual observations is a fundamental challenge in computer vision. We introduce Flux4D, a simple and scalable framework for 4D reconstruction of large-scale dynamic scenes. Our approach enables efficient reconstruction of dynamic scenes within seconds, scales effectively to large datasets, and generalizes well to unseen environments.
arXiv Detail & Related papers (2025-12-02T20:28:45Z)
- C4D: 4D Made from 3D through Dual Correspondences [77.04731692213663]
We introduce C4D, a framework that leverages temporal correspondences to extend existing 3D reconstruction formulations to 4D. C4D captures two types of correspondences: short-term optical flow and long-term point tracking. We train a dynamic-aware point tracker that provides additional mobility information.
arXiv Detail & Related papers (2025-10-16T17:59:06Z)
- Easi3R: Estimating Disentangled Motion from DUSt3R Without Training [48.87063562819018]
We introduce Easi3R, a simple yet efficient training-free method for 4D reconstruction. Our approach applies attention adaptation during inference, eliminating the need for from-scratch pre-training or network fine-tuning. Our experiments on real-world dynamic videos demonstrate that our lightweight attention adaptation significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2025-03-31T17:59:58Z)
- CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving [12.006435326659526]
We introduce a novel 4D Gaussian Splatting (4DGS) approach to improve dynamic scene rendering. Specifically, we employ a 2D semantic segmentation foundation model to self-supervise the 4D semantic features of Gaussians. By aggregating and encoding both semantic and temporal deformation features, each Gaussian is equipped with cues for potential deformation compensation.
arXiv Detail & Related papers (2025-03-09T19:58:51Z)
- STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes [47.4799413169038]
STORM is a spatio-temporal reconstruction model designed for reconstructing dynamic outdoor scenes from sparse observations. We show that STORM achieves precise dynamic scene reconstruction, surpassing state-of-the-art per-scene optimization methods. We also showcase four additional applications of our model, illustrating the potential of self-supervised learning for broader dynamic scene understanding.
arXiv Detail & Related papers (2024-12-31T18:59:58Z)
- 4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives [115.67081491747943]
Dynamic 3D scene representation and novel view synthesis are crucial for enabling AR/VR and metaverse applications. We reformulate the reconstruction of a time-varying 3D scene as approximating its underlying 4D volume (one possible reading is sketched after this list). We derive several compact variants that effectively reduce the memory footprint to address its storage bottleneck.
arXiv Detail & Related papers (2024-12-30T05:30:26Z)
- Driv3R: Learning Dense 4D Reconstruction for Autonomous Driving [116.10577967146762]
We propose Driv3R, a framework that directly regresses per-frame point maps from multi-view image sequences. We employ a 4D flow predictor to identify moving objects within the scene, directing the network's focus toward reconstructing these dynamic regions. Driv3R outperforms previous frameworks in 4D dynamic scene reconstruction, achieving 15x faster inference speed.
arXiv Detail & Related papers (2024-12-09T18:58:03Z)
- UrbanGS: Semantic-Guided Gaussian Splatting for Urban Scene Reconstruction [86.4386398262018]
UrbanGS uses 2D semantic maps and an existing dynamic Gaussian approach to distinguish static objects from the scene. For potentially dynamic objects, we aggregate temporal information using learnable time embeddings. Our approach outperforms state-of-the-art methods in reconstruction quality and efficiency.
arXiv Detail & Related papers (2024-12-04T16:59:49Z)
- Dynamics-Aware Gaussian Splatting Streaming Towards Fast On-the-Fly 4D Reconstruction [15.588032729272536]
Current 3DGS-based streaming methods treat the Gaussian primitives uniformly and constantly renew the densified Gaussians. We propose a novel three-stage pipeline for iterative, streamable 4D dynamic spatial reconstruction. Our method achieves state-of-the-art performance in online 4D reconstruction, demonstrating the fastest on-the-fly training, superior representation quality, and real-time rendering capability.
arXiv Detail & Related papers (2024-11-22T10:47:47Z)
- S4D: Streaming 4D Real-World Reconstruction with Gaussians and 3D Control Points [30.46796069720543]
We introduce a novel approach for streaming 4D real-world reconstruction utilizing discrete 3D control points.
This method physically models local rays and establishes a motion-decoupling coordinate system.
By effectively merging traditional graphics with learnable pipelines, it provides a robust and efficient local 6-degrees-of-freedom (6 DoF) motion representation.
arXiv Detail & Related papers (2024-08-23T12:51:49Z)
- EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision [85.17951804790515]
EmerNeRF is a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes.
It simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping.
Our method achieves state-of-the-art performance in sensor simulation.
arXiv Detail & Related papers (2023-11-03T17:59:55Z)
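As a concrete reading of the "native 4D primitives" idea flagged in the 4D Gaussian Splatting entry above, the sketch below slices a space-time Gaussian at a query time via standard Gaussian conditioning; it is an illustrative reconstruction of that operation, not the authors' released code.

```python
import numpy as np

def slice_4d_gaussian(mu, cov, t):
    """Condition a 4D (x, y, z, t) Gaussian on time t.

    Returns the conditional 3D mean and covariance plus a temporal weight
    from the marginal over t; the weight can modulate opacity so that
    primitives fade in and out of existence over time.
    """
    mu_s, mu_t = mu[:3], mu[3]
    cov_ss = cov[:3, :3]                        # spatial block
    cov_st = cov[:3, 3]                         # space-time cross-covariance
    var_t = cov[3, 3]                           # temporal variance
    gain = cov_st / var_t
    mu_cond = mu_s + gain * (t - mu_t)          # conditional spatial mean
    cov_cond = cov_ss - np.outer(gain, cov_st)  # conditional spatial covariance
    weight = np.exp(-0.5 * (t - mu_t) ** 2 / var_t)  # temporal falloff
    return mu_cond, cov_cond, weight
```

Because the conditional mean moves linearly in t, each such primitive carries a built-in constant-velocity motion model.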