ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
- URL: http://arxiv.org/abs/2503.18438v1
- Date: Mon, 24 Mar 2025 08:40:20 GMT
- Title: ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
- Authors: Guosheng Zhao, Xiaofeng Wang, Chaojun Ni, Zheng Zhu, Wenkang Qin, Guan Huang, Xingang Wang,
- Abstract summary: ReconDreamer has demonstrated remarkable success in rendering large-scale maneuvers. A significant gap remains between the generated data and real-world sensor observations. We propose ReconDreamer++, an enhanced framework that significantly improves the overall rendering quality. In particular, it achieves substantial improvements, including a 6.1% increase in NTA-IoU, a 23.0% improvement in FID, and a remarkable 4.5% gain in the ground surface metric NTL-IoU.
- Score: 30.16598076671646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Combining reconstruction models with generative models has emerged as a promising paradigm for closed-loop simulation in autonomous driving. For example, ReconDreamer has demonstrated remarkable success in rendering large-scale maneuvers. However, a significant gap remains between the generated data and real-world sensor observations, particularly in terms of fidelity for structured elements, such as the ground surface. To address these challenges, we propose ReconDreamer++, an enhanced framework that significantly improves the overall rendering quality by mitigating the domain gap and refining the representation of the ground surface. Specifically, ReconDreamer++ introduces the Novel Trajectory Deformable Network (NTDNet), which leverages learnable spatial deformation mechanisms to bridge the domain gap between synthesized novel views and original sensor observations. Moreover, for structured elements such as the ground surface, we preserve geometric prior knowledge in 3D Gaussians, and the optimization process focuses on refining appearance attributes while preserving the underlying geometric structure. Experimental evaluations conducted on multiple datasets (Waymo, nuScenes, PandaSet, and EUVS) confirm the superior performance of ReconDreamer++. Specifically, on Waymo, ReconDreamer++ achieves performance comparable to Street Gaussians for the original trajectory while significantly outperforming ReconDreamer on novel trajectories. In particular, it achieves substantial improvements, including a 6.1% increase in NTA-IoU, a 23. 0% improvement in FID, and a remarkable 4.5% gain in the ground surface metric NTL-IoU, highlighting its effectiveness in accurately reconstructing structured elements such as the road surface.
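The abstract's appearance-only refinement of 3D Gaussians can be illustrated with a minimal sketch. This is hypothetical plain-Python code, not the authors' implementation: attribute names (`position`, `scale`, `rotation`, `color`, `opacity`) and the plain SGD step are assumptions standing in for a real 3D Gaussian Splatting pipeline.

```python
# Hypothetical sketch: preserve geometric priors in 3D Gaussians and
# optimize only appearance attributes, as described for structured
# elements such as the ground surface.

GEOMETRY_KEYS = ("position", "scale", "rotation")   # kept fixed (prior)
APPEARANCE_KEYS = ("color", "opacity")              # refined by training

def split_parameters(gaussians):
    """Partition each Gaussian's attributes into frozen geometry and
    trainable appearance dictionaries."""
    frozen = [{k: g[k] for k in GEOMETRY_KEYS} for g in gaussians]
    trainable = [{k: list(g[k]) for k in APPEARANCE_KEYS} for g in gaussians]
    return frozen, trainable

def sgd_step(trainable, grads, lr=0.01):
    """One plain gradient step on appearance attributes only; geometry
    never enters this update, so its prior structure stays intact."""
    for params, grad in zip(trainable, grads):
        for k in params:
            params[k] = [p - lr * gp for p, gp in zip(params[k], grad[k])]
    return trainable
```

In an actual pipeline the gradients would come from a photometric loss against sensor observations; here they are stand-in lists to show where the geometry/appearance split sits.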
Related papers
- Enhancing Steering Estimation with Semantic-Aware GNNs [41.89219383258699]
Hybrid architectures combine 3D neural network models with recurrent neural networks (RNNs) for temporal modeling. We evaluate four hybrid 3D models, all of which outperform the 2D-only baseline. We validate our approach on the KITTI dataset, achieving a 71% improvement over 2D-only models.
arXiv Detail & Related papers (2025-03-21T13:58:08Z) - REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints [48.80178020541189]
REArtGS is a novel framework that introduces additional geometric and motion constraints to 3D Gaussian primitives. We establish deformable fields for 3D Gaussians constrained by the kinematic structures of articulated objects, achieving unsupervised generation of surface meshes in unseen states.
arXiv Detail & Related papers (2025-03-09T16:05:36Z) - DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data [61.62554324594797]
We propose DreamMask, which explores how to generate training data in the open-vocabulary setting, and how to train the model with both real and synthetic data. In general, DreamMask significantly simplifies the collection of large-scale training data, serving as a plug-and-play enhancement for existing methods. For instance, when trained on COCO and tested on ADE20K, the model equipped with DreamMask outperforms the previous state-of-the-art by a substantial margin of 2.1% mIoU.
arXiv Detail & Related papers (2025-01-03T19:00:00Z) - Uni-SLAM: Uncertainty-Aware Neural Implicit SLAM for Real-Time Dense Indoor Scene Reconstruction [11.714682609560278]
We propose Uni-SLAM, a decoupled 3D spatial representation based on hash grids for indoor reconstruction. Experiments on synthetic and real-world datasets demonstrate that our system achieves state-of-the-art tracking and mapping accuracy.
arXiv Detail & Related papers (2024-11-29T20:16:58Z) - ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration [30.755679955159504]
ReconDreamer enhances driving scene reconstruction through incremental integration of world model knowledge. To the best of our knowledge, ReconDreamer is the first method to effectively render large maneuvers.
arXiv Detail & Related papers (2024-11-29T08:47:46Z) - GausSurf: Geometry-Guided 3D Gaussian Splatting for Surface Reconstruction [79.42244344704154]
GausSurf employs geometry guidance from multi-view consistency in texture-rich areas and normal priors in texture-less areas of a scene.
Our method surpasses state-of-the-art methods in terms of reconstruction quality and computation time.
arXiv Detail & Related papers (2024-11-29T03:54:54Z) - SMORE: Simultaneous Map and Object REconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR. We take a holistic perspective and optimize a compositional model of a dynamic scene that decomposes the world into rigidly-moving objects and the background.
arXiv Detail & Related papers (2024-06-19T23:53:31Z) - RaNeuS: Ray-adaptive Neural Surface Reconstruction [87.20343320266215]
We leverage a differentiable radiance field, e.g. NeRF, to reconstruct detailed 3D surfaces in addition to producing novel view renderings.
Considering that different methods formulate and optimize the projection from SDF to radiance field with a globally constant Eikonal regularization, we instead improve on this with a ray-wise weighting factor.
Our proposed RaNeuS is extensively evaluated on both synthetic and real datasets.
arXiv Detail & Related papers (2024-06-14T07:54:25Z) - GaussianRoom: Improving 3D Gaussian Splatting with SDF Guidance and Monocular Cues for Indoor Scene Reconstruction [5.112375652774415]
We propose a unified optimization framework that integrates neural signed distance fields (SDFs) with 3DGS for accurate geometry reconstruction and real-time rendering.
Our method achieves state-of-the-art performance in both surface reconstruction and novel view synthesis.
arXiv Detail & Related papers (2024-05-30T03:46:59Z) - UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation [101.2317840114147]
We present UniDream, a text-to-3D generation framework by incorporating unified diffusion priors.
Our approach consists of three main components: (1) a dual-phase training process to get albedo-normal aligned multi-view diffusion and reconstruction models, (2) a progressive generation procedure for geometry and albedo textures based on Score Distillation Sampling (SDS) using the trained reconstruction and diffusion models, and (3) an innovative application of SDS for finalizing PBR generation while keeping a fixed albedo based on the Stable Diffusion model.
arXiv Detail & Related papers (2023-12-14T09:07:37Z) - MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis [26.710960922302124]
We propose a real-world Multi-Sensor Hybrid Room dataset (MuSHRoom). Our dataset presents exciting challenges and requires state-of-the-art methods to be cost-effective and robust to noisy data and devices. We benchmark several famous pipelines on our dataset for joint 3D mesh reconstruction and novel view synthesis.
arXiv Detail & Related papers (2023-11-05T21:46:12Z) - Quaternion-Based Graph Convolution Network for Recommendation [45.005089037955536]
Graph Convolution Network (GCN) has been widely applied in recommender systems.
GCN is vulnerable to noisy and incomplete graphs, which are common in the real world.
We propose a Quaternion-based Graph Convolution Network (QGCN) recommendation model.
arXiv Detail & Related papers (2021-11-20T07:42:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.