MADrive: Memory-Augmented Driving Scene Modeling
- URL: http://arxiv.org/abs/2506.21520v1
- Date: Thu, 26 Jun 2025 17:41:07 GMT
- Title: MADrive: Memory-Augmented Driving Scene Modeling
- Authors: Polina Karpikova, Daniil Selikhanovych, Kirill Struminsky, Ruslan Musaev, Maria Golitsyna, Dmitry Baranchuk,
- Abstract summary: MADrive is a memory-augmented reconstruction framework designed to extend the capabilities of existing scene reconstruction methods. It replaces observed vehicles with visually similar 3D assets retrieved from a large-scale external memory bank. The resulting replacements provide complete multi-view representations of vehicles in the scene, enabling photorealistic synthesis of substantially altered configurations.
- Score: 8.604680698214196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in scene reconstruction have pushed toward highly realistic modeling of autonomous driving (AD) environments using 3D Gaussian splatting. However, the resulting reconstructions remain closely tied to the original observations and struggle to support photorealistic synthesis of significantly altered or novel driving scenarios. This work introduces MADrive, a memory-augmented reconstruction framework designed to extend the capabilities of existing scene reconstruction methods by replacing observed vehicles with visually similar 3D assets retrieved from a large-scale external memory bank. Specifically, we release MAD-Cars, a curated dataset of ~70K 360° car videos captured in the wild and present a retrieval module that finds the most similar car instances in the memory bank, reconstructs the corresponding 3D assets from video, and integrates them into the target scene through orientation alignment and relighting. The resulting replacements provide complete multi-view representations of vehicles in the scene, enabling photorealistic synthesis of substantially altered configurations, as demonstrated in our experiments. Project page: https://yandex-research.github.io/madrive/
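The retrieval step described in the abstract can be pictured as a nearest-neighbor lookup over a memory bank of car-asset embeddings. The sketch below is an illustrative assumption, not the released MADrive code: the encoder, embedding size, and similarity metric are hypothetical, and only cosine-similarity top-k retrieval is shown.

```python
# Minimal sketch of memory-bank retrieval by embedding similarity.
# Assumption: each MAD-Cars asset has a fixed-size visual embedding from
# some encoder; the actual MADrive retrieval module may differ.
import numpy as np

def build_memory_bank(asset_embeddings: np.ndarray) -> np.ndarray:
    """L2-normalize per-asset embeddings (one row per car asset)."""
    norms = np.linalg.norm(asset_embeddings, axis=1, keepdims=True)
    return asset_embeddings / np.clip(norms, 1e-8, None)

def retrieve_similar_cars(query_embedding: np.ndarray,
                          memory_bank: np.ndarray,
                          top_k: int = 5) -> np.ndarray:
    """Return indices of the top-k memory-bank assets by cosine similarity."""
    q = query_embedding / max(np.linalg.norm(query_embedding), 1e-8)
    scores = memory_bank @ q               # cosine similarity (bank rows are unit-norm)
    return np.argsort(-scores)[:top_k]     # best matches first

# Usage: embed the observed vehicle crop with any visual encoder, then look up
# the closest assets (MAD-Cars holds ~70K candidates; 1,000 used here for brevity).
bank = build_memory_bank(np.random.randn(1_000, 512).astype(np.float32))
query = np.random.randn(512).astype(np.float32)
print(retrieve_similar_cars(query, bank, top_k=3))
```

In the full pipeline, the retrieved assets would then be reconstructed from their source videos and aligned and relit before insertion into the target scene, as described in the abstract.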
Related papers
- SCPainter: A Unified Framework for Realistic 3D Asset Insertion and Novel View Synthesis [3.614325475261039]
3D asset insertion and novel view synthesis (NVS) are key components for autonomous driving simulation, enhancing the diversity of training data. We present SCPainter, a unified framework which integrates 3D Gaussian Splatting (GS) car asset representations and 3D scene point clouds with diffusion-based generation. The 3D GS assets and 3D scene point clouds are projected together into novel views, and these projections are used to condition a diffusion model to generate high quality images.
arXiv Detail & Related papers (2025-12-27T21:28:48Z)
- InstDrive: Instance-Aware 3D Gaussian Splatting for Driving Scenes [30.149975412543444]
In this paper, we present InstDrive, an instance-aware 3D Gaussian Splatting framework tailored for the interactive reconstruction of dynamic driving scenes. We use masks generated by SAM as pseudo ground-truth to guide 2D feature learning via contrastive loss and pseudo-supervised objectives. At the 3D level, we introduce regularization to implicitly encode instance identities and enforce consistency through a voxel-based loss.
arXiv Detail & Related papers (2025-08-16T11:17:31Z)
- LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans [64.31686158593351]
LiteReality is a novel pipeline that converts RGB-D scans of indoor environments into compact, realistic, and interactive 3D virtual replicas. LiteReality supports key features essential for graphics pipelines, such as object individuality, articulation, high-quality rendering materials, and physically based interaction. We demonstrate the effectiveness of LiteReality on both real-life scans and public datasets.
arXiv Detail & Related papers (2025-07-03T17:59:55Z)
- DreamDrive: Generative 4D Scene Modeling from Street View Images [55.45852373799639]
We present DreamDrive, a 4D spatial-temporal scene generation approach that combines the merits of generation and reconstruction. Specifically, we leverage the generative power of video diffusion models to synthesize a sequence of visual references. We then render 3D-consistent driving videos via Gaussian splatting.
arXiv Detail & Related papers (2024-12-31T18:59:57Z)
- DrivingRecon: Large 4D Gaussian Reconstruction Model For Autonomous Driving [83.27075316161086]
Photorealistic 4D reconstruction of street scenes is essential for developing real-world simulators in autonomous driving. We introduce the Large 4D Gaussian Reconstruction Model (DrivingRecon), a generalizable driving scene reconstruction model. We show that DrivingRecon significantly improves scene reconstruction quality and novel view synthesis compared to existing methods.
arXiv Detail & Related papers (2024-12-12T08:10:31Z)
- MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes [72.02827211293736]
We introduce MagicDrive3D, a novel pipeline for controllable 3D street scene generation.
Unlike previous methods that reconstruct before training the generative models, MagicDrive3D first trains a video generation model and then reconstructs from the generated data.
Our results demonstrate the framework's superior performance, showcasing its potential for autonomous driving simulation and beyond.
arXiv Detail & Related papers (2024-05-23T12:04:51Z)
- DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes [57.12439406121721]
We present DrivingGaussian, an efficient and effective framework for surrounding dynamic autonomous driving scenes.
For complex scenes with moving objects, we first sequentially and progressively model the static background of the entire scene.
We then leverage a composite dynamic Gaussian graph to handle multiple moving objects.
We further use a LiDAR prior for Gaussian Splatting to reconstruct scenes with greater details and maintain panoramic consistency.
arXiv Detail & Related papers (2023-12-13T06:30:51Z)
- AutoRecon: Automated 3D Object Discovery and Reconstruction [41.60050228813979]
We propose a novel framework named AutoRecon for the automated discovery and reconstruction of an object from multi-view images.
We demonstrate that foreground objects can be robustly located and segmented from SfM point clouds by leveraging self-supervised 2D vision transformer features.
Experiments on the DTU, BlendedMVS and CO3D-V2 datasets demonstrate the effectiveness and robustness of AutoRecon.
arXiv Detail & Related papers (2023-05-15T17:16:46Z)
- READ: Large-Scale Neural Scene Rendering for Autonomous Driving [21.144110676687667]
A large-scale neural rendering method is proposed to synthesize the autonomous driving scene.
Our model can not only synthesize realistic driving scenes but also stitch and edit driving scenes.
arXiv Detail & Related papers (2022-05-11T14:02:14Z)
- Recovering and Simulating Pedestrians in the Wild [81.38135735146015]
We propose to recover the shape and motion of pedestrians from sensor readings captured in the wild by a self-driving car driving around.
We incorporate the reconstructed pedestrian assets bank in a realistic 3D simulation system.
We show that the simulated LiDAR data can be used to significantly reduce the amount of real-world data required for visual perception tasks.
arXiv Detail & Related papers (2020-11-16T17:16:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.