LidarDM: Generative LiDAR Simulation in a Generated World
- URL: http://arxiv.org/abs/2404.02903v1
- Date: Wed, 3 Apr 2024 17:59:28 GMT
- Title: LidarDM: Generative LiDAR Simulation in a Generated World
- Authors: Vlas Zyrianov, Henry Che, Zhijian Liu, Shenlong Wang
- Abstract summary: LidarDM is a novel LiDAR generative model capable of producing realistic, layout-aware, physically plausible, and temporally coherent LiDAR videos.
We employ latent diffusion models to generate the 3D scene, combine it with dynamic actors to form the underlying 4D world, and subsequently produce realistic sensory observations within this virtual environment.
Our experiments indicate that our approach outperforms competing algorithms in realism, temporal coherency, and layout consistency.
- Score: 21.343346521878864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present LidarDM, a novel LiDAR generative model capable of producing realistic, layout-aware, physically plausible, and temporally coherent LiDAR videos. LidarDM stands out with two unprecedented capabilities in LiDAR generative modeling: (i) LiDAR generation guided by driving scenarios, offering significant potential for autonomous driving simulations, and (ii) 4D LiDAR point cloud generation, enabling the creation of realistic and temporally coherent sequences. At the heart of our model is a novel integrated 4D world generation framework. Specifically, we employ latent diffusion models to generate the 3D scene, combine it with dynamic actors to form the underlying 4D world, and subsequently produce realistic sensory observations within this virtual environment. Our experiments indicate that our approach outperforms competing algorithms in realism, temporal coherency, and layout consistency. We additionally show that LidarDM can be used as a generative world model simulator for training and testing perception models.
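As a rough illustration of the compose-then-render step described in the abstract, the sketch below places dynamic actors into a trivial static scene and ray-casts a temporally coherent sequence of point clouds. It is not the authors' implementation: the diffusion-based scene generation is omitted, and the scene, actors, sensor pattern, and all names are illustrative assumptions.

```python
# Minimal sketch: build a toy 4D world (ground plane + moving spherical
# actors) and render a LiDAR point-cloud sequence from it by ray casting.
import numpy as np

def lidar_rays(n_azimuth=360, n_elevation=16):
    """Unit ray directions for a spinning LiDAR (azimuth x elevation grid)."""
    az = np.linspace(-np.pi, np.pi, n_azimuth, endpoint=False)
    el = np.deg2rad(np.linspace(-15.0, 5.0, n_elevation))
    az, el = np.meshgrid(az, el, indexing="ij")
    d = np.stack([np.cos(el) * np.cos(az),
                  np.cos(el) * np.sin(az),
                  np.sin(el)], axis=-1)
    return d.reshape(-1, 3)

def ray_sphere(origin, dirs, center, radius):
    """Nearest positive hit distance per ray, or inf where the ray misses."""
    oc = origin - center
    b = dirs @ oc
    c = oc @ oc - radius ** 2
    disc = b ** 2 - c
    t = -b - np.sqrt(np.maximum(disc, 0.0))
    return np.where((disc > 0.0) & (t > 0.0), t, np.inf)

def render_frame(origin, dirs, actors):
    """Depth per ray against the ground plane z=0 plus spherical actors."""
    t = np.full(len(dirs), np.inf)
    down = dirs[:, 2] < 0.0
    t[down] = -origin[2] / dirs[down, 2]
    for center, radius in actors:
        t = np.minimum(t, ray_sphere(origin, dirs, center, radius))
    hit = np.isfinite(t)
    return origin + dirs[hit] * t[hit, None]   # (N, 3) point cloud

dirs = lidar_rays()
origin = np.array([0.0, 0.0, 1.8])             # sensor 1.8 m above ground
frames = []
for step in range(10):                         # 10 frames, 0.1 s apart
    t_sec = 0.1 * step
    actors = [(np.array([8.0, -2.0, 1.0]), 1.0),               # parked car
              (np.array([5.0 + 8.0 * t_sec, 3.0, 1.0]), 1.0)]  # moving car
    frames.append(render_frame(origin, dirs, actors))
print([f.shape for f in frames])               # coherent sequence over time
```

Because every frame is rendered from the same evolving world rather than generated independently, the sequence is temporally coherent by construction; this is the property the 4D-world formulation buys.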
Related papers
- LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences [10.426609103049572]
LiDARCrafter is a unified framework for 4D LiDAR generation and editing.
It achieves state-of-the-art performance in fidelity, controllability, and temporal consistency across all levels.
The code and benchmark are released to the community.
arXiv Detail & Related papers (2025-08-05T17:59:56Z)
- TopoLiDM: Topology-Aware LiDAR Diffusion Models for Interpretable and Realistic LiDAR Point Cloud Generation [15.223634903890863]
TopoLiDM is a novel framework that integrates graph neural networks with diffusion models under topological regularization for high-fidelity LiDAR generation.
Our approach first trains a topology-preserving VAE to extract latent graph representations through graph construction and multiple graph convolutional layers.
Extensive experiments on the KITTI-360 dataset demonstrate TopoLiDM's superiority over state-of-the-art methods.
arXiv Detail & Related papers (2025-07-30T08:02:42Z)
- R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation [78.26308457952636]
This paper introduces R3D2, a lightweight, one-step diffusion model designed to overcome limitations in autonomous driving simulation.
It enables realistic insertion of complete 3D assets into existing scenes by generating plausible rendering effects, such as shadows and consistent lighting, in real time.
We show that R3D2 significantly enhances the realism of inserted assets, enabling use cases like text-to-3D asset insertion and cross-scene/dataset object transfer.
arXiv Detail & Related papers (2025-06-09T14:50:19Z)
- GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control [50.67481583744243]
We introduce GeoDrive, which explicitly integrates robust 3D geometry conditions into driving world models.
We propose a dynamic editing module during training to enhance the renderings by editing the positions of the vehicles.
Our method significantly outperforms existing models in both action accuracy and 3D spatial awareness.
arXiv Detail & Related papers (2025-05-28T14:46:51Z)
- Learning 3D Persistent Embodied World Models [84.40585374179037]
We introduce a new persistent embodied world model with an explicit memory of previously generated content.
At generation time, our video diffusion model predicts RGB-D video of the agent's future observations.
This generation is then aggregated into a persistent 3D map of the environment.
arXiv Detail & Related papers (2025-05-05T17:59:17Z)
- Pre-Trained Video Generative Models as World Simulators [59.546627730477454]
We propose Dynamic World Simulation (DWS) to transform pre-trained video generative models into controllable world simulators.
To achieve precise alignment between conditioned actions and generated visual changes, we introduce a lightweight, universal action-conditioned module.
Experiments demonstrate that DWS can be versatilely applied to both diffusion and autoregressive transformer models.
arXiv Detail & Related papers (2025-02-10T14:49:09Z)
- GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting [3.376357029373187]
GS-LiDAR is a novel framework for generating realistic LiDAR point clouds with panoramic Gaussian splatting.
We introduce a novel panoramic rendering technique with explicit ray-splat intersection, guided by panoramic LiDAR supervision.
arXiv Detail & Related papers (2025-01-22T11:21:20Z)
- Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model [83.31688383891871]
We propose a Spatial-Temporal simulAtion for drivinG (Stag-1) model to reconstruct real-world scenes.
Stag-1 constructs continuous 4D point cloud scenes using surround-view data from autonomous vehicles.
It decouples spatial-temporal relationships and produces coherent driving videos.
arXiv Detail & Related papers (2024-12-06T18:59:56Z)
- DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes [61.07023022220073]
We introduce DynamicCity, a novel 4D LiDAR generation framework capable of generating large-scale, high-quality LiDAR scenes.
In particular, DynamicCity employs a novel Projection Module to effectively compress 4D LiDAR features into six 2D feature maps for HexPlane construction.
In addition, a Padded Rollout Operation reorganizes all six HexPlane feature planes into a square 2D feature map, as sketched below.
arXiv Detail & Related papers (2024-10-23T17:59:58Z)
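A rough sketch of the Padded Rollout idea described above, assuming the six HexPlane planes span the pairwise axes of (x, y, z, t): pad each plane to a common tile size, tile the six tiles on a grid, and pad the result to a square so a single 2D backbone can consume it. The plane shapes, grid layout, and padding scheme here are guesses, not the paper's design.

```python
# Hypothetical Padded Rollout: six HexPlane feature planes -> one square map.
import numpy as np

X, Y, Z, T, C = 32, 32, 8, 6, 16     # spatial dims, time steps, channels
planes = {                           # six HexPlane feature planes (H, W, C)
    "xy": np.random.randn(X, Y, C), "xz": np.random.randn(X, Z, C),
    "yz": np.random.randn(Y, Z, C), "xt": np.random.randn(X, T, C),
    "yt": np.random.randn(Y, T, C), "zt": np.random.randn(Z, T, C),
}

def padded_rollout(planes, cols=3):
    """Zero-pad each plane to a common tile, tile on a grid, pad to square."""
    ph = max(p.shape[0] for p in planes.values())
    pw = max(p.shape[1] for p in planes.values())
    c = next(iter(planes.values())).shape[2]
    rows = (len(planes) + cols - 1) // cols
    canvas = np.zeros((rows * ph, cols * pw, c))
    for i, p in enumerate(planes.values()):
        r, k = divmod(i, cols)
        canvas[r * ph: r * ph + p.shape[0],
               k * pw: k * pw + p.shape[1]] = p
    side = max(canvas.shape[:2])      # final square rollout
    out = np.zeros((side, side, c))
    out[:canvas.shape[0], :canvas.shape[1]] = canvas
    return out

rolled = padded_rollout(planes)       # (96, 96, 16): one square 2D map
print(rolled.shape)
```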
- LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
LiDAR simulation plays a crucial role in closed-loop simulation for autonomous driving.
We present LiDAR-GS, the first LiDAR Gaussian Splatting method, for real-time high-fidelity re-simulation of LiDAR sensor scans in public urban road scenes.
Our approach succeeds in simultaneously re-simulating depth, intensity, and ray-drop channels, achieving state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
- Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata [70.9375320609781]
We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, abundantly captured by autonomous vehicles (AVs).
We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable 3D generative model that grows geometry with local kernels in a coarse-to-fine manner and is equipped with a lightweight planner to induce global consistency.
arXiv Detail & Related papers (2024-06-12T14:56:56Z)
- OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving [62.54220021308464]
We propose a diffusion-based 4D occupancy generation model, OccSora, to simulate the development of the 3D world for autonomous driving.
OccSora can generate 16-second videos with authentic 3D layout and temporal consistency, demonstrating its ability to understand the spatial and temporal distributions of driving scenes.
arXiv Detail & Related papers (2024-05-30T17:59:42Z)
- Towards Realistic Scene Generation with LiDAR Diffusion Models [15.487070964070165]
Diffusion models (DMs) excel in photo-realistic image synthesis, but their adaptation to LiDAR scene generation poses a substantial hurdle.
We propose LiDAR Diffusion Models (LiDMs) to generate LiDAR-realistic scenes from a latent space tailored to capture the realism of LiDAR scenes.
Specifically, we introduce curve-wise compression to simulate real-world LiDAR patterns, point-wise coordinate supervision to learn scene geometry, and patch-wise encoding for a full 3D object context.
arXiv Detail & Related papers (2024-03-31T22:18:56Z)
- 3D-VLA: A 3D Vision-Language-Action Generative World Model [68.0388311799959]
Recent vision-language-action (VLA) models rely on 2D inputs, lacking integration with the broader realm of the 3D physical world.
We propose 3D-VLA by introducing a new family of embodied foundation models that seamlessly link 3D perception, reasoning, and action.
Our experiments on held-in datasets demonstrate that 3D-VLA significantly improves the reasoning, multimodal generation, and planning capabilities in embodied environments.
arXiv Detail & Related papers (2024-03-14T17:58:41Z)
- A Unified Generative Framework for Realistic Lidar Simulation in Autonomous Driving Systems [10.036860459686526]
Lidar is one of the most widely used perception sensors in autonomous driving systems.
Deep generative models have emerged as promising solutions to synthesise realistic sensory data.
We propose a unified generative framework to enhance Lidar simulation fidelity.
arXiv Detail & Related papers (2023-12-25T21:55:00Z)
- LiDAR Data Synthesis with Denoising Diffusion Probabilistic Models [1.1965844936801797]
Generative modeling of 3D LiDAR data is an emerging task with promising applications for autonomous mobile robots.
We present R2DM, a novel generative model for LiDAR data that can generate diverse and high-fidelity 3D scene point clouds.
Our method is built upon denoising diffusion probabilistic models (DDPMs), which have shown impressive results among generative model frameworks; a generic sampling loop is sketched below.
arXiv Detail & Related papers (2023-09-17T12:26:57Z)
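For context, a generic DDPM ancestral-sampling loop over an equirectangular LiDAR range image might look like the sketch below. The resolution, noise schedule, and the constant-output placeholder denoiser are illustrative assumptions, not R2DM's trained configuration.

```python
# Generic DDPM reverse (sampling) process on a LiDAR range image.
import numpy as np

H, W = 64, 1024                       # elevation x azimuth range image
T = 1000
betas = np.linspace(1e-4, 0.02, T)    # linear noise schedule (assumption)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def denoiser(x_t, t):
    """Placeholder for a trained eps-prediction network (assumption)."""
    return np.zeros_like(x_t)         # pretends the noise estimate is zero

rng = np.random.default_rng(0)
x = rng.standard_normal((H, W))       # x_T ~ N(0, I)
for t in range(T - 1, -1, -1):        # standard DDPM ancestral sampling
    eps = denoiser(x, t)
    mean = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
    if t > 0:
        x = mean + np.sqrt(betas[t]) * rng.standard_normal((H, W))
    else:
        x = mean
# x is now a generated range image; back-projecting each pixel's range
# along its ray would yield a 3D point cloud.
print(x.shape)
```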
- NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields [20.887421720818892]
We present NeRF-LiDAR, a novel LiDAR simulation method that leverages real-world information to generate realistic LiDAR point clouds.
We verify the effectiveness of our NeRF-LiDAR by training different 3D segmentation models on the generated LiDAR point clouds.
arXiv Detail & Related papers (2023-04-28T12:41:28Z)
- LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World [84.57894492587053]
We develop a novel simulator that captures both the power of physics-based and learning-based simulation.
We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation.
We showcase LiDARsim's usefulness for testing perception algorithms on long-tail events and for end-to-end closed-loop evaluation in safety-critical scenarios.
arXiv Detail & Related papers (2020-06-16T17:44:35Z)
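The physics-plus-learning split described in the LiDARsim entry above can be sketched as per-ray refinement: ray casting proposes a depth, and a small network predicts a deviation and a ray-drop probability. The feature set and the (untrained) network below are hypothetical stand-ins, not the paper's architecture.

```python
# Hybrid sketch: physics-based depth per ray, refined by a learned head.
import torch
import torch.nn as nn

class RayDeviationNet(nn.Module):
    """Per-ray refinement head: depth residual + drop logit (assumption)."""
    def __init__(self, in_dim=5, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),          # [depth residual, drop logit]
        )

    def forward(self, feats):
        out = self.mlp(feats)
        return out[:, 0], torch.sigmoid(out[:, 1])

# Dummy per-ray features: physics depth, incidence angle, ray direction.
n_rays = 4096
physics_depth = 5.0 + 45.0 * torch.rand(n_rays)    # from ray casting
incidence = torch.rand(n_rays)
direction = torch.randn(n_rays, 3)
direction = direction / direction.norm(dim=1, keepdim=True)
feats = torch.cat([physics_depth[:, None], incidence[:, None], direction], 1)

net = RayDeviationNet()                            # untrained stand-in
with torch.no_grad():
    residual, p_drop = net(feats)
refined = physics_depth + residual                 # learned deviation
kept = refined[torch.rand(n_rays) > p_drop]        # stochastic ray drop
print(refined.shape, kept.shape)
```

In the real system such a head would be trained so that the refined depths and drop pattern match recorded scans; the physics pass keeps the output geometrically grounded while the network captures sensor artifacts.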