Driving Scene Synthesis on Free-form Trajectories with Generative Prior
- URL: http://arxiv.org/abs/2412.01717v1
- Date: Mon, 02 Dec 2024 17:07:53 GMT
- Title: Driving Scene Synthesis on Free-form Trajectories with Generative Prior
- Authors: Zeyu Yang, Zijie Pan, Yuankun Yang, Xiatian Zhu, Li Zhang
- Abstract summary: We propose a novel free-form driving view synthesis approach, dubbed DriveX. Our resulting model can produce high-fidelity virtual driving environments outside the recorded trajectory. Beyond real driving scenes, DriveX can also be utilized to simulate virtual driving worlds from AI-generated videos.
- Score: 39.24591650300784
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Driving scene synthesis along free-form trajectories is essential for driving simulations to enable closed-loop evaluation of end-to-end driving policies. While existing methods excel at novel view synthesis on recorded trajectories, they face challenges with novel trajectories due to the limited views of driving videos and the vastness of driving environments. To tackle this challenge, we propose a novel free-form driving view synthesis approach, dubbed DriveX, which leverages a video generative prior to optimize a 3D model across a variety of trajectories. Concretely, we craft an inverse problem that enables a video diffusion model to be used as a prior for many-trajectory optimization of a parametric 3D model (e.g., Gaussian splatting). To seamlessly integrate the generative prior, we conduct this process iteratively during optimization. The resulting model can produce high-fidelity virtual driving environments outside the recorded trajectory, enabling free-form trajectory driving simulation. Beyond real driving scenes, DriveX can also simulate virtual driving worlds from AI-generated videos.
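The abstract's core loop (render the parametric 3D model along sampled free-form trajectories, restore the degraded render with the video diffusion prior, then fit the model to the restored video) can be caricatured in a few lines. Everything below is a toy stand-in: a scalar "scene", a linear renderer, and a prior that nudges frames toward a fixed target are invented for illustration and are not the DriveX implementation.

```python
import random

random.seed(0)

def render(scene, trajectory):
    """Stand-in for rasterizing the 3D model (e.g. Gaussian splats) along a
    camera trajectory; returns one scalar 'frame' per pose."""
    return [scene["value"] + 0.05 * pose for pose in trajectory]

def diffusion_prior(frames):
    """Stand-in for the video diffusion model used as a restoration prior:
    it pulls degraded renders toward a plausible target video (here, 1.0)."""
    target = 1.0
    return [f + 0.5 * (target - f) for f in frames]

def optimize_step(scene, frames, refined, lr=0.1):
    """Fit the scene parameters to the prior-refined video (L2-style update)."""
    grad = sum(f - r for f, r in zip(frames, refined)) / len(frames)
    scene["value"] -= lr * grad
    return scene

scene = {"value": 0.0}
recorded = [0, 1, 2, 3]  # poses of the recorded trajectory
for _ in range(50):
    # Sample a free-form trajectory around the recorded one each iteration.
    trajectory = [p + random.uniform(-0.5, 0.5) for p in recorded]
    frames = render(scene, trajectory)
    refined = diffusion_prior(frames)  # generative prior supplies supervision
    scene = optimize_step(scene, frames, refined)
```

The point of the loop is that supervision comes from the prior-restored frames rather than from recorded images, so the 3D model is optimized on trajectories that were never driven.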
Related papers
- VDEGaussian: Video Diffusion Enhanced 4D Gaussian Splatting for Dynamic Urban Scenes Modeling [68.65587507038539]
We present a novel video diffusion-enhanced 4D Gaussian Splatting framework for dynamic urban scene modeling. Our key insight is to distill robust, temporally consistent priors from a test-time adapted video diffusion model. Our method significantly enhances dynamic modeling, especially for fast-moving objects, achieving an approximate PSNR gain of 2 dB.
arXiv Detail & Related papers (2025-08-04T07:24:05Z) - Enhanced Velocity Field Modeling for Gaussian Video Reconstruction [21.54297055995746]
High-fidelity 3D video reconstruction is essential for enabling real-time rendering of dynamic scenes with realistic motion in virtual and augmented reality (VR/AR). We propose a flow-empowered velocity field modeling scheme tailored for Gaussian video reconstruction, dubbed FlowGaussian-VR. It consists of two core components: a velocity field rendering (VFR) pipeline which enables optical flow-based optimization, and a flow-assisted adaptive densification (FAD) strategy that adjusts the number and size of Gaussians in dynamic regions.
arXiv Detail & Related papers (2025-07-31T16:26:22Z) - ACT-R: Adaptive Camera Trajectories for Single View 3D Reconstruction [12.942796503696194]
We introduce the simple idea of adaptive view planning to multi-view synthesis. We generate a sequence of views, leveraging temporal consistency to enhance 3D coherence. Our method improves 3D reconstruction over SOTA alternatives on the unseen GSO dataset.
arXiv Detail & Related papers (2025-05-13T05:31:59Z) - MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction [44.592566642185425]
MuDG is an innovative framework that integrates a Multi-modal Diffusion model with Gaussian Splatting (GS) for Urban Scene Reconstruction. We show that MuDG outperforms existing methods in both reconstruction and photorealistic synthesis quality.
arXiv Detail & Related papers (2025-03-13T17:48:41Z) - Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation [1.0027737736304287]
We introduce a hybrid approach that combines the strengths of neural reconstruction with physics-based rendering. Our approach significantly enhances novel view synthesis quality, especially for road surfaces and lane markings. We achieve this by training a customized NeRF model on the original images with depth regularization derived from a noisy LiDAR point cloud.
arXiv Detail & Related papers (2025-03-12T15:18:50Z) - DreamDrive: Generative 4D Scene Modeling from Street View Images [55.45852373799639]
We present DreamDrive, a 4D spatial-temporal scene generation approach that combines the merits of generation and reconstruction.
Specifically, we leverage the generative power of video diffusion models to synthesize a sequence of visual references.
We then render 3D-consistent driving videos via Gaussian splatting.
arXiv Detail & Related papers (2024-12-31T18:59:57Z) - StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models [59.55232046525733]
We introduce StreetCrafter, a controllable video diffusion model that utilizes LiDAR point cloud renderings as pixel-level conditions. In addition, the utilization of pixel-level LiDAR conditions allows us to make accurate pixel-level edits to target scenes. Our model enables flexible control over viewpoint changes, enlarging the view for satisfying rendering regions.
arXiv Detail & Related papers (2024-12-17T18:58:55Z) - Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model [83.31688383891871]
We propose a Spatial-Temporal simulAtion for drivinG (Stag-1) model to reconstruct real-world scenes.
Stag-1 constructs continuous 4D point cloud scenes using surround-view data from autonomous vehicles.
It decouples spatial-temporal relationships and produces coherent driving videos.
arXiv Detail & Related papers (2024-12-06T18:59:56Z) - InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models [75.03495065452955]
We present InfiniCube, a scalable method for generating dynamic 3D driving scenes with high fidelity and controllability.
Our method can generate controllable and realistic 3D driving scenes, and extensive experiments validate the effectiveness and superiority of our model.
arXiv Detail & Related papers (2024-12-05T07:32:20Z) - From Dashcam Videos to Driving Simulations: Stress Testing Automated Vehicles against Rare Events [5.132984904858975]
Testing Automated Driving Systems (ADS) in simulation with realistic driving scenarios is important for verifying their performance.
We propose a novel framework that automates the conversion of real-world car crash videos into detailed simulation scenarios.
Our preliminary results demonstrate substantial time efficiency, finishing the real-to-sim conversion in minutes with full automation and no human intervention.
arXiv Detail & Related papers (2024-11-25T01:01:54Z) - FreeVS: Generative View Synthesis on Free Driving Trajectory [55.49370963413221]
FreeVS is a novel fully generative approach that can synthesize camera views on free new trajectories in real driving scenes.
FreeVS can be applied to any validation sequence without a reconstruction process and can synthesize views on novel trajectories.
arXiv Detail & Related papers (2024-10-23T17:59:11Z) - DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation [10.296670127024045]
DriveScape is an end-to-end framework for multi-view, 3D condition-guided video generation.
Our Bi-Directional Modulated Transformer (BiMot) ensures precise alignment of 3D structural information.
DriveScape excels in video generation performance, achieving state-of-the-art results on the nuScenes dataset with an FID score of 8.34 and an FVD score of 76.39.
arXiv Detail & Related papers (2024-09-09T09:43:17Z) - GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model [6.144680854063938]
GenDDS is a novel approach for generating driving scenarios for autonomous driving systems.
We employ the KITTI dataset, which includes real-world driving videos, to train the model.
We demonstrate that our model can generate high-quality driving videos that closely replicate the complexity and variability of real-world driving scenarios.
arXiv Detail & Related papers (2024-08-28T15:37:44Z) - AutoSplat: Constrained Gaussian Splatting for Autonomous Driving Scene Reconstruction [17.600027937450342]
AutoSplat is a framework employing Gaussian splatting to achieve highly realistic reconstructions of autonomous driving scenes.
Our method enables multi-view consistent simulation of challenging scenarios including lane changes.
arXiv Detail & Related papers (2024-07-02T18:36:50Z) - SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior [53.52396082006044]
Current methods struggle to maintain rendering quality at viewpoints that deviate significantly from the training viewpoints.
This issue stems from the sparse training views captured by a fixed camera on a moving vehicle.
We propose a novel approach that enhances the capacity of 3DGS by leveraging a prior from a Diffusion Model.
arXiv Detail & Related papers (2024-03-29T09:20:29Z) - TC4D: Trajectory-Conditioned Text-to-4D Generation [94.90700997568158]
We propose TC4D: trajectory-conditioned text-to-4D generation, which factors motion into global and local components.
We learn local deformations that conform to the global trajectory using supervision from a text-to-video model.
Our approach enables the synthesis of scenes animated along arbitrary trajectories, compositional scene generation, and significant improvements to the realism and amount of generated motion.
arXiv Detail & Related papers (2024-03-26T17:55:11Z) - TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models [75.20168902300166]
We propose TrackDiffusion, a novel video generation framework affording fine-grained trajectory-conditioned motion control.
A pivotal component of TrackDiffusion is the instance enhancer, which explicitly ensures inter-frame consistency of multiple objects.
Video sequences generated by our TrackDiffusion can be used as training data for visual perception models.
arXiv Detail & Related papers (2023-12-01T15:24:38Z) - Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion [83.88829943619656]
We introduce a method for generating realistic pedestrian trajectories and full-body animations that can be controlled to meet user-defined goals.
Our guided diffusion model allows users to constrain trajectories through target waypoints, speed, and specified social groups.
We propose utilizing the value function learned during RL training of the animation controller to guide diffusion to produce trajectories better suited for particular scenarios.
arXiv Detail & Related papers (2023-04-04T15:46:42Z) - Path Planning Followed by Kinodynamic Smoothing for Multirotor Aerial Vehicles (MAVs) [61.94975011711275]
We propose a geometrically based motion planning technique, "RRT*", for this purpose.
In the proposed technique, we modify the original RRT* by introducing an adaptive search space and a steering function.
We have tested the proposed technique in various simulated environments.
arXiv Detail & Related papers (2020-08-29T09:55:49Z) - LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World [84.57894492587053]
We develop a novel simulator that captures both the power of physics-based and learning-based simulation.
We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation.
We showcase LiDARsim's usefulness for testing perception algorithms on long-tail events and for end-to-end closed-loop evaluation on safety-critical scenarios.
arXiv Detail & Related papers (2020-06-16T17:44:35Z)
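The LiDARsim summary above describes a two-stage hybrid: physics-based ray casting over the 3D scene, followed by a deep network that predicts deviations from the physics result. A minimal sketch of that composition follows; both stages are toy stand-ins (the constant-depth ray cast and the 2% range bias are invented for illustration, not LiDARsim's actual models).

```python
def raycast(scene_depth, n_beams):
    """Stand-in for casting n_beams rays into a 3D scene: here every beam
    simply reads back a constant ground-truth depth in meters."""
    return [scene_depth for _ in range(n_beams)]

def learned_residual(depths):
    """Stand-in for the deep network that predicts per-ray deviations from
    the physics simulation (e.g. sensor bias, retro-reflection artifacts)."""
    return [0.02 * d for d in depths]  # pretend the net learned a 2% range bias

def simulate_lidar(scene_depth, n_beams=8):
    """Hybrid simulation: physics-based sweep plus learned correction."""
    physics = raycast(scene_depth, n_beams)
    residual = learned_residual(physics)
    return [p + r for p, r in zip(physics, residual)]

beams = simulate_lidar(10.0)  # each of the 8 beams lands near 10.2 m
```

The design choice mirrored here is that the network never replaces the physics; it only corrects it, so the simulator stays grounded in scene geometry while matching real sensor statistics.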
This list is automatically generated from the titles and abstracts of the papers on this site.