Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving
- URL: http://arxiv.org/abs/2311.17918v1
- Date: Wed, 29 Nov 2023 18:59:47 GMT
- Title: Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving
- Authors: Yuqi Wang, Jiawei He, Lue Fan, Hongxin Li, Yuntao Chen, Zhaoxiang Zhang
- Abstract summary: Drive-WM is the first driving world model compatible with existing end-to-end planning models.
Our model generates high-fidelity multiview videos in driving scenes.
- Score: 56.381918362410175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In autonomous driving, predicting future events in advance and evaluating the
foreseeable risks empowers autonomous vehicles to better plan their actions,
enhancing safety and efficiency on the road. To this end, we propose Drive-WM,
the first driving world model compatible with existing end-to-end planning
models. Through joint spatio-temporal modeling facilitated by view
factorization, our model generates high-fidelity multiview videos in driving
scenes. Building on this powerful generation ability, we showcase, for the
first time, the potential of applying the world model to safe driving planning.
In particular, Drive-WM enables driving into multiple futures based on
distinct driving maneuvers and determines the optimal trajectory according to
image-based rewards. Evaluation on real-world driving datasets verifies
that our method can generate high-quality, consistent, and controllable
multiview videos, opening up possibilities for real-world simulations and safe
planning.
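
To make the planning-by-imagination idea concrete, the sketch below shows how a generative world model could be used to roll out several candidate maneuvers and select the trajectory with the highest image-based reward. This is a minimal illustration under assumed interfaces; the names (WorldModel, image_reward, plan_with_world_model) are hypothetical and not taken from the Drive-WM paper or its code release.

```python
import numpy as np

# Hypothetical stand-ins for Drive-WM's components; names and signatures
# are illustrative assumptions, not the paper's actual API.
class WorldModel:
    def rollout(self, obs, trajectory, horizon):
        """Predict future multiview frames for one candidate trajectory.
        A real model would run view-factorized video generation here."""
        return [obs] * horizon  # placeholder: repeat the current frames

def image_reward(frames):
    """Score predicted frames (e.g. drivable-area / collision cues from a
    perception model); the scoring signal here is a placeholder."""
    return float(np.random.rand())

def plan_with_world_model(world_model, obs, candidate_trajectories, horizon=8):
    """Roll out each candidate maneuver in imagination and return the one
    whose predicted future scores highest under the image-based reward."""
    best_traj, best_score = None, -np.inf
    for traj in candidate_trajectories:
        frames = world_model.rollout(obs, traj, horizon)
        score = sum(image_reward(f) for f in frames)
        if score > best_score:
            best_traj, best_score = traj, score
    return best_traj

# Example (illustrative only): choose among hypothetical maneuvers.
# candidates = [lane_keep_traj, lane_change_left_traj, slow_down_traj]
# best = plan_with_world_model(WorldModel(), current_multiview_frames, candidates)
```

The key design point this sketch captures is that planning quality hinges on the world model's rollout fidelity and on how the image-based reward is defined, both of which the paper evaluates on real-world driving data.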
Related papers
- Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey [61.39993881402787]
World models and video generation are pivotal technologies in the domain of autonomous driving.
This paper investigates the relationship between these two technologies.
By analyzing the interplay between video generation and world models, this survey identifies critical challenges and future research directions.
arXiv Detail & Related papers (2024-11-05T08:58:35Z)
- DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model [65.43473733967038]
We introduce DrivingDojo, the first dataset tailor-made for training interactive world models with complex driving dynamics.
Our dataset features video clips with a complete set of driving maneuvers, diverse multi-agent interplay, and rich open-world driving knowledge.
arXiv Detail & Related papers (2024-10-14T17:19:23Z)
- Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving [15.100104512786107]
Drive-OccWorld adapts a vision-centric 4D forecasting world model to end-to-end planning for autonomous driving.
We propose injecting flexible action conditions, such as velocity, steering angle, trajectory, and commands, into the world model.
Experiments on the nuScenes dataset demonstrate that our method can generate plausible and controllable 4D occupancy.
arXiv Detail & Related papers (2024-08-26T11:53:09Z)
- GenAD: Generalized Predictive Model for Autonomous Driving [75.39517472462089]
We introduce the first large-scale video prediction model in the autonomous driving discipline.
Our model, dubbed GenAD, handles the challenging dynamics in driving scenes with novel temporal reasoning blocks.
It can be adapted into an action-conditioned prediction model or a motion planner, holding great potential for real-world driving applications.
arXiv Detail & Related papers (2024-03-14T17:58:33Z)
- GAIA-1: A Generative World Model for Autonomous Driving [9.578453700755318]
We introduce GAIA-1 ('Generative AI for Autonomy'), a generative world model that generates realistic driving scenarios.
Emerging properties from our model include learning high-level structures and scene dynamics, contextual awareness, generalization, and understanding of geometry.
arXiv Detail & Related papers (2023-09-29T09:20:37Z)
- Interpretable and Flexible Target-Conditioned Neural Planners For Autonomous Vehicles [22.396215670672852]
Prior work only learns to estimate a single planning trajectory, while there may be multiple acceptable plans in real-world scenarios.
We propose an interpretable neural planner to regress a heatmap, which effectively represents multiple potential goals in the bird's-eye view of an autonomous vehicle.
Our systematic evaluation on the Lyft Open dataset shows that our model achieves a safer and more flexible driving performance than prior works.
arXiv Detail & Related papers (2023-09-23T22:13:03Z)
- End-to-end Interpretable Neural Motion Planner [78.69295676456085]
We propose a neural motion planner (NMP) for learning to drive autonomously in complex urban scenarios.
We design a holistic model that takes as input raw LIDAR data and an HD map and produces interpretable intermediate representations.
We demonstrate the effectiveness of our approach in real-world driving data captured in several cities in North America.
arXiv Detail & Related papers (2021-01-17T14:16:12Z)
- LookOut: Diverse Multi-Future Prediction and Planning for Self-Driving [139.33800431159446]
LookOut is an approach to jointly perceive the environment and predict a diverse set of futures from sensor data.
We show that our model demonstrates significantly more diverse and sample-efficient motion forecasting on a large-scale self-driving dataset.
arXiv Detail & Related papers (2021-01-16T23:19:22Z)