Related papers: The Role of World Models in Shaping Autonomous Driving: A Comprehensive Survey

URL: http://arxiv.org/abs/2502.10498v1
Date: Fri, 14 Feb 2025 18:43:15 GMT
Title: The Role of World Models in Shaping Autonomous Driving: A Comprehensive Survey
Authors: Sifan Tu, Xin Zhou, Dingkang Liang, Xingyu Jiang, Yumeng Zhang, Xiaofan Li, Xiang Bai
Abstract summary: Driving World Model (DWM) focuses on predicting scene evolution during the driving process. DWM methods enable autonomous driving systems to better perceive, understand, and interact with dynamic driving environments.
Score: 50.62538723793247
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Driving World Model (DWM), which focuses on predicting scene evolution during the driving process, has emerged as a promising paradigm in pursuing autonomous driving. These methods enable autonomous driving systems to better perceive, understand, and interact with dynamic driving environments. In this survey, we provide a comprehensive overview of the latest progress in DWM. We categorize existing approaches based on the modalities of the predicted scenes and summarize their specific contributions to autonomous driving. In addition, high-impact datasets and various metrics tailored to different tasks within the scope of DWM research are reviewed. Finally, we discuss the potential limitations of current research and propose future directions. This survey provides valuable insights into the development and application of DWM, fostering its broader adoption in autonomous driving. The relevant papers are collected at https://github.com/LMD0311/Awesome-World-Model.

Related papers

A Survey of World Models for Autonomous Driving [63.33363128964687]
Recent breakthroughs in autonomous driving have been propelled by advances in robust world modeling. This paper systematically reviews recent advances in world models for autonomous driving.
arXiv Detail & Related papers (2025-01-20T04:00:02Z)
DriveMM: All-in-One Large Multimodal Model for Autonomous Driving [63.882827922267666]
DriveMM is a large multimodal model designed to process diverse data inputs, such as images and multi-view videos, while performing a broad spectrum of autonomous driving tasks. We conduct evaluations on six public benchmarks and undertake zero-shot transfer on an unseen dataset, where DriveMM achieves state-of-the-art performance across all tasks.
arXiv Detail & Related papers (2024-12-10T17:27:32Z)
Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey [61.39993881402787]
World models and video generation are pivotal technologies in the domain of autonomous driving. This paper investigates the relationship between these two technologies. By analyzing the interplay between video generation and world models, this survey identifies critical challenges and future research directions.
arXiv Detail & Related papers (2024-11-05T08:58:35Z)
DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model [65.43473733967038]
We introduce DrivingDojo, the first dataset tailor-made for training interactive world models with complex driving dynamics. Our dataset features video clips with a complete set of driving maneuvers, diverse multi-agent interplay, and rich open-world driving knowledge.
arXiv Detail & Related papers (2024-10-14T17:19:23Z)
Probing Multimodal LLMs as World Models for Driving [72.18727651074563]
We look at the application of Multimodal Large Language Models (MLLMs) in autonomous driving. Despite advances in models like GPT-4o, their performance in complex driving environments remains largely unexplored.
arXiv Detail & Related papers (2024-05-09T17:52:42Z)
World Models for Autonomous Driving: An Initial Survey [16.448614804069674]
The capability to accurately predict future events and assess their implications is paramount for both safety and efficiency. World models have emerged as a transformative approach, enabling autonomous driving systems to synthesize and interpret vast amounts of sensor data. This paper provides an initial review of the current state and prospective advancements of world models in autonomous driving.
arXiv Detail & Related papers (2024-03-05T03:23:55Z)
Beyond One Model Fits All: Ensemble Deep Learning for Autonomous Vehicles [16.398646583844286]
This study introduces three distinct neural network models corresponding to Mediated Perception, Behavior Reflex, and Direct Perception approaches. Our architecture fuses information from the base, future latent vector prediction, and auxiliary task networks, using global routing commands to select appropriate action sub-networks.
arXiv Detail & Related papers (2023-12-10T04:40:02Z)
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving [56.381918362410175]
Drive-WM is the first driving world model compatible with existing end-to-end planning models. Our model generates high-fidelity multiview videos in driving scenes.
arXiv Detail & Related papers (2023-11-29T18:59:47Z)
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving [37.617793990547625]
This report provides an exhaustive evaluation of the latest state-of-the-art VLM, GPT-4V. We explore the model's abilities to understand and reason about driving scenes, make decisions, and ultimately act in the capacity of a driver. Our findings reveal that GPT-4V demonstrates superior performance in scene understanding and causal reasoning compared to existing autonomous systems.
arXiv Detail & Related papers (2023-11-09T12:58:37Z)
LLM4Drive: A Survey of Large Language Models for Autonomous Driving [62.10344445241105]
Large language models (LLMs) have demonstrated abilities including understanding context, logical reasoning, and generating answers. In this paper, we systematically review a research line about Large Language Models for Autonomous Driving (LLM4AD).
arXiv Detail & Related papers (2023-11-02T07:23:33Z)
Vision Language Models in Autonomous Driving: A Survey and Outlook [26.70381732289961]
Vision-Language Models (VLMs) have attracted widespread attention due to their outstanding performance and the ability to leverage Large Language Models (LLMs). We present a comprehensive and systematic survey of the advances in vision language models in this domain, encompassing perception and understanding, navigation and planning, decision-making and control, end-to-end autonomous driving, and data generation.
arXiv Detail & Related papers (2023-10-22T21:06:10Z)
The Integration of Prediction and Planning in Deep Learning Automated Driving Systems: A Review [43.30610493968783]
We review state-of-the-art deep learning-based planning systems, and focus on how they integrate prediction. We discuss the implications, strengths, and limitations of different integration principles.
arXiv Detail & Related papers (2023-08-10T17:53:03Z)
End-to-end Autonomous Driving: Challenges and Frontiers [45.391430626264764]
We provide a comprehensive analysis of more than 270 papers, covering the motivation, roadmap, methodology, challenges, and future trends in end-to-end autonomous driving. We delve into several critical challenges, including multi-modality, interpretability, causal confusion, robustness, and world models, amongst others. We discuss current advancements in foundation models and visual pre-training, as well as how to incorporate these techniques within the end-to-end driving framework.
arXiv Detail & Related papers (2023-06-29T14:17:24Z)
Fully End-to-end Autonomous Driving with Semantic Depth Cloud Mapping and Multi-Agent [2.512827436728378]
We propose a novel deep learning model trained in an end-to-end, multi-task manner to perform both perception and control tasks simultaneously. The model is evaluated on the CARLA simulator under various scenarios, covering normal and adversarial situations and different weather conditions, to mimic real-world conditions.
arXiv Detail & Related papers (2022-04-12T03:57:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.