GPT-Driver: Learning to Drive with GPT
- URL: http://arxiv.org/abs/2310.01415v3
- Date: Tue, 5 Dec 2023 05:26:29 GMT
- Title: GPT-Driver: Learning to Drive with GPT
- Authors: Jiageng Mao, Yuxi Qian, Junjie Ye, Hang Zhao, Yue Wang
- Abstract summary: We present a simple yet effective approach that can transform the OpenAI GPT-3.5 model into a reliable motion planner for autonomous vehicles.
We capitalize on the strong reasoning capabilities and generalization potential inherent to Large Language Models (LLMs).
We evaluate our approach on the large-scale nuScenes dataset, and extensive experiments substantiate the effectiveness, generalization ability, and interpretability of our GPT-based motion planner.
- Score: 47.14350537515685
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a simple yet effective approach that can transform the OpenAI
GPT-3.5 model into a reliable motion planner for autonomous vehicles. Motion
planning is a core challenge in autonomous driving, aiming to plan a driving
trajectory that is safe and comfortable. Existing motion planners predominantly
leverage heuristic methods to forecast driving trajectories, yet these
approaches demonstrate insufficient generalization capabilities in the face of
novel and unseen driving scenarios. In this paper, we propose a novel approach
to motion planning that capitalizes on the strong reasoning capabilities and
generalization potential inherent to Large Language Models (LLMs). The
fundamental insight of our approach is the reformulation of motion planning as
a language modeling problem, a perspective not previously explored.
Specifically, we represent the planner inputs and outputs as language tokens,
and leverage the LLM to generate driving trajectories through a language
description of coordinate positions. Furthermore, we propose a novel
prompting-reasoning-finetuning strategy to stimulate the numerical reasoning
potential of the LLM. With this strategy, the LLM can describe highly precise
trajectory coordinates and also its internal decision-making process in natural
language. We evaluate our approach on the large-scale nuScenes dataset, and
extensive experiments substantiate the effectiveness, generalization ability,
and interpretability of our GPT-based motion planner. Code is now available at
https://github.com/PointsCoder/GPT-Driver.
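The core idea in the abstract, casting motion planning as a language modeling problem, can be sketched in a few lines: serialize the planner inputs as text, and parse waypoint coordinates back out of the model's language reply. The prompt fields, waypoint format, and regex below are illustrative assumptions, not GPT-Driver's actual implementation:

```python
import re

def format_planner_prompt(ego_state, objects):
    # Serialize planner inputs as natural-language tokens (hypothetical format).
    lines = [f"Ego velocity: {ego_state['v']:.1f} m/s, heading: {ego_state['heading']:.2f} rad."]
    for obj in objects:
        lines.append(f"Object {obj['name']} at (x={obj['x']:.1f}, y={obj['y']:.1f}).")
    lines.append("Plan a 3 s trajectory as (x, y) waypoints at 0.5 s intervals.")
    return "\n".join(lines)

def parse_trajectory(reply):
    # Recover numeric waypoints from the model's language description.
    return [(float(x), float(y))
            for x, y in re.findall(r"\(([-\d.]+),\s*([-\d.]+)\)", reply)]

prompt = format_planner_prompt({"v": 5.0, "heading": 0.0},
                               [{"name": "car", "x": 12.3, "y": -1.8}])
reply = "Waypoints: (2.5, 0.0), (5.0, 0.1), (7.4, 0.3)"
print(parse_trajectory(reply))  # [(2.5, 0.0), (5.0, 0.1), (7.4, 0.3)]
```

Representing coordinates as text is what lets the prompting-reasoning-finetuning strategy operate on trajectories at all: the LLM only ever sees and emits language tokens.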
Related papers
- Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving [20.33096710167997]
A generative planning model with 3D-vision language pre-training, named GPVL, is proposed for end-to-end autonomous driving.
A cross-modal language model is introduced to generate holistic driving decisions and fine-grained trajectories.
The effective, robust, and efficient performance of GPVL is believed to be crucial for the practical application of future autonomous driving systems.
arXiv Detail & Related papers (2025-01-15T15:20:46Z) - DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers [61.92571851411509]
We introduce a multimodal driving language based on interleaved image and action tokens, and develop DrivingGPT to learn joint world modeling and planning.
Our DrivingGPT demonstrates strong performance in both action-conditioned video generation and end-to-end planning, outperforming strong baselines on large-scale nuPlan and NAVSIM benchmarks.
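The interleaved image-and-action sequence can be pictured with a toy serializer; the token names below are hypothetical placeholders, not DrivingGPT's actual vocabulary:

```python
def interleave_tokens(image_tokens, action_tokens):
    # Alternate per-frame image tokens with the action taken at that frame,
    # producing one multimodal sequence for autoregressive modeling.
    seq = []
    for img, act in zip(image_tokens, action_tokens):
        seq += [f"<img:{img}>", f"<act:{act}>"]
    return seq

print(interleave_tokens(["t0", "t1"], ["steer+0.1", "steer-0.2"]))
# ['<img:t0>', '<act:steer+0.1>', '<img:t1>', '<act:steer-0.2>']
```

Predicting the next image token amounts to world modeling, while predicting the next action token amounts to planning, which is how one transformer can learn both jointly.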
arXiv Detail & Related papers (2024-12-24T18:59:37Z) - Hybrid Imitation-Learning Motion Planner for Urban Driving [0.0]
We propose a novel hybrid motion planner that integrates both learning-based and optimization-based techniques.
Our model effectively balances safety and human-likeness, mitigating the trade-off inherent in these objectives.
We validate our approach through simulation experiments and further demonstrate its efficacy by deploying it in real-world self-driving vehicles.
arXiv Detail & Related papers (2024-09-04T16:54:31Z) - Potential Based Diffusion Motion Planning [73.593988351275]
We propose a new approach towards learning potential based motion planning.
We train a neural network to capture and learn easily optimizable potentials over motion planning trajectories.
We demonstrate its inherent composability, enabling us to generalize to a multitude of different motion constraints.
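A minimal sketch of potential-based trajectory optimization: hand-written potentials (goal attraction, smoothness, obstacle repulsion) stand in for the learned neural potentials, and a finite-difference gradient keeps the example dependency-free. Composability here just means the potentials add:

```python
import numpy as np

def total_potential(traj, goal, obstacles):
    # Composable potentials: goal attraction + smoothness + obstacle repulsion.
    u = np.sum((traj[-1] - goal) ** 2)          # pull the endpoint to the goal
    u += np.sum((traj[1:] - traj[:-1]) ** 2)    # keep consecutive waypoints close
    for obs in obstacles:
        u += np.sum(np.exp(-np.sum((traj - obs) ** 2, axis=1)))  # repel from obstacles
    return u

def optimize(traj, goal, obstacles, lr=0.05, steps=200, eps=1e-4):
    # Gradient descent on the total potential, with a finite-difference gradient.
    traj = traj.copy()
    for _ in range(steps):
        grad = np.zeros_like(traj)
        for i in range(traj.size):
            hi, lo = traj.copy(), traj.copy()
            hi.flat[i] += eps
            lo.flat[i] -= eps
            grad.flat[i] = (total_potential(hi, goal, obstacles)
                            - total_potential(lo, goal, obstacles)) / (2 * eps)
        traj -= lr * grad
        traj[0] = 0.0  # the start point stays fixed at the origin
    return traj

goal = np.array([5.0, 0.0])
obstacles = [np.array([2.5, 0.3])]
traj0 = np.linspace(np.zeros(2), goal, 6)  # straight-line initialization
traj = optimize(traj0, goal, obstacles)
```

Adding a new motion constraint is just adding another term to `total_potential`, which is the composability property the summary refers to.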
arXiv Detail & Related papers (2024-07-08T17:48:39Z) - LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning [65.86754998249224]
We develop a novel hybrid planner that leverages a conventional rule-based planner in conjunction with an LLM-based planner.
Our approach navigates complex scenarios that existing planners struggle with, producing well-reasoned outputs while remaining grounded by working alongside the rule-based planner.
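One way to picture the rule-based/LLM pairing is a confidence-gated fallback; the threshold, scoring function, and interfaces below are assumptions for illustration, not LLM-Assist's actual arbitration logic:

```python
def hybrid_plan(scene, rule_planner, llm_planner, confidence, threshold=0.5):
    # Prefer the conventional rule-based plan; defer to the LLM-based
    # planner only when the rule-based plan scores as low-confidence.
    plan = rule_planner(scene)
    if confidence(plan, scene) >= threshold:
        return plan, "rule-based"
    return llm_planner(scene), "llm"

# Toy planners standing in for the real components.
rule = lambda scene: "keep_lane"
llm = lambda scene: "creep_forward_and_merge"
conf = lambda plan, scene: 0.9 if scene == "highway" else 0.2

print(hybrid_plan("highway", rule, llm, conf))       # ('keep_lane', 'rule-based')
print(hybrid_plan("construction", rule, llm, conf))  # ('creep_forward_and_merge', 'llm')
```

Keeping the rule-based planner as the default is what grounds the system: the LLM is only consulted in the long-tail scenarios the rules handle poorly.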
arXiv Detail & Related papers (2023-12-30T02:53:45Z) - Interpretable and Flexible Target-Conditioned Neural Planners For Autonomous Vehicles [22.396215670672852]
Prior work only learns to estimate a single planning trajectory, while there may be multiple acceptable plans in real-world scenarios.
We propose an interpretable neural planner to regress a heatmap, which effectively represents multiple potential goals in the bird's-eye view of an autonomous vehicle.
Our systematic evaluation on the Lyft Open dataset shows that our model achieves a safer and more flexible driving performance than prior works.
arXiv Detail & Related papers (2023-09-23T22:13:03Z) - Integration of Reinforcement Learning Based Behavior Planning With Sampling Based Motion Planning for Automated Driving [0.5801044612920815]
We propose a method to employ a trained deep reinforcement learning policy for dedicated high-level behavior planning.
To the best of our knowledge, this work is the first to apply deep reinforcement learning in this manner.
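The two-layer split can be pictured as an RL policy choosing a discrete behavior, with a sampling-based layer proposing trajectories around it. All names and interfaces here are hypothetical stand-ins, not the paper's implementation:

```python
import random

random.seed(0)  # deterministic for the example

# Nominal lateral end-offset (m) for each discrete behavior.
BEHAVIOR_OFFSETS = {"keep_lane": 0.0, "change_left": 3.5, "change_right": -3.5}

def rl_behavior(state):
    # Stand-in for a trained deep-RL policy over discrete behaviors.
    return "change_left" if state.get("lane_blocked") else "keep_lane"

def sample_motion_plans(behavior, n=20):
    # Sampling-based motion layer: candidate end-offsets near the
    # nominal target of the chosen behavior.
    target = BEHAVIOR_OFFSETS[behavior]
    return [target + random.uniform(-0.5, 0.5) for _ in range(n)]

def plan(state):
    behavior = rl_behavior(state)
    candidates = sample_motion_plans(behavior)
    # Select the candidate closest to the nominal offset (a trivial cost).
    return behavior, min(candidates, key=lambda c: abs(c - BEHAVIOR_OFFSETS[behavior]))

behavior, offset = plan({"lane_blocked": True})
```

The division of labor is the point: the learned policy handles the discrete decision, while the sampler retains the safety and feasibility checks of classical motion planning.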
arXiv Detail & Related papers (2023-04-17T13:49:55Z) - End-to-end Interpretable Neural Motion Planner [78.69295676456085]
We propose a neural motion planner (NMP) for learning to drive autonomously in complex urban scenarios.
We design a holistic model that takes as input raw LIDAR data and an HD map and produces interpretable intermediate representations.
We demonstrate the effectiveness of our approach in real-world driving data captured in several cities in North America.
arXiv Detail & Related papers (2021-01-17T14:16:12Z) - The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well-defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
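A structured prior expressed as a loss term might look like the following, using lane-centerline deviation as one simple stand-in for the road-geometry priors the summary describes (the actual priors and weighting in the paper differ):

```python
import numpy as np

def prior_loss(pred_traj, lane_center_y, weight=0.1):
    # Structured road prior as a penalty: lateral deviation from the
    # lane centerline, added on top of the usual data term.
    return weight * np.mean((pred_traj[:, 1] - lane_center_y) ** 2)

def total_loss(pred_traj, gt_traj, lane_center_y):
    imitation = np.mean((pred_traj - gt_traj) ** 2)  # data term
    return imitation + prior_loss(pred_traj, lane_center_y)

gt = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
on_lane = gt.copy()
off_lane = gt + np.array([0.0, 1.0])  # laterally offset prediction
print(total_loss(on_lane, gt, 0.0) < total_loss(off_lane, gt, 0.0))  # True
```

Folding the prior into the loss lets a standard prediction network absorb road structure without any architectural change.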
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.