GPT-Driver: Learning to Drive with GPT
- URL: http://arxiv.org/abs/2310.01415v3
- Date: Tue, 5 Dec 2023 05:26:29 GMT
- Title: GPT-Driver: Learning to Drive with GPT
- Authors: Jiageng Mao, Yuxi Qian, Junjie Ye, Hang Zhao, Yue Wang
- Abstract summary: We present a simple yet effective approach that can transform the OpenAI GPT-3.5 model into a reliable motion planner for autonomous vehicles.
We capitalize on the strong reasoning capabilities and generalization potential inherent to Large Language Models (LLMs).
We evaluate our approach on the large-scale nuScenes dataset, and extensive experiments substantiate the effectiveness, generalization ability, and interpretability of our GPT-based motion planner.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a simple yet effective approach that can transform the OpenAI
GPT-3.5 model into a reliable motion planner for autonomous vehicles. Motion
planning is a core challenge in autonomous driving, aiming to plan a driving
trajectory that is safe and comfortable. Existing motion planners predominantly
leverage heuristic methods to forecast driving trajectories, yet these
approaches demonstrate insufficient generalization capabilities in the face of
novel and unseen driving scenarios. In this paper, we propose a novel approach
to motion planning that capitalizes on the strong reasoning capabilities and
generalization potential inherent to Large Language Models (LLMs). The
fundamental insight of our approach is the reformulation of motion planning as
a language modeling problem, a perspective not previously explored.
Specifically, we represent the planner inputs and outputs as language tokens,
and leverage the LLM to generate driving trajectories through a language
description of coordinate positions. Furthermore, we propose a novel
prompting-reasoning-finetuning strategy to stimulate the numerical reasoning
potential of the LLM. With this strategy, the LLM can describe highly precise
trajectory coordinates and also its internal decision-making process in natural
language. We evaluate our approach on the large-scale nuScenes dataset, and
extensive experiments substantiate the effectiveness, generalization ability,
and interpretability of our GPT-based motion planner. Code is now available at
https://github.com/PointsCoder/GPT-Driver.
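The core reformulation can be sketched in a few lines: planner inputs are rendered as plain text, and the trajectory is recovered by parsing coordinate pairs out of the model's textual reply. This is a minimal illustration, not the authors' implementation; the exact prompt wording and the `serialize_inputs`/`parse_trajectory` helpers are assumptions, and the real prompt templates live in the linked repository.

```python
import re

def serialize_inputs(ego_state, detections):
    """Render planner inputs as plain text, GPT-Driver style.
    The wording here is illustrative, not the paper's actual prompt."""
    lines = [f"Ego velocity: {ego_state['v']:.2f} m/s, "
             f"acceleration: {ego_state['a']:.2f} m/s^2."]
    for d in detections:
        lines.append(f"Object {d['name']} at ({d['x']:.2f}, {d['y']:.2f}).")
    lines.append("Plan a 3 s trajectory as waypoints (x, y), one per 0.5 s.")
    return "\n".join(lines)

def parse_trajectory(reply):
    """Recover numeric waypoints from the LLM's textual answer."""
    pairs = re.findall(r"\(\s*(-?\d+\.?\d*)\s*,\s*(-?\d+\.?\d*)\s*\)", reply)
    return [(float(x), float(y)) for x, y in pairs]

# Example: a reply in the expected "chain of coordinates" form.
reply = "Trajectory: (0.00, 1.20), (0.10, 2.45), (0.18, 3.70)"
print(parse_trajectory(reply))  # [(0.0, 1.2), (0.1, 2.45), (0.18, 3.7)]
```

Because both sides of the planner interface are ordinary text, any instruction-tuned LLM can be dropped in; the paper's prompting-reasoning-finetuning strategy then sharpens the numerical precision of the emitted coordinates.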
Related papers
- Hybrid Imitation-Learning Motion Planner for Urban Driving [0.0]
We propose a novel hybrid motion planner that integrates both learning-based and optimization-based techniques.
Our model effectively balances safety and human-likeness, mitigating the trade-off inherent in these objectives.
We validate our approach through simulation experiments and further demonstrate its efficacy by deploying it in real-world self-driving vehicles.
arXiv Detail & Related papers (2024-09-04T16:54:31Z)
- Potential Based Diffusion Motion Planning [73.593988351275]
We propose a new approach towards learning potential based motion planning.
We train a neural network to learn easily optimizable potentials over motion planning trajectories.
We demonstrate its inherent composability, enabling us to generalize to a multitude of different motion constraints.
arXiv Detail & Related papers (2024-07-08T17:48:39Z)
- iMotion-LLM: Motion Prediction Instruction Tuning [33.63656257401926]
We introduce iMotion-LLM: a Multimodal Large Language Model with trajectory prediction, tailored to guide interactive multi-agent scenarios.
iMotion-LLM capitalizes on textual instructions as key inputs for generating contextually relevant trajectories.
These findings act as milestones in empowering autonomous navigation systems to interpret and predict the dynamics of multi-agent environments.
arXiv Detail & Related papers (2024-06-10T12:22:06Z)
- LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning [65.86754998249224]
We develop a novel hybrid planner that leverages a conventional rule-based planner in conjunction with an LLM-based planner.
Our approach navigates complex scenarios that existing planners struggle with, producing well-reasoned outputs while remaining grounded by working alongside the rule-based planner.
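A hybrid hand-off of this kind might be dispatched as below. This is a hypothetical sketch: the function names and the safety-check gating are assumptions for illustration, not LLM-Assist's actual criterion, which is described in that paper.

```python
def hybrid_plan(scene, rule_based_plan, llm_plan, is_safe):
    """Hypothetical dispatch for a rule-based + LLM hybrid planner.
    Idea: prefer the conventional planner, fall back to the LLM planner
    only when the rule-based output fails a safety/feasibility check,
    and re-verify the LLM's output before trusting it."""
    plan = rule_based_plan(scene)
    if is_safe(scene, plan):
        return plan, "rule-based"
    # Hard scenario: defer to the LLM planner, then re-verify its output.
    plan = llm_plan(scene)
    if is_safe(scene, plan):
        return plan, "llm"
    # Neither passes: keep the grounded rule-based plan as a conservative default.
    return rule_based_plan(scene), "rule-based-fallback"

# Toy usage with stub planners: the rule-based plan fails the check,
# so the verified LLM plan is returned.
rule = lambda scene: [(0.0, 0.0)]
llm = lambda scene: [(0.0, 0.0), (1.0, 1.0)]
safe = lambda scene, plan: len(plan) > 1
plan, source = hybrid_plan(None, rule, llm, safe)
print(source)  # llm
```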
arXiv Detail & Related papers (2023-12-30T02:53:45Z)
- Empowering Autonomous Driving with Large Language Models: A Safety Perspective [82.90376711290808]
This paper explores the integration of Large Language Models (LLMs) into Autonomous Driving systems.
LLMs are intelligent decision-makers in behavioral planning, augmented with a safety verifier shield for contextual safety learning.
We present two key studies in a simulated environment: an adaptive LLM-conditioned Model Predictive Control (MPC) and an LLM-enabled interactive behavior planning scheme with a state machine.
arXiv Detail & Related papers (2023-11-28T03:13:09Z)
- Interpretable and Flexible Target-Conditioned Neural Planners For Autonomous Vehicles [22.396215670672852]
Prior work only learns to estimate a single planning trajectory, while there may be multiple acceptable plans in real-world scenarios.
We propose an interpretable neural planner to regress a heatmap, which effectively represents multiple potential goals in the bird's-eye view of an autonomous vehicle.
Our systematic evaluation on the Lyft Open dataset shows that our model achieves a safer and more flexible driving performance than prior works.
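Goal selection over such a heatmap can be sketched as a top-k pick over grid cells. The paper's planner regresses the heatmap with a neural network and plans toward the selected goals; the plain top-k rule below (no non-maximum suppression) and the `topk_goals` helper are simplifications for illustration.

```python
def topk_goals(heatmap, k=3):
    """Pick the k highest-scoring cells of a bird's-eye-view goal
    heatmap (a 2-D list of scores) as candidate goal locations."""
    cells = [(score, r, c)
             for r, row in enumerate(heatmap)
             for c, score in enumerate(row)]
    cells.sort(reverse=True)  # highest score first
    return [(r, c) for _, r, c in cells[:k]]

# Toy 4x4 BEV heatmap with two clear peaks.
h = [[0.0] * 4 for _ in range(4)]
h[1][2] = 0.9
h[3][0] = 0.7
print(topk_goals(h, k=2))  # [(1, 2), (3, 0)]
```

Representing goals as a heatmap rather than a single regressed point is what lets the planner keep multiple acceptable plans alive at once.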
arXiv Detail & Related papers (2023-09-23T22:13:03Z)
- Integration of Reinforcement Learning Based Behavior Planning With Sampling Based Motion Planning for Automated Driving [0.5801044612920815]
We propose a method to employ a trained deep reinforcement learning policy for dedicated high-level behavior planning.
To the best of our knowledge, this work is the first to apply deep reinforcement learning in this manner.
arXiv Detail & Related papers (2023-04-17T13:49:55Z)
- End-to-end Interpretable Neural Motion Planner [78.69295676456085]
We propose a neural motion planner (NMP) for learning to drive autonomously in complex urban scenarios.
We design a holistic model that takes as input raw LIDAR data and a HD map and produces interpretable intermediate representations.
We demonstrate the effectiveness of our approach in real-world driving data captured in several cities in North America.
arXiv Detail & Related papers (2021-01-17T14:16:12Z)
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.