Instruct Large Language Models to Drive like Humans
- URL: http://arxiv.org/abs/2406.07296v1
- Date: Tue, 11 Jun 2024 14:24:45 GMT
- Title: Instruct Large Language Models to Drive like Humans
- Authors: Ruijun Zhang, Xianda Guo, Wenzhao Zheng, Chenming Zhang, Kurt Keutzer, Long Chen
- Abstract summary: We propose an InstructDriver method to transform large language models into motion planners.
We derive driving instruction data based on human logic.
We then employ an interpretable InstructChain module to further reason about the final plan.
- Score: 33.219883052634614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motion planning in complex scenarios is the core challenge in autonomous driving. Conventional methods apply predefined rules or learn from driving data to plan the future trajectory. Recent methods seek the knowledge preserved in large language models (LLMs) and apply it in driving scenarios. Despite the promising results, it is still unclear whether the LLM learns the underlying human logic to drive. In this paper, we propose an InstructDriver method to transform an LLM into a motion planner with explicit instruction tuning to align its behavior with humans. We derive driving instruction data based on human logic (e.g., do not cause collisions) and traffic rules (e.g., proceed only when the light is green). We then employ an interpretable InstructChain module to further reason about the final plan, reflecting the instructions. Our InstructDriver allows the injection of human rules and learning from driving data, enabling both interpretability and data scalability. Different from existing methods that experimented on open-loop or simulated closed-loop settings, we adopt the real-world closed-loop motion planning nuPlan benchmark for better evaluation. InstructDriver demonstrates the effectiveness of the LLM planner in a real-world closed-loop setting. Our code is publicly available at https://github.com/bonbon-rj/InstructDriver.
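The instruction-tuning setup sketched in the abstract can be illustrated with a hypothetical training example. The field names, rule strings, and trajectory encoding below are illustrative assumptions, not the released InstructDriver data format.

```python
# Hypothetical shape of one driving-instruction training example for
# supervised fine-tuning of an LLM planner (illustrative only; the
# actual InstructDriver data format may differ).

def build_example(scene_description, instructions, trajectory):
    """Pack a scene, human-logic instructions, and the target plan
    into a single prompt/response pair for instruction tuning."""
    prompt = (
        "Scene: " + scene_description + "\n"
        "Instructions:\n"
        + "\n".join("- " + rule for rule in instructions)
        + "\nPlan the next trajectory."
    )
    response = "Trajectory: " + " ".join(
        f"({x:.1f},{y:.1f})" for x, y in trajectory
    )
    return {"prompt": prompt, "response": response}

example = build_example(
    "Ego at intersection, traffic light is red, pedestrian crossing ahead.",
    ["Do not cause collisions.", "Proceed only when the light is green."],
    [(0.0, 0.0), (0.0, 0.0)],  # target plan: remain stopped
)
print(example["response"])  # Trajectory: (0.0,0.0) (0.0,0.0)
```

The key property the paper relies on is that human rules enter as explicit text in the prompt, so the tuned model's plan can be traced back to stated instructions rather than opaque weights.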
Related papers
- LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning [65.86754998249224]
We develop a novel hybrid planner that leverages a conventional rule-based planner in conjunction with an LLM-based planner.
Our approach handles complex scenarios that existing planners struggle with, producing well-reasoned outputs while remaining grounded by working alongside the rule-based planner.
arXiv Detail & Related papers (2023-12-30T02:53:45Z)
- Personalized Autonomous Driving with Large Language Models: Field Experiments [11.429053835807697]
We introduce an LLM-based framework, Talk2Drive, capable of translating natural verbal commands into executable controls.
This is the first-of-its-kind multi-scenario field experiment that deploys LLMs on a real-world autonomous vehicle.
We validate that the proposed memory module considers personalized preferences and further reduces the takeover rate by up to 65.2%.
arXiv Detail & Related papers (2023-12-14T23:23:37Z)
- DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving [69.82743399946371]
DriveMLM is a framework that can perform closed-loop autonomous driving in realistic simulators.
We employ a multi-modal LLM (MLLM) to model the behavior planning module of a modular AD system.
The model can be plugged into existing AD systems such as Apollo for closed-loop driving.
arXiv Detail & Related papers (2023-12-14T18:59:05Z)
- LMDrive: Closed-Loop End-to-End Driving with Large Language Models [37.910449013471656]
Large language models (LLMs) have shown impressive reasoning capabilities approaching "Artificial General Intelligence".
This paper introduces LMDrive, a novel language-guided, end-to-end, closed-loop autonomous driving framework.
arXiv Detail & Related papers (2023-12-12T18:24:15Z)
- Learning Realistic Traffic Agents in Closed-loop [36.38063449192355]
Reinforcement learning (RL) can train traffic agents to avoid infractions, but using RL alone results in non-human-like driving behaviors.
We propose Reinforcing Traffic Rules (RTR) to match expert demonstrations under a traffic compliance constraint.
Our experiments show that RTR learns more realistic and generalizable traffic simulation policies.
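One way to read the RTR objective described above is as imitation learning augmented with a penalty for traffic infractions. The weighting scheme and penalty terms below are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of an imitation objective with a traffic-rule
# penalty, in the spirit of matching expert demonstrations under a
# compliance constraint (weights and terms are assumptions).

def total_loss(imitation_loss, infractions, penalty_weight=10.0):
    """Combine a behavior-cloning loss with a penalty counted per
    traffic infraction (collision, running a red light, ...)."""
    return imitation_loss + penalty_weight * sum(infractions)

# A compliant rollout keeps only the imitation term:
print(total_loss(0.5, [0, 0, 0]))  # 0.5
# Any infraction adds the penalty, pushing the policy toward
# human-like yet rule-abiding behavior:
print(total_loss(0.5, [1, 0, 0]))  # 10.5
```

The design intuition is that imitation supplies realism while the penalty term enforces compliance, so neither pure RL (unrealistic) nor pure imitation (infraction-prone) dominates.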
arXiv Detail & Related papers (2023-11-02T16:55:23Z)
- AutoPlan: Automatic Planning of Interactive Decision-Making Tasks With Large Language Models [11.895111124804503]
AutoPlan is an approach to guide LLM-based agents to accomplish interactive decision-making tasks.
Our experiments show that AutoPlan achieves success rates on par with the baselines.
arXiv Detail & Related papers (2023-05-24T11:52:23Z)
- Learning to drive from a world on rails [78.28647825246472]
We learn an interactive vision-based driving policy from pre-recorded driving logs via a model-based approach.
A forward model of the world supervises a driving policy that predicts the outcome of any potential driving trajectory.
Our method ranks first on the CARLA leaderboard, attaining a 25% higher driving score while using 40 times less data.
arXiv Detail & Related papers (2021-05-03T05:55:30Z)
- Contingencies from Observations: Tractable Contingency Planning with Learned Behavior Models [82.34305824719101]
Humans have a remarkable ability to make decisions by accurately reasoning about future events.
We develop a general-purpose contingency planner that is learned end-to-end using high-dimensional scene observations.
We show how this model can tractably learn contingencies from behavioral observations.
arXiv Detail & Related papers (2021-04-21T14:30:20Z)
- Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction [88.0416857308144]
We propose an alternative to sensor simulation, which is expensive and suffers from large domain gaps.
We directly simulate the outputs of the self-driving vehicle's perception and prediction system, enabling realistic motion planning testing.
arXiv Detail & Related papers (2020-08-13T17:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.