LMDrive: Closed-Loop End-to-End Driving with Large Language Models
- URL: http://arxiv.org/abs/2312.07488v2
- Date: Thu, 21 Dec 2023 05:37:58 GMT
- Title: LMDrive: Closed-Loop End-to-End Driving with Large Language Models
- Authors: Hao Shao, Yuxuan Hu, Letian Wang, Steven L. Waslander, Yu Liu,
Hongsheng Li
- Abstract summary: Large language models (LLMs) have shown impressive reasoning capabilities that approach "Artificial General Intelligence".
This paper introduces LMDrive, a novel language-guided, end-to-end, closed-loop autonomous driving framework.
- Score: 37.910449013471656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite significant recent progress in the field of autonomous driving,
modern methods still struggle and can incur serious accidents when encountering
long-tail unforeseen events and challenging urban scenarios. On the one hand,
large language models (LLMs) have shown impressive reasoning capabilities that
approach "Artificial General Intelligence". On the other hand, previous
autonomous driving methods tend to rely on limited-format inputs (e.g., sensor
data and navigation waypoints), restricting the vehicle's ability to understand
language information and interact with humans. To this end, this paper
introduces LMDrive, a novel language-guided, end-to-end, closed-loop autonomous
driving framework. LMDrive uniquely processes and integrates multi-modal sensor
data with natural language instructions, enabling interaction with humans and
navigation software in realistic instructional settings. To facilitate further
research in language-based closed-loop autonomous driving, we also publicly
release the corresponding dataset which includes approximately 64K
instruction-following data clips, and the LangAuto benchmark that tests the
system's ability to handle complex instructions and challenging driving
scenarios. Extensive closed-loop experiments are conducted to demonstrate
LMDrive's effectiveness. To the best of our knowledge, we're the very first
work to leverage LLMs for closed-loop end-to-end autonomous driving. Codes,
models, and datasets can be found at https://github.com/opendilab/LMDrive
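The closed-loop setting the abstract describes means the model's predicted controls are actually executed in the simulator, so the agent's own actions shape the observations it receives next (unlike open-loop evaluation against logged trajectories). Below is a minimal sketch of such an instruction-conditioned driving loop. All names in it (LanguageDrivingAgent, Control, the simulator and model calls) are hypothetical illustrations, not LMDrive's actual API; see the linked repository for the real interface.
```python
from dataclasses import dataclass

@dataclass
class Control:
    """Normalized vehicle control command (hypothetical format)."""
    steer: float      # [-1, 1]
    throttle: float   # [0, 1]
    brake: float      # [0, 1]

class LanguageDrivingAgent:
    """Toy stand-in for a multi-modal, instruction-following driving policy.

    All names here are illustrative assumptions, NOT LMDrive's actual API.
    """

    def __init__(self, model):
        self.model = model        # a trained multi-modal, LLM-based policy
        self.instruction = None   # current natural-language instruction

    def set_instruction(self, text: str) -> None:
        # e.g. "turn left at the next intersection, then keep straight"
        self.instruction = text

    def step(self, camera_images, lidar_points) -> Control:
        # Fuse sensor observations with the current instruction and predict
        # the next control. In a closed loop this runs every simulator tick,
        # so the agent's own actions determine what it observes next.
        steer, throttle, brake = self.model.predict(
            images=camera_images,
            lidar=lidar_points,
            instruction=self.instruction,
        )
        return Control(steer, throttle, brake)

# Hypothetical closed-loop evaluation skeleton:
#   agent = LanguageDrivingAgent(model)
#   agent.set_instruction(navigator.next_instruction())
#   while not simulator.done():
#       obs = simulator.observe()
#       simulator.apply(agent.step(obs.images, obs.lidar))
```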
Related papers
- CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving [1.727597257312416]
CoVLA (Comprehensive Vision-Language-Action) dataset comprises real-world driving videos spanning more than 80 hours.
This dataset establishes a framework for robust, interpretable, and data-driven autonomous driving systems.
arXiv Detail & Related papers (2024-08-19T09:53:49Z)
- Instruct Large Language Models to Drive like Humans [33.219883052634614]
We propose an InstructDriver method to transform large language models into motion planners.
We derive driving instruction data based on human logic.
We then employ an interpretable InstructChain module to further reason the final planning.
arXiv Detail & Related papers (2024-06-11T14:24:45Z)
- Hybrid Reasoning Based on Large Language Models for Autonomous Car Driving [14.64475022650084]
Large Language Models (LLMs) have garnered significant attention for their ability to understand text and images, generate human-like text, and perform complex reasoning tasks.
We investigate how well LLMs can adapt and apply a combination of arithmetic and common-sense reasoning, particularly in autonomous driving scenarios.
arXiv Detail & Related papers (2024-02-21T08:09:05Z)
- Personalized Autonomous Driving with Large Language Models: Field Experiments [11.429053835807697]
We introduce an LLM-based framework, Talk2Drive, capable of translating natural verbal commands into executable controls.
This is a first-of-its-kind multi-scenario field experiment deploying LLMs on a real-world autonomous vehicle.
We validate that the proposed memory module considers personalized preferences and further reduces the takeover rate by up to 65.2%.
arXiv Detail & Related papers (2023-12-14T23:23:37Z)
- DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving [69.82743399946371]
DriveMLM is a framework that can perform closed-loop autonomous driving in realistic simulators.
We employ a multi-modal LLM (MLLM) to model the behavior planning module of a modular AD system.
This model can plug and play into existing AD systems such as Apollo for closed-loop driving.
arXiv Detail & Related papers (2023-12-14T18:59:05Z)
- LLM4Drive: A Survey of Large Language Models for Autonomous Driving [62.10344445241105]
Large language models (LLMs) have demonstrated abilities including understanding context, logical reasoning, and generating answers.
In this paper, we systematically review the research line of Large Language Models for Autonomous Driving (LLM4AD).
arXiv Detail & Related papers (2023-11-02T07:23:33Z)
- Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models [114.69732301904419]
We present an approach to end-to-end, open-set (any environment/scene) autonomous driving that is capable of producing driving decisions from representations queryable by image and text.
Our approach demonstrates unparalleled results in diverse tests while achieving significantly greater robustness in out-of-distribution situations.
arXiv Detail & Related papers (2023-10-26T17:56:35Z)
- LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving [87.1164964709168]
This work employs Large Language Models (LLMs) as a decision-making component for complex autonomous driving scenarios.
Extensive experiments demonstrate that our proposed method not only consistently surpasses baseline approaches in single-vehicle tasks, but also helps handle complex driving behaviors, even multi-vehicle coordination.
arXiv Detail & Related papers (2023-10-04T17:59:49Z)
- DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model [84.29836263441136]
This study introduces DriveGPT4, a novel interpretable end-to-end autonomous driving system based on multimodal large language models (MLLMs).
DriveGPT4 facilitates the interpretation of vehicle actions, offers pertinent reasoning, and effectively addresses a diverse range of questions posed by users.
Evaluations conducted on the BDD-X dataset showcase the superior qualitative and quantitative performance of DriveGPT4.
arXiv Detail & Related papers (2023-10-02T17:59:52Z)