Applications of Large Scale Foundation Models for Autonomous Driving
- URL: http://arxiv.org/abs/2311.12144v7
- Date: Fri, 5 Jan 2024 01:58:58 GMT
- Title: Applications of Large Scale Foundation Models for Autonomous Driving
- Authors: Yu Huang, Yue Chen, Zhu Li
- Abstract summary: Large language models (LLMs) and chat systems such as ChatGPT and PaLM have emerged and rapidly become a promising direction toward artificial general intelligence (AGI) in natural language processing (NLP).
In this paper, we investigate the techniques of foundation models and LLMs applied to autonomous driving, categorized as simulation, world models, data annotation, and planning or end-to-end (E2E) solutions.
- Score: 22.651585322658686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since the DARPA Grand Challenges (rural) in 2004/05 and the Urban
Challenge in 2007, autonomous driving has been the most active field of AI
applications. Recently, chat systems powered by large language models (LLMs),
such as ChatGPT and PaLM, have emerged and rapidly become a promising direction
toward artificial general intelligence (AGI) in natural language processing
(NLP). It is natural to ask whether these abilities could be employed to
reformulate autonomous driving. By combining LLMs with foundation models, it
may be possible to exploit human knowledge, commonsense, and reasoning to
rescue autonomous driving systems from the current long-tail dilemma of AI. In
this paper, we investigate the techniques of foundation models and LLMs applied
to autonomous driving, categorized as simulation, world models, data
annotation, and planning or end-to-end (E2E) solutions.
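To make the categorization concrete, here is a minimal Python sketch of the
survey's taxonomy. The five category names come from the abstract; the enum,
its name, and the example tags are purely illustrative assumptions, not
anything defined in the paper.

```python
from enum import Enum, auto

class FMApplication(Enum):
    """Hypothetical tags for the survey's categories of foundation-model use in AD."""
    SIMULATION = auto()       # synthesizing driving scenes and scenarios
    WORLD_MODEL = auto()      # learned models that predict environment dynamics
    DATA_ANNOTATION = auto()  # auto-labeling sensor data with foundation models
    PLANNING = auto()         # LLM-assisted behavior/motion planning
    END_TO_END = auto()       # E2E driving policies built on foundation models

# Illustrative only: tagging a few of the surveyed systems by category.
surveyed = {
    "DriveMLM": {FMApplication.PLANNING},
    "DiLu": {FMApplication.PLANNING},
    "GenAI-MR-Simulation": {FMApplication.SIMULATION},
}
```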
Related papers
- Work-in-Progress: Crash Course: Can (Under Attack) Autonomous Driving Beat Human Drivers? [60.51287814584477]
This paper evaluates the inherent risks in autonomous driving by examining the current landscape of AVs.
We develop specific claims highlighting the delicate balance between the advantages of AVs and potential security challenges in real-world scenarios.
arXiv: 2024-05-14T09:42:21Z
- AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents [109.3804962220498]
AutoRT is a system to scale up the deployment of operational robots in completely unseen scenarios with minimal human supervision.
We demonstrate AutoRT proposing instructions to over 20 robots across multiple buildings and collecting 77k real robot episodes via both teleoperation and autonomous robot policies.
We experimentally show that the "in-the-wild" data collected by AutoRT is significantly more diverse, and that AutoRT's use of LLMs enables instruction-following data-collection robots that can align with human preferences.
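A hedged sketch of the orchestration loop this summary describes: an LLM
proposes instructions for a fleet, and each robot collects an episode either
autonomously or via teleoperation. All names here (propose_instructions,
collect, Episode) are hypothetical stand-ins, not AutoRT's actual API.

```python
import random
from dataclasses import dataclass

@dataclass
class Episode:
    robot_id: int
    instruction: str
    mode: str  # "teleoperation" or "autonomous"

def propose_instructions(scene_description: str, n: int) -> list[str]:
    # Stand-in for the LLM call that proposes diverse, scene-grounded tasks.
    return [f"task {i} for scene: {scene_description}" for i in range(n)]

def collect(robot_id: int, instruction: str) -> Episode:
    # A policy executes when one is available; otherwise a human teleoperates.
    mode = random.choice(["teleoperation", "autonomous"])
    return Episode(robot_id, instruction, mode)

episodes = [
    collect(rid, instr)
    for rid, instr in enumerate(propose_instructions("office kitchen", 3))
]
```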
arXiv: 2024-01-23T18:45:54Z
- Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities [59.02391344178202]
Vision foundation models (VFMs) serve as potent building blocks for a wide range of AI applications.
The scarcity of comprehensive training data, the need for multi-sensor integration, and the diverse task-specific architectures pose significant obstacles to the development of VFMs.
This paper delves into the critical challenge of forging VFMs tailored specifically for autonomous driving, while also outlining future directions.
arXiv: 2024-01-16T01:57:24Z
- DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving [69.82743399946371]
DriveMLM is a framework that can perform closed-loop autonomous driving in realistic simulators.
We employ a multi-modal LLM (MLLM) to model the behavior-planning module of a modular AD system.
This model can be plugged into existing AD systems such as Apollo for closed-loop driving.
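A minimal sketch of the idea as summarized: a multi-modal LLM emits a discrete
behavior-planning state that a modular stack can consume downstream. The state
set, dataclass, and planner function are hypothetical stand-ins, not
DriveMLM's actual interface.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Behavior(Enum):
    KEEP_LANE = auto()
    CHANGE_LANE_LEFT = auto()
    CHANGE_LANE_RIGHT = auto()
    YIELD = auto()
    STOP = auto()

@dataclass
class PlanningState:
    behavior: Behavior
    explanation: str  # natural-language rationale from the MLLM

def mllm_behavior_planner(camera_tokens, lidar_tokens, route_hint: str) -> PlanningState:
    # Stand-in for the MLLM forward pass; a real system would decode the
    # model's output into one of the discrete states above.
    return PlanningState(Behavior.KEEP_LANE, f"Clear road ahead toward {route_hint}.")

# The downstream (e.g., Apollo-style) motion planner sees only the state:
state = mllm_behavior_planner(camera_tokens=None, lidar_tokens=None, route_hint="exit 12")
```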
arXiv: 2023-12-14T18:59:05Z
- A Survey on Multimodal Large Language Models for Autonomous Driving [31.614730391949657]
Multimodal AI systems benefiting from large models have the potential to perceive the real world, make decisions, and control tools as humans do.
Despite this immense potential, there is still no comprehensive understanding of the key challenges, opportunities, and future directions for applying multimodal large language models to driving systems.
arXiv: 2023-11-21T03:32:01Z
- A Language Agent for Autonomous Driving [31.359413767191608]
We propose a paradigm shift to integrate human-like intelligence into autonomous driving systems.
Our approach, termed Agent-Driver, transforms the traditional autonomous driving pipeline by introducing a versatile tool library.
Powered by Large Language Models (LLMs), our Agent-Driver is endowed with intuitive common sense and robust reasoning capabilities.
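A minimal sketch of the tool-library pattern this summary describes, with a
fixed tool chain standing in for the LLM's tool selection; the tool names,
outputs, and dispatch logic are all hypothetical, not Agent-Driver's code.

```python
# Hypothetical tool library: each tool maps an input to text the LLM can consume.
TOOLS = {
    "detect_objects": lambda scene: ["car ahead, 30 m", "pedestrian at crosswalk"],
    "predict_motion": lambda tracks: ["car decelerating", "pedestrian crossing"],
    "recall_memory": lambda query: ["similar scene: yielded at crosswalk"],
}

def agent_step(scene: str) -> str:
    # In the real system an LLM decides which tools to call and in what order;
    # here the chain is fixed purely for illustration.
    observations = TOOLS["detect_objects"](scene)
    forecasts = TOOLS["predict_motion"](observations)
    precedents = TOOLS["recall_memory"](scene)
    # The LLM would fuse the tool outputs into a motion plan; we return a stub.
    return ("plan: slow down and yield "
            f"({len(observations)} detections, {len(forecasts)} forecasts, "
            f"{len(precedents)} precedents considered)")

print(agent_step("four-way intersection, light rain"))
```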
arXiv: 2023-11-17T18:59:56Z
- LLM4Drive: A Survey of Large Language Models for Autonomous Driving [62.10344445241105]
Large language models (LLMs) have demonstrated abilities including understanding context, logical reasoning, and generating answers.
In this paper, we systematically review the line of research on Large Language Models for Autonomous Driving (LLM4AD).
arXiv: 2023-11-02T07:23:33Z
- DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models [30.23228092898916]
We propose the DiLu framework, which combines a Reasoning and a Reflection module to enable the system to perform decision-making based on common-sense knowledge.
Extensive experiments prove DiLu's capability to accumulate experience and demonstrate a significant advantage in generalization ability.
To the best of our knowledge, we are the first to leverage knowledge-driven capability in decision-making for autonomous vehicles.
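A minimal sketch of a reasoning-plus-reflection loop with an experience memory,
in the spirit of the summary above; the memory format, retrieval, and decision
stub are invented for illustration and differ from DiLu's actual design.

```python
memory: list[str] = []  # accumulated lessons from past episodes

def reason(observation: str) -> str:
    # Stand-in for the LLM Reasoning module, conditioned on retrieved experience.
    relevant = [m for m in memory if observation in m]
    return f"decision for '{observation}' informed by {len(relevant)} past lessons"

def reflect(observation: str, decision: str, success: bool) -> None:
    # Stand-in for the Reflection module: store a lesson when a decision fails,
    # so later decisions generalize better.
    if not success:
        memory.append(f"{observation}: revise '{decision}'")

# Two episodes of the same scenario: the second can draw on the first's lesson.
for obs, ok in [("merge into dense traffic", False), ("merge into dense traffic", True)]:
    decision = reason(obs)
    reflect(obs, decision, ok)
```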
arXiv: 2023-09-28T09:41:35Z
- Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles [13.102404404559428]
We present a novel framework that leverages Large Language Models (LLMs) to enhance autonomous vehicles' decision-making processes.
The proposed framework holds the potential to revolutionize the way autonomous vehicles operate, offering personalized assistance, continuous learning, and transparent decision-making.
arXiv: 2023-09-19T00:47:13Z
- Generative AI-empowered Simulation for Autonomous Driving in Vehicular Mixed Reality Metaverses [130.15554653948897]
In the vehicular mixed reality (MR) Metaverse, the distance between physical and virtual entities can be overcome.
Large-scale traffic and driving simulation via realistic data collection and fusion from the physical world is difficult and costly.
We propose an autonomous driving architecture, where generative AI is leveraged to synthesize unlimited conditioned traffic and driving data in simulations.
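A minimal sketch of conditioned scenario synthesis as summarized above, with a
toy random generator standing in for the paper's generative models; the
function, its conditions, and the output fields are hypothetical.

```python
import random

def generate_scenario(weather: str, density: float, seed: int) -> dict:
    # Placeholder for a conditional generative model: conditions in, scenario out.
    rng = random.Random(seed)
    n_vehicles = max(1, int(rng.gauss(density * 50, 5)))
    return {
        "weather": weather,
        "vehicles": [{"speed_mps": rng.uniform(0.0, 30.0)} for _ in range(n_vehicles)],
    }

# "Unlimited" conditioned data: vary the conditions and the seed.
dataset = [generate_scenario("rain", density=0.6, seed=s) for s in range(1000)]
```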
arXiv: 2023-02-16T16:54:10Z
This list is automatically generated from the titles and abstracts of the papers on this site.