Related papers: Large Language Models Powered Context-aware Motion Prediction in Autonomous Driving

URL: http://arxiv.org/abs/2403.11057v3
Date: Tue, 30 Jul 2024 02:35:52 GMT
Title: Large Language Models Powered Context-aware Motion Prediction in Autonomous Driving
Authors: Xiaoji Zheng, Lixiu Wu, Zhijie Yan, Yuanrong Tang, Hao Zhao, Chen Zhong, Bokui Chen, Jiangtao Gong
Abstract summary: We utilize Large Language Models (LLMs) to enhance the global traffic context understanding for motion prediction tasks. Considering the cost associated with LLMs, we propose a cost-effective deployment strategy. Our research offers valuable insights into enhancing the understanding of traffic scenes of LLMs and the motion prediction performance of autonomous driving.
Score: 13.879945446114956
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Motion prediction is among the most fundamental tasks in autonomous driving. Traditional methods of motion forecasting primarily encode vector information of maps and historical trajectory data of traffic participants, lacking a comprehensive understanding of overall traffic semantics, which in turn affects the performance of prediction tasks. In this paper, we utilized Large Language Models (LLMs) to enhance the global traffic context understanding for motion prediction tasks. We first conducted systematic prompt engineering, visualizing complex traffic environments and historical trajectory information of traffic participants into image prompts, called Transportation Context Maps (TC-Maps), accompanied by corresponding text prompts. Through this approach, we obtained rich traffic context information from the LLM. By integrating this information into the motion prediction model, we demonstrate that such context can enhance the accuracy of motion predictions. Furthermore, considering the cost associated with LLMs, we propose a cost-effective deployment strategy: enhancing the accuracy of motion prediction tasks at scale with 0.7% LLM-augmented datasets. Our research offers valuable insights into enhancing the understanding of traffic scenes of LLMs and the motion prediction performance of autonomous driving. The source code is available at https://github.com/AIR-DISCOVER/LLM-Augmented-MTR and https://aistudio.baidu.com/projectdetail/7809548.

Related papers

Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap [51.198001060683296]
Large Language Models (LLMs) offer transformative potential to address transportation challenges. This survey first presents LLM4TR, a novel conceptual framework that systematically categorizes the roles of LLMs in transportation. For each role, our review spans diverse applications, from traffic prediction and autonomous driving to safety analytics and urban mobility optimization.
arXiv Detail & Related papers (2025-03-27T11:56:27Z)
CoT-Drive: Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain-of-Thought Prompting [14.567180355849501]
CoT-Drive is a novel approach that enhances motion forecasting by leveraging large language models (LLMs) and a chain-of-thought (CoT) prompting method. We introduce a teacher-student knowledge distillation strategy to effectively transfer LLMs' advanced scene understanding capabilities to lightweight language models (LMs). We present two new scene description datasets, Highway-Text and Urban-Text, designed for fine-tuning lightweight LMs to generate context-specific semantic annotations.
arXiv Detail & Related papers (2025-03-10T12:17:38Z)
Strada-LLM: Graph LLM for traffic prediction [62.2015839597764]
A considerable challenge in traffic prediction lies in handling the diverse data distributions caused by vastly different traffic conditions. We propose a graph-aware LLM for traffic prediction that considers proximal traffic information. We adopt a lightweight approach for efficient domain adaptation when facing new data distributions in few-shot fashion.
arXiv Detail & Related papers (2024-10-28T09:19:29Z)
iMotion-LLM: Motion Prediction Instruction Tuning [33.63656257401926]
We introduce iMotion-LLM: a Multimodal Large Language Model with trajectory prediction, tailored to guide interactive multi-agent scenarios. iMotion-LLM capitalizes on textual instructions as key inputs for generating contextually relevant trajectories. These findings act as milestones in empowering autonomous navigation systems to interpret and predict the dynamics of multi-agent environments.
arXiv Detail & Related papers (2024-06-10T12:22:06Z)
Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models [12.687494201105066]
This paper proposes Traj-LLM, the first to investigate the potential of using Large Language Models (LLMs) to generate future motion from agents' past/observed trajectories and scene semantics. LLMs' powerful comprehension abilities capture a spectrum of high-level scene knowledge and interactive information. Emulating the human-like lane focus cognitive function, we introduce lane-aware probabilistic learning powered by the pioneering Mamba module.
arXiv Detail & Related papers (2024-05-08T09:28:04Z)
Towards Explainable Traffic Flow Prediction with Large Language Models [36.86937188565623]
We propose a Traffic flow Prediction model based on Large Language Models (LLMs) to generate explainable traffic predictions. By transferring multi-modal traffic data into natural language descriptions, xTP-LLM captures complex time-series patterns and external factors from comprehensive traffic data. Empirically, xTP-LLM shows competitive accuracy compared with deep learning baselines, while providing an intuitive and reliable explanation for predictions.
arXiv Detail & Related papers (2024-04-03T07:14:15Z)
A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation [53.39174966020085]
Traffic signal control (TSC) is crucial for reducing traffic congestion, which leads to smoother traffic flow, reduced idling time, and mitigated CO2 emissions. In this study, we explore the computer vision approach for TSC that modulates on-road traffic flows through visual observation. We introduce a holistic traffic simulation framework called TrafficDojo towards vision-based TSC and its benchmarking.
arXiv Detail & Related papers (2024-03-11T16:42:29Z)
TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models [27.306180426294784]
We introduce TPLLM, a novel traffic prediction framework leveraging Large Language Models (LLMs). In this framework, we construct a sequence embedding layer based on Convolutional Neural Networks (CNNs) and a graph embedding layer based on Graph Convolutional Networks (GCNs) to extract sequence features and spatial features. Experiments on two real-world datasets demonstrate commendable performance in both full-sample and few-shot prediction scenarios.
arXiv Detail & Related papers (2024-03-04T17:08:57Z)
Pre-training on Synthetic Driving Data for Trajectory Prediction [61.520225216107306]
We propose a pipeline-level solution to mitigate the issue of data scarcity in trajectory forecasting. We adopt HD map augmentation and trajectory synthesis for generating driving data, and then we learn representations by pre-training on them. We conduct extensive experiments to demonstrate the effectiveness of our data expansion and pre-training strategies.
arXiv Detail & Related papers (2023-09-18T19:49:22Z)
TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [149.5716746789134]
We show data-driven traffic simulation can be formulated as a world model. We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving. Experiments on the open motion dataset show TrafficBots can simulate realistic multi-agent behaviors.
arXiv Detail & Related papers (2023-03-07T18:28:41Z)
Motion Transformer with Global Intention Localization and Local Movement Refinement [103.75625476231401]
Motion TRansformer (MTR) models motion prediction as the joint optimization of global intention localization and local movement refinement. MTR achieves state-of-the-art performance on both the marginal and joint motion prediction challenges.
arXiv Detail & Related papers (2022-09-27T16:23:14Z)
Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data. We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.