Large Language Models Powered Context-aware Motion Prediction
- URL: http://arxiv.org/abs/2403.11057v2
- Date: Mon, 22 Jul 2024 02:11:29 GMT
- Title: Large Language Models Powered Context-aware Motion Prediction
- Authors: Xiaoji Zheng, Lixiu Wu, Zhijie Yan, Yuanrong Tang, Hao Zhao, Chen Zhong, Bokui Chen, Jiangtao Gong,
- Abstract summary: We utilize Large Language Models (LLMs) to enhance the global traffic context understanding for motion prediction tasks.
Considering the cost associated with LLMs, we propose a cost-effective deployment strategy.
Our research offers valuable insights into enhancing the understanding of traffic scenes of LLMs and the motion prediction performance of autonomous driving.
- Score: 13.879945446114956
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Motion prediction is among the most fundamental tasks in autonomous driving. Traditional methods of motion forecasting primarily encode vector information of maps and historical trajectory data of traffic participants, lacking a comprehensive understanding of overall traffic semantics, which in turn affects the performance of prediction tasks. In this paper, we utilized Large Language Models (LLMs) to enhance the global traffic context understanding for motion prediction tasks. We first conducted systematic prompt engineering, visualizing complex traffic environments and historical trajectory information of traffic participants into image prompts -- Transportation Context Map (TC-Map), accompanied by corresponding text prompts. Through this approach, we obtained rich traffic context information from the LLM. By integrating this information into the motion prediction model, we demonstrate that such context can enhance the accuracy of motion predictions. Furthermore, considering the cost associated with LLMs, we propose a cost-effective deployment strategy: enhancing the accuracy of motion prediction tasks at scale with 0.7\% LLM-augmented datasets. Our research offers valuable insights into enhancing the understanding of traffic scenes of LLMs and the motion prediction performance of autonomous driving. The source code is available at \url{https://github.com/AIR-DISCOVER/LLM-Augmented-MTR} and \url{https://aistudio.baidu.com/projectdetail/7809548}.
Related papers
- iMotion-LLM: Motion Prediction Instruction Tuning [33.63656257401926]
We introduce iMotion-LLM: a Multimodal Large Language Models with trajectory prediction, tailored to guide interactive multi-agent scenarios.
iMotion-LLM capitalizes on textual instructions as key inputs for generating contextually relevant trajectories.
These findings act as milestones in empowering autonomous navigation systems to interpret and predict the dynamics of multi-agent environments.
arXiv Detail & Related papers (2024-06-10T12:22:06Z) - Probing Multimodal LLMs as World Models for Driving [72.18727651074563]
This study focuses on the application of Multimodal Large Language Models (MLLMs) within the domain of autonomous driving.
We evaluate the capability of various MLLMs as world models for driving from the perspective of a fixed in-car camera.
Our results highlight a critical gap in the current capabilities of state-of-the-art MLLMs.
arXiv Detail & Related papers (2024-05-09T17:52:42Z) - Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models [12.687494201105066]
This paper proposes Traj-LLM, the first to investigate the potential of using Large Language Models (LLMs) to generate future motion from agents' past/observed trajectories and scene semantics.
LLMs' powerful comprehension abilities capture a spectrum of high-level scene knowledge and interactive information.
Emulating the human-like lane focus cognitive function, we introduce lane-aware probabilistic learning powered by the pioneering Mamba module.
arXiv Detail & Related papers (2024-05-08T09:28:04Z) - Towards Responsible and Reliable Traffic Flow Prediction with Large Language Models [36.869371885656236]
We propose a large language model (LLM) to generate responsible traffic predictions.
By transferring multi-modal traffic data into natural language descriptions, R2T-LLM captures complex spatial-temporal patterns and external factors from comprehensive traffic data.
Empirically, R2T-LLM shows competitive accuracy compared with deep learning baselines, while providing an intuitive and reliable explanation for predictions.
arXiv Detail & Related papers (2024-04-03T07:14:15Z) - LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models [48.46007039539533]
We propose an explainable lane change prediction model that leverages the strong reasoning capabilities and self-explanation abilities of Large Language Models (LLMs)
Our experiments on the large-scale highD dataset demonstrate the superior performance and interpretability of our LC-LLM in lane change prediction task.
arXiv Detail & Related papers (2024-03-27T08:34:55Z) - TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models [27.306180426294784]
We introduce TPLLM, a novel traffic prediction framework leveraging Large Language Models (LLMs)
In this framework, we construct a sequence embedding layer based on Conal Neural Networks (LoCNNs) and a graph embedding layer based on Graph Contemporalal Networks (GCNs) to extract sequence features and spatial features.
Experiments on two real-world datasets demonstrate commendable performance in both full-sample and few-shot prediction scenarios.
arXiv Detail & Related papers (2024-03-04T17:08:57Z) - Pre-training on Synthetic Driving Data for Trajectory Prediction [64.16991399882477]
We aim to tackle the challenge of learning general trajectory forecasting representations under limited data availability.
We take advantage of graph representations of HD-map and apply vector transformations to reshape the maps.
We employ a rule-based model to generate trajectories based on augmented scenes.
arXiv Detail & Related papers (2023-09-18T19:49:22Z) - TrafficBots: Towards World Models for Autonomous Driving Simulation and
Motion Prediction [149.5716746789134]
We show data-driven traffic simulation can be formulated as a world model.
We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving.
Experiments on the open motion dataset show TrafficBots can simulate realistic multi-agent behaviors.
arXiv Detail & Related papers (2023-03-07T18:28:41Z) - Motion Transformer with Global Intention Localization and Local Movement
Refinement [103.75625476231401]
Motion TRansformer (MTR) models motion prediction as the joint optimization of global intention localization and local movement refinement.
MTR achieves state-of-the-art performance on both the marginal and joint motion prediction challenges.
arXiv Detail & Related papers (2022-09-27T16:23:14Z) - Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.