Large Trajectory Models are Scalable Motion Predictors and Planners
- URL: http://arxiv.org/abs/2310.19620v3
- Date: Wed, 28 Feb 2024 07:37:51 GMT
- Title: Large Trajectory Models are Scalable Motion Predictors and Planners
- Authors: Qiao Sun, Shiduo Zhang, Danjiao Ma, Jingzhe Shi, Derun Li, Simian Luo,
Yu Wang, Ningyi Xu, Guangzhi Cao, Hang Zhao
- Abstract summary: Motion prediction and planning are vital tasks in autonomous driving.
We introduce a scalable trajectory model called State Transformer (STR).
STR reformulates the motion prediction and motion planning problems by arranging observations, states, and actions into one unified sequence modeling task.
- Score: 25.03447801499
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Motion prediction and planning are vital tasks in autonomous driving, and
recent efforts have shifted to machine learning-based approaches. The
challenges include understanding diverse road topologies, reasoning traffic
dynamics over a long time horizon, interpreting heterogeneous behaviors, and
generating policies in a large continuous state space. Inspired by the success
of large language models in addressing similar complexities through model
scaling, we introduce a scalable trajectory model called State Transformer
(STR). STR reformulates the motion prediction and motion planning problems by
arranging observations, states, and actions into one unified sequence modeling
task. Our approach unites trajectory generation problems with other sequence
modeling problems, powering rapid iterations with breakthroughs in neighbor
domains such as language modeling. Remarkably, experimental results reveal that
large trajectory models (LTMs), such as STR, adhere to the scaling laws by
presenting outstanding adaptability and learning efficiency. Qualitative
results further demonstrate that LTMs are capable of making plausible
predictions in scenarios that diverge significantly from the training data
distribution. LTMs also learn to make complex reasonings for long-term
planning, without explicit loss designs or costly high-level annotations.
Related papers
- PILOT: Planning via Internalized Latent Optimization Trajectories for Large Language Models [51.43746425777865]
Large Language Models (LLMs) often lack the capacity to formulate global strategies, leading to error propagation in long-horizon tasks. We propose PILOT, a framework designed to internalize the strategic oversight of large models into intrinsic Latent Guidance.
arXiv Detail & Related papers (2026-01-07T12:38:56Z) - Large Foundation Models for Trajectory Prediction in Autonomous Driving: A Comprehensive Survey [26.44322475984292]
Trajectory prediction serves as a critical functionality in autonomous driving. The rise of Large Foundation Models (LFMs) is transforming the research paradigm of trajectory prediction. This article highlights three core methodologies: trajectory-language mapping, multimodal fusion, and constraint-based reasoning.
arXiv Detail & Related papers (2025-09-11T10:30:06Z) - Beyond Templates: Dynamic Adaptation of Reasoning Demonstrations via Feasibility-Aware Exploration [15.711365331854614]
We introduce Dynamic Adaptation of Reasoning Trajectories (DART), a novel data adaptation framework. Instead of uniformly imitating expert steps, DART employs a selective imitation strategy guided by step-wise adaptability estimation. We validate DART across multiple reasoning benchmarks and model scales, demonstrating that it significantly improves generalization and data efficiency.
arXiv Detail & Related papers (2025-05-27T04:08:11Z) - Latent Diffusion Planning for Imitation Learning [78.56207566743154]
Latent Diffusion Planning (LDP) is a modular approach consisting of a planner and inverse dynamics model.
By separating planning from action prediction, LDP can benefit from the denser supervision signals of suboptimal and action-free data.
On simulated visual robotic manipulation tasks, LDP outperforms state-of-the-art imitation learning approaches.
arXiv Detail & Related papers (2025-04-23T17:53:34Z) - Model Hemorrhage and the Robustness Limits of Large Language Models [119.46442117681147]
Large language models (LLMs) demonstrate strong performance across natural language processing tasks, yet undergo significant performance degradation when modified for deployment.
We define this phenomenon as model hemorrhage - performance decline caused by parameter alterations and architectural changes.
arXiv Detail & Related papers (2025-03-31T10:16:03Z) - Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models [57.45019514036948]
Simultaneous MRMP Diffusion (SMD) is a novel approach integrating constrained optimization into the diffusion sampling process to produce kinematically feasible trajectories.
The paper introduces a comprehensive MRMP benchmark to evaluate trajectory planning algorithms across scenarios with varying robot densities, obstacle complexities, and motion constraints.
arXiv Detail & Related papers (2025-02-05T20:51:28Z) - DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers [61.92571851411509]
We introduce a multimodal driving language based on interleaved image and action tokens, and develop DrivingGPT to learn joint world modeling and planning.
Our DrivingGPT demonstrates strong performance in both action-conditioned video generation and end-to-end planning, outperforming strong baselines on large-scale nuPlan and NAVSIM benchmarks.
arXiv Detail & Related papers (2024-12-24T18:59:37Z) - Adaptive Planning with Generative Models under Uncertainty [20.922248169620783]
Planning with generative models has emerged as an effective decision-making paradigm across a wide range of domains.
While continuous replanning at each timestep might seem intuitive because it allows decisions to be made based on the most recent environmental observations, it results in substantial computational challenges.
Our work addresses this challenge by introducing a simple adaptive planning policy that leverages the generative model's ability to predict long-horizon state trajectories.
arXiv Detail & Related papers (2024-08-02T18:07:53Z) - Learning Long-Horizon Predictions for Quadrotor Dynamics [48.08477275522024]
We study the key design choices for efficiently learning long-horizon prediction dynamics for quadrotors.
We show that sequential modeling techniques showcase their advantage in minimizing compounding errors compared to other types of solutions.
We propose a novel decoupled dynamics learning approach, which further simplifies the learning process while also enhancing the approach modularity.
arXiv Detail & Related papers (2024-07-17T19:06:47Z) - Spatiotemporal Implicit Neural Representation as a Generalized Traffic Data Learner [46.866240648471894]
Spatiotemporal Traffic Data (STTD) measures the complex dynamical behaviors of the multiscale transportation system.
We present a novel paradigm to address the STTD learning problem by parameterizing STTD as an implicit neural representation.
We validate its effectiveness through extensive experiments in real-world scenarios, showcasing applications from corridor to network scales.
arXiv Detail & Related papers (2024-05-06T06:23:06Z) - TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models [27.306180426294784]
We introduce TPLLM, a novel traffic prediction framework leveraging Large Language Models (LLMs).
In this framework, we construct a sequence embedding layer based on Convolutional Neural Networks (CNNs) and a graph embedding layer based on Graph Convolutional Networks (GCNs) to extract sequence features and spatial features.
Experiments on two real-world datasets demonstrate commendable performance in both full-sample and few-shot prediction scenarios.
arXiv Detail & Related papers (2024-03-04T17:08:57Z) - Exploring Model Transferability through the Lens of Potential Energy [78.60851825944212]
Transfer learning has become crucial in computer vision tasks due to the vast availability of pre-trained deep learning models.
Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labels.
We present an insightful physics-inspired approach named PED to address these challenges.
arXiv Detail & Related papers (2023-08-29T07:15:57Z) - Efficient Dynamics Modeling in Interactive Environments with Koopman Theory [22.7309724944471]
We show how to efficiently parallelize the sequential problem of long-range prediction using convolution.
We also show that this model can be easily incorporated into dynamics modeling for model-based planning and model-free RL.
arXiv Detail & Related papers (2023-06-20T23:38:24Z) - Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both tractable variational learning algorithm and effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z) - Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z) - Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z) - Automating Turbulence Modeling by Multi-Agent Reinforcement Learning [4.784658158364452]
We introduce multi-agent reinforcement learning as an automated discovery tool of turbulence models.
We demonstrate the potential of this approach on Large Eddy Simulations of homogeneous and isotropic turbulence.
arXiv Detail & Related papers (2020-05-18T18:45:09Z) - Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning [124.9856253431878]
We decompose the task of learning a global dynamics model into two stages: (a) learning a context latent vector that captures the local dynamics, then (b) predicting the next state conditioned on it.
In order to encode dynamics-specific information into the context latent vector, we introduce a novel loss function that encourages the context latent vector to be useful for predicting both forward and backward dynamics.
The proposed method achieves superior generalization ability across various simulated robotics and control tasks, compared to existing RL schemes.
arXiv Detail & Related papers (2020-05-14T08:10:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.