SMART: Scalable Multi-agent Real-time Simulation via Next-token Prediction
- URL: http://arxiv.org/abs/2405.15677v1
- Date: Fri, 24 May 2024 16:17:35 GMT
- Title: SMART: Scalable Multi-agent Real-time Simulation via Next-token Prediction
- Authors: Wei Wu, Xiaoxin Feng, Ziyan Gao, Yuheng Kan,
- Abstract summary: We introduce a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens.
These tokens are then processed through a decoder-only transformer architecture to train for the next token prediction task.
We have collected over 1 billion motion tokens from multiple datasets, validating the model's scalability.
- Score: 4.318757942343036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their extensive application in real-world scenarios. To address this issue, we introduce SMART, a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens. These tokens are then processed through a decoder-only transformer architecture to train for the next token prediction task across spatial-temporal series. This GPT-style method allows the model to learn the motion distribution in real driving scenarios. SMART achieves state-of-the-art performance across most of the metrics on the generative Sim Agents challenge, ranking 1st on the leaderboards of Waymo Open Motion Dataset (WOMD), demonstrating remarkable inference speed. Moreover, SMART represents the generative model in the autonomous driving motion domain, exhibiting zero-shot generalization capabilities: Using only the NuPlan dataset for training and WOMD for validation, SMART achieved a competitive score of 0.71 on the Sim Agents challenge. Lastly, we have collected over 1 billion motion tokens from multiple datasets, validating the model's scalability. These results suggest that SMART has initially emulated two important properties: scalability and zero-shot generalization, and preliminarily meets the needs of large-scale real-time simulation applications. We have released all the code to promote the exploration of models for motion generation in the autonomous driving field.
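The abstract describes turning agent trajectories into discrete tokens and training on next-token prediction. A minimal sketch of that general idea follows. It is not SMART's released tokenizer: the grid-shaped vocabulary and the helper names `build_vocab`, `tokenize`, `detokenize`, and `next_token_pairs` are all hypothetical, chosen only for illustration.

```python
import math


def build_vocab(step=1.0, radius=2):
    """Build a small grid of (dx, dy) displacement tokens.

    Assumption for illustration only: a real system would more likely
    derive its motion vocabulary from data, not from a fixed grid.
    """
    return [(ix * step, iy * step)
            for ix in range(-radius, radius + 1)
            for iy in range(-radius, radius + 1)]


def tokenize(trajectory, vocab):
    """Map each consecutive displacement of an (x, y) trajectory
    to the id of the nearest vocabulary entry."""
    tokens = []
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        dx, dy = x1 - x0, y1 - y0
        token = min(range(len(vocab)),
                    key=lambda i: math.hypot(vocab[i][0] - dx,
                                             vocab[i][1] - dy))
        tokens.append(token)
    return tokens


def detokenize(start, tokens, vocab):
    """Roll the token displacements forward from a start point."""
    points = [start]
    for t in tokens:
        dx, dy = vocab[t]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


def next_token_pairs(tokens):
    """Form (context, target) pairs, the training signal a
    decoder-only model would use for next-token prediction."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


if __name__ == "__main__":
    vocab = build_vocab()
    trajectory = [(0, 0), (1, 0), (2, 1), (2, 1)]
    tokens = tokenize(trajectory, vocab)
    print(tokens)
    print(detokenize((0, 0), tokens, vocab))
    print(next_token_pairs(tokens))
```

In this sketch each displacement is measured between logged positions, so quantization error is not fed back into later steps. A tokenizer that measured displacements from the reconstructed position instead would keep long rollouts closer to the original trajectory.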
Related papers
- Planning with Adaptive World Models for Autonomous Driving [50.4439896514353]
Motion planners (MPs) are crucial for safe navigation in complex urban environments.
nuPlan, a recently released MP benchmark, addresses this limitation by augmenting real-world driving logs with closed-loop simulation logic.
We present AdaptiveDriver, a model-predictive control (MPC) based planner that unrolls different world models conditioned on BehaviorNet's predictions.
arXiv Detail & Related papers (2024-06-15T18:53:45Z)
- Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
- BehaviorGPT: Smart Agent Simulation for Autonomous Driving with Next-Patch Prediction [22.254486248785614]
Behavior Generative Pre-trained Transformers (BehaviorGPT) is a decoder-only, autoregressive architecture designed to simulate the sequential motion of multiple agents.
Next-Patch Prediction Paradigm (NP3) enables models to reason at the patch level of trajectories and capture long-range spatial-temporal interactions.
BehaviorGPT ranks first across several metrics on the Sim Agents Benchmark, demonstrating its exceptional performance in multi-agent and agent-map interactions.
arXiv Detail & Related papers (2024-05-27T17:28:25Z)
- SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving [27.776472262857045]
This paper presents a Simple and effIcient Motion Prediction baseLine (SIMPL) for autonomous vehicles.
We propose a compact and efficient global feature fusion module that performs directed message passing in a symmetric manner.
As a strong baseline, SIMPL exhibits highly competitive performance on Argoverse 1 & 2 motion forecasting benchmarks.
arXiv Detail & Related papers (2024-02-04T15:07:49Z)
- Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
arXiv Detail & Related papers (2023-12-07T18:53:27Z)
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
- TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [149.5716746789134]
We show data-driven traffic simulation can be formulated as a world model.
We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving.
Experiments on the open motion dataset show TrafficBots can simulate realistic multi-agent behaviors.
arXiv Detail & Related papers (2023-03-07T18:28:41Z)
- A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN [59.57221522897815]
We propose a neural network model based on trajectory information for driving behavior recognition.
We evaluate the proposed model on the public BLVD dataset, achieving satisfactory performance.
arXiv Detail & Related papers (2021-03-01T06:47:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.