Related papers: Trajeglish: Traffic Modeling as Next-Token Prediction

Trajeglish: Traffic Modeling as Next-Token Prediction

URL: http://arxiv.org/abs/2312.04535v2
Date: Sun, 14 Apr 2024 22:51:18 GMT
Title: Trajeglish: Traffic Modeling as Next-Token Prediction
Authors: Jonah Philion, Xue Bin Peng, Sanja Fidler,
Abstract summary: A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs. We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios. Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
Score: 67.28197954427638
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs. In pursuit of this functionality, we apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios. Using a simple data-driven tokenization scheme, we discretize trajectories to centimeter-level resolution using a small vocabulary. We then model the multi-agent sequence of discrete motion tokens with a GPT-like encoder-decoder that is autoregressive in time and takes into account intra-timestep interaction between agents. Scenarios sampled from our model exhibit state-of-the-art realism; our model tops the Waymo Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%. We ablate our modeling choices in full autonomy and partial autonomy settings, and show that the representations learned by our model can quickly be adapted to improve performance on nuScenes. We additionally evaluate the scalability of our model with respect to parameter count and dataset size, and use density estimates from our model to quantify the saliency of context length and intra-timestep interaction for the traffic modeling task.

Related papers

DriveGPT: Scaling Autoregressive Behavior Models for Driving [11.733428769776204]
We present DriveGPT, a scalable behavior model for autonomous driving. We learn a transformer model to predict future agent states as tokens in an autoregressive fashion. We scale up our model parameters and training data by multiple orders of magnitude, enabling us to explore the scaling properties.
arXiv Detail & Related papers (2024-12-19T00:06:09Z)
MotionLM: Multi-Agent Motion Forecasting as Language Modeling [15.317827804763699]
We present MotionLM, a language model for multi-agent motion prediction. Our approach bypasses post-hoc interactions where individual agent trajectory generation is conducted prior to interactive scoring. The model's sequential factorization enables temporally causal conditional rollouts.
arXiv Detail & Related papers (2023-09-28T15:46:25Z)
FollowNet: A Comprehensive Benchmark for Car-Following Behavior Modeling [20.784555362703294]
We establish a public benchmark dataset for car-following behavior modeling. The benchmark consists of more than 80K car-following events extracted from five public driving datasets. Results show that the deep deterministic policy gradient (DDPG) based model performs competitively with a lower MSE for spacing.
arXiv Detail & Related papers (2023-05-25T08:59:26Z)
A Control-Centric Benchmark for Video Prediction [69.22614362800692]
We propose a benchmark for action-conditioned video prediction in the form of a control benchmark. Our benchmark includes simulated environments with 11 task categories and 310 task instance definitions. We then leverage our benchmark to study the effects of scaling model size, quantity of training data, and model ensembling.
arXiv Detail & Related papers (2023-04-26T17:59:45Z)
TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [149.5716746789134]
We show data-driven traffic simulation can be formulated as a world model. We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving. Experiments on the open motion dataset show TrafficBots can simulate realistic multi-agent behaviors.
arXiv Detail & Related papers (2023-03-07T18:28:41Z)
A Spatio-Temporal Multilayer Perceptron for Gesture Recognition [70.34489104710366]
We propose a multilayer state-weighted perceptron for gesture recognition in the context of autonomous vehicles. An evaluation of TCG and Drive&Act datasets is provided to showcase the promising performance of our approach. We deploy our model to our autonomous vehicle to show its real-time capability and stable execution.
arXiv Detail & Related papers (2022-04-25T08:42:47Z)
Predicting Take-over Time for Autonomous Driving with Real-World Data: Robust Data Augmentation, Models, and Evaluation [11.007092387379076]
We develop and train take-over time (TOT) models that operate on mid and high-level features produced by computer vision algorithms operating on different driver-facing camera views. We show that a TOT model supported by augmented data can be used to produce continuous estimates of take-over times without delay.
arXiv Detail & Related papers (2021-07-27T16:39:50Z)
Imagining The Road Ahead: Multi-Agent Trajectory Prediction via Differentiable Simulation [17.953880589741438]
We develop a deep generative model built on a fully differentiable simulator for trajectory prediction. We achieve state-of-the-art results on the INTERACTION dataset, using standard neural architectures and a standard variational training objective. We name our model ITRA, for "Imagining the Road Ahead"
arXiv Detail & Related papers (2021-04-22T17:48:08Z)
A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN [59.57221522897815]
We propose a neural network model based on trajectories information for driving behavior recognition. We evaluate the proposed model on the public BLVD dataset, achieving a satisfying performance.
arXiv Detail & Related papers (2021-03-01T06:47:29Z)
Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data. We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.