Scene Transformer: A unified multi-task model for behavior prediction
and planning
- URL: http://arxiv.org/abs/2106.08417v1
- Date: Tue, 15 Jun 2021 20:20:44 GMT
- Authors: Jiquan Ngiam, Benjamin Caine, Vijay Vasudevan, Zhengdong Zhang,
Hao-Tien Lewis Chiang, Jeffrey Ling, Rebecca Roelofs, Alex Bewley, Chenxi
Liu, Ashish Venugopal, David Weiss, Ben Sapp, Zhifeng Chen, Jonathon Shlens
- Abstract summary: We formulate a model for predicting the behavior of all agents jointly in real-world driving environments.
Inspired by recent language modeling approaches, we use a masking strategy as the query to our model.
We evaluate our approach on autonomous driving datasets for behavior prediction, and achieve state-of-the-art performance.
- Score: 42.758178896204036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predicting the future motion of multiple agents is necessary for planning in
dynamic environments. This task is challenging for autonomous driving since
agents (e.g., vehicles and pedestrians) and their associated behaviors may be
diverse and influence each other. Most prior work has focused on first
predicting independent futures for each agent based on all past motion, and
then planning against these independent predictions. However, planning against
fixed predictions can suffer from the inability to represent the future
interaction possibilities between different agents, leading to sub-optimal
planning. In this work, we formulate a model for predicting the behavior of all
agents jointly in real-world driving environments in a unified manner. Inspired
by recent language modeling approaches, we use a masking strategy as the query
to our model, enabling one to invoke a single model to predict agent behavior
in many ways, such as potentially conditioned on the goal or full future
trajectory of the autonomous vehicle or the behavior of other agents in the
environment. Our model architecture fuses heterogeneous world state in a
unified Transformer architecture by employing attention across road elements,
agent interactions and time steps. We evaluate our approach on autonomous
driving datasets for behavior prediction, and achieve state-of-the-art
performance. Our work demonstrates that formulating the problem of behavior
prediction in a unified architecture with a masking strategy may allow us to
have a single model that can perform multiple motion prediction and planning
related tasks effectively.
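The masking-as-query idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that a scene is a grid of agents by time steps and that a boolean mask marks which entries the model may observe (True) and which it must predict (False). The function names, the task labels, and the convention that the autonomous vehicle is agent 0 are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# mask[agent][time]: True = observed input, False = hidden, to be predicted.
Mask = List[List[bool]]


def build_mask(num_agents: int, num_steps: int, num_history: int,
               task: str, av_index: int = 0) -> Mask:
    """Build a visibility mask for one of three illustrative query types.

    All names and task labels here are assumptions, not the paper's API.
    """
    if not 0 < num_history <= num_steps:
        raise ValueError("num_history must be in the range 1..num_steps")
    if not 0 <= av_index < num_agents:
        raise ValueError("av_index must refer to an existing agent")

    # Shared base for every query: the past of all agents is observed.
    mask = [[t < num_history for t in range(num_steps)]
            for _ in range(num_agents)]

    if task == "motion_prediction":
        pass  # nothing beyond the shared history is revealed
    elif task == "conditional_on_av_future":
        # Reveal the full future trajectory of the autonomous vehicle.
        mask[av_index] = [True] * num_steps
    elif task == "goal_conditioned":
        # Reveal only the final state (the goal) of the autonomous vehicle.
        mask[av_index][num_steps - 1] = True
    else:
        raise ValueError(f"unknown task: {task}")
    return mask


def hidden_entries(mask: Mask) -> List[Tuple[int, int]]:
    """List the (agent, time) pairs the model is asked to predict."""
    return [(a, t) for a, row in enumerate(mask)
            for t, visible in enumerate(row) if not visible]


if __name__ == "__main__":
    for task in ("motion_prediction", "conditional_on_av_future",
                 "goal_conditioned"):
        m = build_mask(num_agents=3, num_steps=5, num_history=2, task=task)
        print(task, len(hidden_entries(m)))
```

Under these assumptions, switching tasks changes only the mask while the model and its inputs stay the same, which is the sense in which a single model could serve several prediction and planning queries.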
Related papers
- PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving [57.89801036693292]
PPAD (Iterative Interaction of Prediction and Planning Autonomous Driving) considers the timestep-wise interaction to better integrate prediction and planning.
We design ego-to-agent, ego-to-map, and ego-to-BEV interaction mechanisms with hierarchical dynamic key objects attention to better model the interactions.
arXiv Detail & Related papers (2023-11-14T11:53:24Z)
- Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models [162.21629604674388]
This work presents a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model.
Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.
arXiv Detail & Related papers (2022-04-05T17:58:18Z)
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
- Reactive motion planning with probabilistic safety guarantees [27.91467018272684]
This paper considers the problem of motion planning in environments with multiple uncontrolled agents.
A predictive model of the uncontrolled agents is trained to predict all possible trajectories within a short horizon based on the scenario.
The proposed approach is demonstrated in simulation in a scenario emulating autonomous highway driving.
arXiv Detail & Related papers (2020-11-06T20:37:15Z)
- What-If Motion Prediction for Autonomous Driving [58.338520347197765]
Viable solutions must account for both the static geometric context, such as road lanes, and dynamic social interactions arising from multiple actors.
We propose a recurrent graph-based attentional approach with interpretable geometric (actor-lane) and social (actor-actor) relationships.
Our model can produce diverse predictions conditioned on hypothetical or "what-if" road lanes and multi-actor interactions.
arXiv Detail & Related papers (2020-08-24T17:49:30Z)
- Multimodal Deep Generative Models for Trajectory Prediction: A Conditional Variational Autoencoder Approach [34.70843462687529]
We provide a self-contained tutorial on a conditional variational autoencoder approach to human behavior prediction.
The goals of this tutorial paper are to review and build a taxonomy of state-of-the-art methods in human behavior prediction.
arXiv Detail & Related papers (2020-08-10T03:18:27Z)
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
- Social-WaGDAT: Interaction-aware Trajectory Prediction via Wasserstein Graph Double-Attention Network [29.289670231364788]
In this paper, we propose a generic generative neural system for multi-agent trajectory prediction.
We also employ an efficient kinematic constraint layer applied to vehicle trajectory prediction.
The proposed system is evaluated on three public benchmark datasets for trajectory prediction.
arXiv Detail & Related papers (2020-02-14T20:11:13Z)
- Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data [37.176411554794214]
Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation.
We present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents.
We demonstrate its performance on several challenging real-world trajectory forecasting datasets.
arXiv Detail & Related papers (2020-01-09T16:47:17Z)