MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and
Guided Intention Querying
- URL: http://arxiv.org/abs/2306.17770v2
- Date: Sat, 9 Mar 2024 08:56:22 GMT
- Title: MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and
Guided Intention Querying
- Authors: Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
- Abstract summary: Motion prediction is crucial for autonomous driving systems to understand complex driving scenarios and make informed decisions.
In this paper, we propose Motion TRansformer (MTR) frameworks to address these challenges.
The initial MTR framework utilizes a transformer encoder-decoder structure with learnable intention queries.
We introduce an advanced MTR++ framework, extending the capability of MTR to simultaneously predict multimodal motion for multiple agents.
- Score: 110.83590008788745
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Motion prediction is crucial for autonomous driving systems to understand
complex driving scenarios and make informed decisions. However, this task is
challenging due to the diverse behaviors of traffic participants and complex
environmental contexts. In this paper, we propose Motion TRansformer (MTR)
frameworks to address these challenges. The initial MTR framework utilizes a
transformer encoder-decoder structure with learnable intention queries,
enabling efficient and accurate prediction of future trajectories. By
customizing intention queries for distinct motion modalities, MTR improves
multimodal motion prediction while reducing reliance on dense goal candidates.
The framework comprises two essential processes: global intention localization,
which identifies the agent's intent to improve overall efficiency, and local
movement refinement, which adaptively refines predicted trajectories for greater
accuracy. Moreover, we introduce an advanced MTR++ framework, extending the
capability of MTR to simultaneously predict multimodal motion for multiple
agents. MTR++ incorporates symmetric context modeling and mutually-guided
intention querying modules to facilitate future behavior interaction among
multiple agents, resulting in scene-compliant future trajectories. Extensive
experimental results demonstrate that the MTR framework achieves
state-of-the-art performance on highly competitive motion prediction
benchmarks, while the MTR++ framework surpasses its predecessor, exhibiting
enhanced performance and efficiency in predicting accurate multimodal future
trajectories for multiple agents.
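The intention-query idea from the abstract can be sketched in a few lines. The sketch below is a minimal illustration, not the paper's implementation: the function name, shapes, and single-head dot-product attention are assumptions. Each of K learnable query vectors represents one motion mode; it attends over encoded scene tokens and a linear head regresses one future trajectory per mode.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decode_with_intention_queries(scene_ctx, queries, w_traj, horizon):
    """Hypothetical single-step decoder with learnable intention queries.

    scene_ctx: (num_tokens, d) encoded scene features from some encoder
    queries:   (K, d) learnable intention queries, one per motion mode
    w_traj:    (d, horizon * 2) regression head weights
    Returns:   (K, horizon, 2) one (x, y) trajectory per mode
    """
    d = queries.shape[1]
    # Cross-attention: each intention query attends over all scene tokens.
    attn = softmax(queries @ scene_ctx.T / np.sqrt(d))   # (K, num_tokens)
    attended = attn @ scene_ctx                          # (K, d)
    # Regress `horizon` waypoints of (x, y) per mode.
    return (attended @ w_traj).reshape(-1, horizon, 2)
```

Because each query is tied to a distinct motion mode, the K outputs stay multimodal without enumerating dense goal candidates; the real MTR decoder additionally refines these trajectories iteratively (local movement refinement).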
Related papers
- DeMo: Decoupling Motion Forecasting into Directional Intentions and Dynamic States [6.856351850183536]
We introduce DeMo, a framework that decouples multi-modal trajectory queries into two types.
By leveraging this format, we separately optimize the multi-modality and dynamic evolutionary properties of trajectories.
We additionally introduce combined Attention and Mamba techniques for global information aggregation and state sequence modeling.
arXiv Detail & Related papers (2024-10-08T12:27:49Z)
- Multi-scale Temporal Fusion Transformer for Incomplete Vehicle Trajectory Prediction [23.72022120344089]
Motion prediction plays an essential role in autonomous driving systems.
We propose a novel end-to-end framework for incomplete vehicle trajectory prediction.
We evaluate the proposed model on four datasets derived from highway and urban traffic scenarios.
arXiv Detail & Related papers (2024-09-02T02:36:18Z)
- Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work by 3.3% on the realism meta-metric and by 9.9% on the interaction metric.
arXiv Detail & Related papers (2023-12-07T18:53:27Z)
- MotionLM: Multi-Agent Motion Forecasting as Language Modeling [15.317827804763699]
We present MotionLM, a language model for multi-agent motion prediction.
Our approach avoids post-hoc interaction modeling, in which individual agent trajectories are generated before interactive scoring.
The model's sequential factorization enables temporally causal conditional rollouts.
arXiv Detail & Related papers (2023-09-28T15:46:25Z)
- Traj-MAE: Masked Autoencoders for Trajectory Prediction [69.7885837428344]
Trajectory prediction has been a crucial task in building a reliable autonomous driving system by anticipating possible dangers.
We propose an efficient masked autoencoder for trajectory prediction (Traj-MAE) that better represents the complicated behaviors of agents in the driving environment.
Our experimental results in both multi-agent and single-agent settings demonstrate that Traj-MAE achieves competitive results with state-of-the-art methods.
arXiv Detail & Related papers (2023-03-12T16:23:27Z)
- JFP: Joint Future Prediction with Interactive Multi-Agent Modeling for Autonomous Driving [12.460224193998362]
We propose an end-to-end trainable model that learns directly the interaction between pairs of agents in a structured, graphical model formulation.
Our approach improves significantly on the trajectory overlap metrics while obtaining on-par or better performance on single-agent trajectory metrics.
arXiv Detail & Related papers (2022-12-16T20:59:21Z)
- Motion Transformer with Global Intention Localization and Local Movement Refinement [103.75625476231401]
Motion TRansformer (MTR) models motion prediction as the joint optimization of global intention localization and local movement refinement.
MTR achieves state-of-the-art performance on both the marginal and joint motion prediction challenges.
arXiv Detail & Related papers (2022-09-27T16:23:14Z)
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
- Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.