ProphNet: Efficient Agent-Centric Motion Forecasting with
Anchor-Informed Proposals
- URL: http://arxiv.org/abs/2303.12071v3
- Date: Wed, 28 Jun 2023 22:25:32 GMT
- Title: ProphNet: Efficient Agent-Centric Motion Forecasting with
Anchor-Informed Proposals
- Authors: Xishun Wang, Tong Su, Fang Da, Xiaodong Yang
- Abstract summary: Motion forecasting is a key module in an autonomous driving system.
Due to the heterogeneous nature of multi-sourced input, multimodality in agent behavior, and low latency required by onboard deployment, this task is notoriously challenging.
This paper proposes a novel agent-centric model with anchor-informed proposals for efficient multimodal motion prediction.
- Score: 6.927103549481412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motion forecasting is a key module in an autonomous driving system. Due to
the heterogeneous nature of multi-sourced input, multimodality in agent
behavior, and low latency required by onboard deployment, this task is
notoriously challenging. To cope with these difficulties, this paper proposes a
novel agent-centric model with anchor-informed proposals for efficient
multimodal motion prediction. We design a modality-agnostic strategy to
concisely encode the complex input in a unified manner. We generate diverse
proposals, fused with anchors bearing goal-oriented scene context, to induce
multimodal prediction that covers a wide range of future trajectories. Our
network architecture is highly uniform and succinct, leading to an efficient
model amenable for real-world driving deployment. Experiments reveal that our
agent-centric network compares favorably with the state-of-the-art methods in
prediction accuracy, while achieving scene-centric level inference latency.
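The abstract describes proposals fused with goal-oriented anchors to induce multimodal prediction, but gives no implementation details. A minimal NumPy sketch of that general idea, fusing learned proposal embeddings with anchor features before decoding K trajectory modes, might look like the following (all names, dimensions, and the fusion rule are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

K, D, T = 6, 32, 30  # trajectory modes, embedding dim, prediction horizon (steps)

def decode_multimodal(agent_feat, anchors, W_prop, W_fuse, W_dec):
    """Fuse K learned proposals with goal-oriented anchors, decode K trajectories.

    agent_feat: (D,)   encoded agent/scene context
    anchors:    (K, D) anchor embeddings carrying goal-oriented scene context
    Returns:    (K, T, 2) candidate future trajectories (x, y per step)
    """
    proposals = (W_prop @ agent_feat).reshape(K, D)    # K diverse proposals
    fused = np.tanh((proposals + anchors) @ W_fuse)    # anchor-informed proposals
    return (fused @ W_dec).reshape(K, T, 2)            # per-mode trajectory decoding

agent_feat = rng.normal(size=D)
anchors = rng.normal(size=(K, D))
W_prop = rng.normal(size=(K * D, D)) * 0.1
W_fuse = rng.normal(size=(D, D)) * 0.1
W_dec = rng.normal(size=(D, T * 2)) * 0.1

out = decode_multimodal(agent_feat, anchors, W_prop, W_fuse, W_dec)
print(out.shape)  # (6, 30, 2): K trajectory modes over T steps
```

In a real model the weights would be learned end-to-end and the decoder would be a deeper network; the point of the sketch is only the data flow from proposals plus anchors to a set of diverse trajectory candidates.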
Related papers
- DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout.
DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z)
- MotionLM: Multi-Agent Motion Forecasting as Language Modeling [15.317827804763699]

We present MotionLM, a language model for multi-agent motion prediction.
Our approach bypasses post-hoc interactions where individual agent trajectory generation is conducted prior to interactive scoring.
The model's sequential factorization enables temporally causal conditional rollouts.
arXiv Detail & Related papers (2023-09-28T15:46:25Z)
- MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying [110.83590008788745]
Motion prediction is crucial for autonomous driving systems to understand complex driving scenarios and make informed decisions.
In this paper, we propose Motion TRansformer (MTR) frameworks to address these challenges.
The initial MTR framework utilizes a transformer encoder-decoder structure with learnable intention queries.
We introduce an advanced MTR++ framework, extending the capability of MTR to simultaneously predict multimodal motion for multiple agents.
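The summary above mentions learnable intention queries in a transformer encoder-decoder, without further detail. A toy sketch of how such a query mechanism operates, with each learnable query cross-attending to encoded scene tokens to gather mode-specific context, could look like this (single head, no learned projections; all shapes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
Q, N, D = 6, 50, 32  # intention queries, scene tokens, embedding dim

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def intention_cross_attention(queries, scene):
    """One cross-attention step: each intention query gathers the scene
    context relevant to its motion mode (single head, no projections)."""
    attn = softmax(queries @ scene.T / np.sqrt(D))  # (Q, N) attention weights
    return attn @ scene                             # (Q, D) mode-specific features

intention_queries = rng.normal(size=(Q, D))  # learnable parameters in a real model
scene_tokens = rng.normal(size=(N, D))       # encoded map + agent features
mode_feats = intention_cross_attention(intention_queries, scene_tokens)
print(mode_feats.shape)  # (6, 32)
```

Each of the Q query outputs would then feed a trajectory head, so the queries act as slots that specialize to distinct motion intentions during training.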
arXiv Detail & Related papers (2023-06-30T16:23:04Z)
- SIMMF: Semantics-aware Interactive Multiagent Motion Forecasting for Autonomous Vehicle Driving [2.7195102129095003]
We propose a Semantics-aware Interactive Multiagent Motion Forecasting (SIMMF) method to capture semantics along with spatial information.
Specifically, we achieve this by implementing a semantics-aware selection of relevant agents from the scene and passing them through an attention mechanism.
Our results show that the proposed approach outperforms state-of-the-art baselines and provides more accurate and scene-consistent predictions.
arXiv Detail & Related papers (2023-06-26T17:54:24Z)
- Traj-MAE: Masked Autoencoders for Trajectory Prediction [69.7885837428344]
Trajectory prediction has been a crucial task in building a reliable autonomous driving system by anticipating possible dangers.
We propose an efficient masked autoencoder for trajectory prediction (Traj-MAE) that better represents the complicated behaviors of agents in the driving environment.
Our experimental results in both multi-agent and single-agent settings demonstrate that Traj-MAE achieves competitive results with state-of-the-art methods.
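The masked-autoencoder idea above (hide parts of a trajectory, train a model to reconstruct them) can be illustrated with a toy NumPy example; here linear interpolation stands in for the learned decoder, and the masking ratio and shapes are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(2)
T, mask_ratio = 20, 0.5

traj = np.cumsum(rng.normal(size=(T, 2)), axis=0)  # a toy (x, y) trajectory

# Mask a random subset of timesteps, as a masked autoencoder would during training
n_mask = int(T * mask_ratio)
masked_idx = rng.choice(T, size=n_mask, replace=False)
visible = np.ones(T, dtype=bool)
visible[masked_idx] = False

# A real encoder sees only the visible steps; the decoder reconstructs the rest.
# Linear interpolation over visible steps stands in for the learned decoder here.
recon = np.empty_like(traj)
for d in range(2):
    recon[:, d] = np.interp(np.arange(T), np.arange(T)[visible], traj[visible, d])

mse_on_masked = float(np.mean((recon[masked_idx] - traj[masked_idx]) ** 2))
print(visible.sum(), n_mask)  # 10 visible steps, 10 masked steps
```

The reconstruction loss is computed only on the masked positions, which is what forces the model to learn the structure of agent behavior rather than copy its input.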
arXiv Detail & Related papers (2023-03-12T16:23:27Z)
- Wayformer: Motion Forecasting via Simple & Efficient Attention Networks [16.031530911221534]
We present Wayformer, a family of attention based architectures for motion forecasting that are simple and homogeneous.
For each fusion type we explore strategies to tradeoff efficiency and quality via factorized attention or latent query attention.
We show that early fusion, despite its simplicity of construction, is not only modality-agnostic but also achieves state-of-the-art results on both the Waymo Open Motion Dataset (WOMD) and Argoverse leaderboards.
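Early fusion, as described above, concatenates tokens from all input modalities into one sequence before attention, so a single self-attention layer can mix them freely. A minimal sketch (single head, no learned projections; token counts and dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
D = 32  # shared embedding dimension across modalities

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def early_fusion_attention(modalities):
    """Early fusion: concatenate all modality tokens into one sequence and
    let one self-attention layer mix them (single head, no projections)."""
    tokens = np.concatenate(modalities, axis=0)      # (sum of token counts, D)
    attn = softmax(tokens @ tokens.T / np.sqrt(D))   # every token attends to all
    return attn @ tokens

agents = rng.normal(size=(8, D))    # agent-history tokens
roadmap = rng.normal(size=(40, D))  # map tokens
traffic = rng.normal(size=(4, D))   # traffic-signal tokens
fused = early_fusion_attention([agents, roadmap, traffic])
print(fused.shape)  # (52, 32)
```

Because fusion happens before any modality-specific processing, adding or removing an input modality only changes the token count, which is the sense in which the approach is modality-agnostic.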
arXiv Detail & Related papers (2022-07-12T21:19:04Z)
- Multimodal Motion Prediction with Stacked Transformers [35.9674180611893]
We propose a novel transformer framework for multimodal motion prediction, termed as mmTransformer.
A novel network architecture based on stacked transformers is designed to model the multimodality at feature level with a set of fixed independent proposals.
A region-based training strategy is then developed to induce the multimodality of the generated proposals.
arXiv Detail & Related papers (2021-03-22T07:25:54Z)
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
- DSDNet: Deep Structured self-Driving Network [92.9456652486422]
We propose the Deep Structured self-Driving Network (DSDNet), which performs object detection, motion prediction, and motion planning with a single neural network.
We develop a deep structured energy based model which considers the interactions between actors and produces socially consistent multimodal future predictions.
arXiv Detail & Related papers (2020-08-13T17:54:06Z)
- TPNet: Trajectory Proposal Network for Motion Prediction [81.28716372763128]
Trajectory Proposal Network (TPNet) is a novel two-stage motion prediction framework.
TPNet first generates a candidate set of future trajectories as hypothesis proposals, then makes the final predictions by classifying and refining the proposals.
Experiments on four large-scale trajectory prediction datasets, show that TPNet achieves the state-of-the-art results both quantitatively and qualitatively.
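The two-stage propose-then-refine scheme described above can be sketched in a few lines; everything here (goal sampling, the straight-line proposal generator, the toy scorer, and the identity refinement) is a hypothetical stand-in, not TPNet's actual design:

```python
import numpy as np

rng = np.random.default_rng(3)
M, T = 20, 12  # number of candidate proposals, prediction horizon (steps)

def straight_line_to(goal):
    """Hypothetical proposal generator: interpolate from origin to a goal."""
    steps = np.linspace(0, 1, T)[:, None]
    return steps * goal[None, :]  # (T, 2) trajectory

def two_stage_predict(goal_candidates, score_fn, refine_fn, top_k=3):
    """Stage 1 generates hypothesis trajectories toward candidate goals;
    stage 2 classifies (scores) them and refines the top-scoring ones."""
    proposals = [straight_line_to(g) for g in goal_candidates]   # stage 1
    scores = np.array([score_fn(p) for p in proposals])          # stage 2a
    keep = np.argsort(scores)[::-1][:top_k]                      # best first
    return [refine_fn(proposals[i]) for i in keep]               # stage 2b

goals = rng.normal(scale=10.0, size=(M, 2))  # sampled end-point candidates
score = lambda p: -np.linalg.norm(p[-1] - np.array([5.0, 5.0]))  # toy scorer
refine = lambda p: p + 0.0  # identity stand-in for a learned refinement head

preds = two_stage_predict(goals, score, refine)
print(len(preds), preds[0].shape)  # 3 (12, 2)
```

Splitting prediction into proposal generation and classification/refinement lets the second stage focus on ranking and correcting a small, diverse hypothesis set instead of regressing trajectories from scratch.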
arXiv Detail & Related papers (2020-04-26T00:01:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.