Decoder Fusion RNN: Context and Interaction Aware Decoders for
Trajectory Prediction
- URL: http://arxiv.org/abs/2108.05814v1
- Date: Thu, 12 Aug 2021 15:53:37 GMT
- Title: Decoder Fusion RNN: Context and Interaction Aware Decoders for
Trajectory Prediction
- Authors: Edoardo Mello Rella (1), Jan-Nico Zaech (1), Alexander Liniger (1),
Luc Van Gool (1 and 2) ((1) Computer Vision Lab, ETH Zürich, (2) PSI, KU
Leuven)
- Abstract summary: We propose a recurrent, attention-based approach for motion forecasting.
Decoder Fusion RNN (DF-RNN) is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder.
We demonstrate the efficacy of our method by testing it on the Argoverse motion forecasting dataset and show its state-of-the-art performance on the public benchmark.
- Score: 53.473846742702854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Forecasting the future behavior of all traffic agents in the vicinity is a
key task to achieve safe and reliable autonomous driving systems. It is a
challenging problem as agents adjust their behavior depending on their
intentions, the others' actions, and the road layout. In this paper, we propose
Decoder Fusion RNN (DF-RNN), a recurrent, attention-based approach for motion
forecasting. Our network is composed of a recurrent behavior encoder, an
inter-agent multi-headed attention module, and a context-aware decoder. We
design a map encoder that embeds polyline segments, combines them to create a
graph structure, and merges their relevant parts with the agents' embeddings.
We fuse the encoded map information with further inter-agent interactions only
inside the decoder and propose to use explicit training as a method to
effectively utilize the information available. We demonstrate the efficacy of
our method by testing it on the Argoverse motion forecasting dataset and show
its state-of-the-art performance on the public benchmark.
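The inter-agent multi-headed attention module named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes identity query/key/value projections (no learned weights), plain scaled dot-product attention, and hypothetical helper names (`softmax`, `attention_weights`, `multi_head_attention`) and toy embeddings chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Scaled dot-product attention weights, one row per query agent."""
    scale = math.sqrt(len(keys[0]))
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        for q in queries
    ]


def multi_head_attention(embeddings, num_heads):
    """Mix agent embeddings head by head.

    Assumption: identity projections stand in for the learned
    query/key/value matrices a trained model would use.
    """
    dim = len(embeddings[0])
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    head_dim = dim // num_heads
    fused = [[] for _ in embeddings]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        # Each head only sees its own slice of every agent embedding.
        head = [e[lo:hi] for e in embeddings]
        weights = attention_weights(head, head)
        for i, row in enumerate(weights):
            mixed = [
                sum(w * v[d] for w, v in zip(row, head))
                for d in range(head_dim)
            ]
            # Concatenate the per-head results back into one vector.
            fused[i].extend(mixed)
    return fused


# Three hypothetical agent embeddings of size 4, split across 2 heads.
agents = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
fused = multi_head_attention(agents, num_heads=2)
```

Each row of `fused` is a weighted mixture of all agents' embeddings, which is how attention lets one agent's representation depend on the others. In the paper's design this interaction step is additionally combined with encoded map information inside the decoder, which the sketch does not attempt to model.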
Related papers
- SocialFormer: Social Interaction Modeling with Edge-enhanced Heterogeneous Graph Transformers for Trajectory Prediction [3.733790302392792]
SocialFormer is an agent interaction-aware trajectory prediction method.
We present a temporal encoder based on gated recurrent units (GRU) to model the temporal social behavior of agent movements.
We evaluate SocialFormer for the trajectory prediction task on the popular nuScenes benchmark and achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-05-06T19:47:23Z)
- Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding [121.08841110022607]
Existing agent-centric methods have demonstrated outstanding performance on public benchmarks.
We introduce the K-nearest neighbor attention with relative pose encoding (KNARPE), a novel attention mechanism allowing the pairwise-relative representation to be used by Transformers.
By sharing contexts among agents and reusing the unchanged contexts, our approach is as efficient as scene-centric methods, while performing on par with state-of-the-art agent-centric methods.
arXiv Detail & Related papers (2023-10-19T17:59:01Z)
- Fusion-GRU: A Deep Learning Model for Future Bounding Box Prediction of Traffic Agents in Risky Driving Videos [20.923004256768635]
Fusion-Gated Recurrent Unit (Fusion-GRU) is a novel encoder-decoder architecture for future bounding box localization.
The proposed method is evaluated on two publicly available datasets, ROL and HEV-I.
arXiv Detail & Related papers (2023-08-12T18:35:59Z)
- SIMMF: Semantics-aware Interactive Multiagent Motion Forecasting for Autonomous Vehicle Driving [2.7195102129095003]
We propose a semantic-aware Interactive Multiagent Motion Forecasting (SIMMF) method to capture semantics along with spatial information.
Specifically, we achieve this by implementing a semantic-aware selection of relevant agents from the scene and passing them through an attention mechanism.
Our results show that the proposed approach outperforms state-of-the-art baselines and provides more accurate and scene-consistent predictions.
arXiv Detail & Related papers (2023-06-26T17:54:24Z)
- GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting [121.42898228997538]
We propose an efficient shared encoding for all agents and the map without sacrificing accuracy or generalization.
We leverage pair-wise relative positional encodings to represent geometric relationships between the agents and the map elements in a heterogeneous spatial graph.
Our decoder is also viewpoint agnostic, predicting agent goals on the lane graph to enable diverse and context-aware multimodal prediction.
arXiv Detail & Related papers (2022-11-04T16:10:50Z)
- LaneRCNN: Distributed Representations for Graph-Centric Motion Forecasting [104.8466438967385]
LaneRCNN is a graph-centric motion forecasting model.
We learn a local lane graph representation per actor to encode its past motions and the local map topology.
We parameterize the output trajectories based on lane graphs, a more amenable prediction parameterization.
arXiv Detail & Related papers (2021-01-17T11:54:49Z)
- Traffic Agent Trajectory Prediction Using Social Convolution and Attention Mechanism [57.68557165836806]
We propose a model to predict the trajectories of target agents around an autonomous vehicle.
We encode the target agent history trajectories as an attention mask and construct a social map to encode the interactive relationship between the target agent and its surrounding agents.
To verify the effectiveness of our method, we compare it extensively against several methods on a public dataset, achieving a 20% reduction in error.
arXiv Detail & Related papers (2020-07-06T03:48:08Z)
- VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation [74.56282712099274]
This paper introduces VectorNet, a hierarchical graph neural network that exploits the spatial locality of individual road components represented by vectors.
By operating on the vectorized high definition (HD) maps and agent trajectories, we avoid lossy rendering and computationally intensive ConvNet encoding steps.
We evaluate VectorNet on our in-house behavior prediction benchmark and the recently released Argoverse forecasting dataset.
arXiv Detail & Related papers (2020-05-08T19:07:03Z)