GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting
- URL: http://arxiv.org/abs/2211.02545v1
- Date: Fri, 4 Nov 2022 16:10:50 GMT
- Title: GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting
- Authors: Alexander Cui, Sergio Casas, Kelvin Wong, Simon Suo, Raquel Urtasun
- Abstract summary: We propose an efficient shared encoding for all agents and the map without sacrificing accuracy or generalization.
We leverage pair-wise relative positional encodings to represent geometric relationships between the agents and the map elements in a heterogeneous spatial graph.
Our decoder is also viewpoint agnostic, predicting agent goals on the lane graph to enable diverse and context-aware multimodal prediction.
- Score: 121.42898228997538
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of motion forecasting is critical for self-driving vehicles (SDVs)
to be able to plan a safe maneuver. Towards this goal, modern approaches reason
about the map, the agents' past trajectories and their interactions in order to
produce accurate forecasts. The predominant approach has been to encode the map
and other agents in the reference frame of each target agent. However, this
approach is computationally expensive for multi-agent prediction as inference
needs to be run for each agent. To tackle the scaling challenge, the solution
thus far has been to encode all agents and the map in a shared coordinate frame
(e.g., the SDV frame). However, this is sample inefficient and vulnerable to
domain shift (e.g., when the SDV visits uncommon states). In contrast, in this
paper, we propose an efficient shared encoding for all agents and the map
without sacrificing accuracy or generalization. Towards this goal, we leverage
pair-wise relative positional encodings to represent geometric relationships
between the agents and the map elements in a heterogeneous spatial graph. This
parameterization allows us to be invariant to scene viewpoint, and save online
computation by re-using map embeddings computed offline. Our decoder is also
viewpoint agnostic, predicting agent goals on the lane graph to enable diverse
and context-aware multimodal prediction. We demonstrate the effectiveness of
our approach on the urban Argoverse 2 benchmark as well as a novel highway
dataset.
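To make the mechanism concrete, below is a minimal sketch of a pair-wise relative positional encoding over 2D poses (position plus heading). The function name and feature layout are illustrative assumptions, not GoRela's implementation; what matters is that the features depend only on relative geometry, so any global rotation or translation of the scene leaves them unchanged.

```python
import numpy as np

def relative_pose_encoding(pos: np.ndarray, heading: np.ndarray) -> np.ndarray:
    """Pair-wise relative positional features between all N scene nodes.

    pos:     (N, 2) node positions in an arbitrary global frame.
    heading: (N,)   node headings in radians, same frame.
    Returns  (N, N, 4): for each (receiver i, sender j) pair, the
    displacement from i to j expressed in i's local frame, plus the
    relative heading encoded as (sin, cos).
    """
    delta = pos[None, :, :] - pos[:, None, :]         # (N, N, 2): sender minus receiver
    c, s = np.cos(heading), np.sin(heading)           # receiver orientations
    local_x = c[:, None] * delta[..., 0] + s[:, None] * delta[..., 1]
    local_y = -s[:, None] * delta[..., 0] + c[:, None] * delta[..., 1]
    dtheta = heading[None, :] - heading[:, None]      # (N, N) relative headings
    return np.stack([local_x, local_y, np.sin(dtheta), np.cos(dtheta)], axis=-1)

# sanity check: rotating and translating the whole scene leaves the features unchanged
pos = np.random.randn(5, 2)
head = np.random.rand(5) * 2 * np.pi
phi, shift = 1.3, np.array([10.0, -3.0])
R = np.array([[np.cos(phi), -np.sin(phi)], [np.sin(phi), np.cos(phi)]])
assert np.allclose(relative_pose_encoding(pos, head),
                   relative_pose_encoding(pos @ R.T + shift, head + phi))
```

Because the encoding is viewpoint-invariant, features between map elements can be computed once offline and reused online regardless of where the SDV observes the scene, which is the saving the abstract refers to.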
Related papers
- Towards Consistent and Explainable Motion Prediction using Heterogeneous Graph Attention [0.17476232824732776]
This paper introduces a new refinement module designed to project the predicted trajectories back onto the actual map (a minimal sketch follows this entry).
We also propose a novel scene encoder that handles all relations between agents and their environment in a single unified graph attention network.
arXiv Detail & Related papers (2024-05-16T14:31:15Z)
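As an illustration of the projection idea in this entry, the sketch below snaps each predicted waypoint onto the nearest point of a lane-centerline polyline. It is a plausible guess at what such a refinement step could look like under simple geometric assumptions, not the paper's module.

```python
import numpy as np

def project_to_polyline(points: np.ndarray, polyline: np.ndarray) -> np.ndarray:
    """Snap each 2D point onto its closest location on a polyline.

    points:   (T, 2) predicted waypoints.
    polyline: (M, 2) lane-centerline vertices, M >= 2.
    Returns   (T, 2) projected waypoints.
    """
    a, b = polyline[:-1], polyline[1:]                  # (M-1, 2) segment endpoints
    ab = b - a
    denom = np.maximum((ab ** 2).sum(-1), 1e-9)         # squared segment lengths
    # parameter of each point's orthogonal projection onto each segment, clamped
    t = ((points[:, None, :] - a[None]) * ab[None]).sum(-1) / denom
    t = np.clip(t, 0.0, 1.0)                            # stay within the segment
    proj = a[None] + t[..., None] * ab[None]            # (T, M-1, 2) candidate points
    d2 = ((points[:, None, :] - proj) ** 2).sum(-1)     # squared distances
    return proj[np.arange(len(points)), d2.argmin(axis=1)]
```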
- SemanticFormer: Holistic and Semantic Traffic Scene Representation for Trajectory Prediction using Knowledge Graphs [3.733790302392792]
Trajectory prediction in autonomous driving relies on an accurate representation of all relevant contexts of the driving scene.
We present SemanticFormer, an approach for predicting multimodal trajectories by reasoning over a traffic scene graph.
arXiv Detail & Related papers (2024-04-30T09:11:04Z)
- ADAPT: Efficient Multi-Agent Trajectory Prediction with Adaptation [0.0]
ADAPT is a novel approach for jointly predicting the trajectories of all agents in the scene with dynamic weight learning.
Our approach outperforms state-of-the-art methods in both single-agent and multi-agent settings.
arXiv Detail & Related papers (2023-07-26T13:41:51Z)
- Decoder Fusion RNN: Context and Interaction Aware Decoders for Trajectory Prediction [53.473846742702854]
We propose a recurrent, attention-based approach for motion forecasting.
Decoder Fusion RNN (DF-RNN) is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder (sketched after this entry).
We demonstrate the efficacy of our method by testing it on the Argoverse motion forecasting dataset and show its state-of-the-art performance on the public benchmark.
arXiv Detail & Related papers (2021-08-12T15:53:37Z)
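The three-part composition described in this entry could look roughly like the following PyTorch sketch; the layer sizes, the scene-context input, and the horizon unrolling are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

class DFRNNSketch(nn.Module):
    """Rough shape of the three parts named in the summary: a recurrent
    behavior encoder, inter-agent multi-headed attention, and a
    context-aware decoder. Dimensions and wiring are assumptions."""

    def __init__(self, d_in: int = 2, d_model: int = 64,
                 n_heads: int = 4, horizon: int = 30):
        super().__init__()
        self.encoder = nn.GRU(d_in, d_model, batch_first=True)
        self.agent_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, 2)
        self.horizon = horizon

    def forward(self, hist: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # hist: (N, T_past, 2) past xy per agent; context: (N, d_model) scene features
        _, h = self.encoder(hist)              # h: (1, N, d_model) final state per agent
        fused, _ = self.agent_attn(h, h, h)    # attend across agents (agent axis = sequence)
        z = fused.transpose(0, 1) + context[:, None, :]     # (N, 1, d_model) + context
        out, _ = self.decoder(z.repeat(1, self.horizon, 1)) # unroll over the horizon
        return self.head(out)                  # (N, horizon, 2) future xy
```

For example, DFRNNSketch()(torch.randn(3, 10, 2), torch.randn(3, 64)) returns a (3, 30, 2) tensor of future waypoints for three agents.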
- LaneRCNN: Distributed Representations for Graph-Centric Motion Forecasting [104.8466438967385]
LaneRCNN is a graph-centric motion forecasting model.
We learn a local lane graph representation per actor to encode its past motions and the local map topology.
We parameterize the output trajectories based on lane graphs, a more amenable prediction parameterization.
arXiv Detail & Related papers (2021-01-17T11:54:49Z)
- Learning Lane Graph Representations for Motion Forecasting [92.88572392790623]
We construct a lane graph from raw map data to preserve the map structure.
We exploit a fusion network consisting of four types of interactions: actor-to-lane, lane-to-lane, lane-to-actor, and actor-to-actor (sketched after this entry).
Our approach significantly outperforms the state-of-the-art on the large scale Argoverse motion forecasting benchmark.
arXiv Detail & Related papers (2020-07-27T17:59:49Z)
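The four interaction types listed in this entry can be pictured as four residual message-passing blocks applied in sequence. The sketch below uses standard cross-attention as a stand-in for the paper's graph convolutions, so it captures the information flow rather than the exact operator.

```python
import torch
import torch.nn as nn

class FusionSketch(nn.Module):
    """Four interaction blocks in the order the summary lists them:
    actor-to-lane, lane-to-lane, lane-to-actor, actor-to-actor."""

    def __init__(self, d: int = 64, heads: int = 4):
        super().__init__()
        self.a2l = nn.MultiheadAttention(d, heads, batch_first=True)
        self.l2l = nn.MultiheadAttention(d, heads, batch_first=True)
        self.l2a = nn.MultiheadAttention(d, heads, batch_first=True)
        self.a2a = nn.MultiheadAttention(d, heads, batch_first=True)

    def forward(self, actors: torch.Tensor, lanes: torch.Tensor):
        # actors: (1, N_actors, d) agent features; lanes: (1, N_lanes, d) lane-node features
        lanes = lanes + self.a2l(lanes, actors, actors)[0]    # actors update lanes
        lanes = lanes + self.l2l(lanes, lanes, lanes)[0]      # lanes propagate along the map
        actors = actors + self.l2a(actors, lanes, lanes)[0]   # map context flows to actors
        actors = actors + self.a2a(actors, actors, actors)[0] # actors interact
        return actors, lanes
```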
- Traffic Agent Trajectory Prediction Using Social Convolution and Attention Mechanism [57.68557165836806]
We propose a model to predict the trajectories of target agents around an autonomous vehicle.
We encode the target agent's history trajectory as an attention mask and construct a social map to encode the interactive relationships between the target agent and its surrounding agents (see the sketch after this entry).
To verify the effectiveness of our method, we compare it extensively against several baselines on a public dataset, achieving a 20% reduction in error.
arXiv Detail & Related papers (2020-07-06T03:48:08Z)
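A common way to realize a "social map" like the one described in this entry is an occupancy grid rasterized around the target agent. The grid extent, resolution, and cell contents below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def social_map(target_xy: np.ndarray, neighbors_xy: np.ndarray,
               extent: float = 20.0, cells: int = 32) -> np.ndarray:
    """Rasterize neighbor positions into a (cells, cells) occupancy grid
    centered on the target agent, covering [-extent, extent] meters per
    axis. Cell values count the neighbors that fall inside each cell."""
    grid = np.zeros((cells, cells), dtype=np.float32)
    rel = neighbors_xy - target_xy                       # offsets from the target (K, 2)
    idx = ((rel + extent) / (2 * extent) * cells).astype(int)
    inside = ((idx >= 0) & (idx < cells)).all(axis=1)    # keep in-range neighbors only
    for ix, iy in idx[inside]:
        grid[iy, ix] += 1.0                              # row = y cell, column = x cell
    return grid

# example: two neighbors near a target at the origin
grid = social_map(np.zeros(2), np.array([[3.0, 4.0], [-6.0, 2.0]]))
```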
- VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation [74.56282712099274]
This paper introduces VectorNet, a hierarchical graph neural network that exploits the spatial locality of individual road components represented by vectors.
By operating on vectorized high-definition (HD) maps and agent trajectories, we avoid lossy rendering and computationally intensive ConvNet encoding steps (see the sketch after this entry).
We evaluate VectorNet on our in-house behavior prediction benchmark and the recently released Argoverse forecasting dataset.
arXiv Detail & Related papers (2020-05-08T19:07:03Z)
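Here, "vectorized" means that each map element or trajectory is turned into a set of small vectors rather than rasterized into an image. A minimal sketch, with an assumed attribute layout:

```python
import numpy as np

def vectorize_polyline(points: np.ndarray, polyline_id: int) -> np.ndarray:
    """Turn an (M, 2) polyline into M-1 vectors of the form
    [x_start, y_start, x_end, y_end, polyline_id], the kind of
    vectorized input a VectorNet-style encoder consumes (the
    attribute layout here is an illustrative assumption)."""
    starts, ends = points[:-1], points[1:]
    ids = np.full((len(starts), 1), polyline_id, dtype=points.dtype)
    return np.concatenate([starts, ends, ids], axis=1)    # (M-1, 5)

# example: a lane centerline and an agent's past trajectory share one format
lane = vectorize_polyline(np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 1.0]]), 0)
past = vectorize_polyline(np.array([[1.0, -2.0], [2.0, -2.0], [3.0, -1.5]]), 1)
```

Because lanes and trajectories share one vector format, a single graph network can encode both, which is what removes the rasterization and ConvNet steps.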
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.