SemanticFormer: Holistic and Semantic Traffic Scene Representation for Trajectory Prediction using Knowledge Graphs
- URL: http://arxiv.org/abs/2404.19379v3
- Date: Mon, 1 Jul 2024 04:51:21 GMT
- Title: SemanticFormer: Holistic and Semantic Traffic Scene Representation for Trajectory Prediction using Knowledge Graphs
- Authors: Zhigang Sun, Zixu Wang, Lavdim Halilaj, Juergen Luettin,
- Abstract summary: Trajectory prediction in autonomous driving relies on accurate representation of all relevant contexts of the driving scene.
We present SemanticFormer, an approach for predicting multimodal trajectories by reasoning over a traffic scene graph.
- Score: 3.733790302392792
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Trajectory prediction in autonomous driving relies on accurate representation of all relevant contexts of the driving scene, including traffic participants, road topology, traffic signs, as well as their semantic relations to each other. Despite increased attention to this issue, most approaches in trajectory prediction do not consider all of these factors sufficiently. We present SemanticFormer, an approach for predicting multimodal trajectories by reasoning over a semantic traffic scene graph using a hybrid approach. It utilizes high-level information in the form of meta-paths, i.e. trajectories on which an agent is allowed to drive, from a knowledge graph, which is then processed by a novel pipeline based on multiple attention mechanisms to predict accurate trajectories. SemanticFormer comprises a hierarchical heterogeneous graph encoder to capture spatio-temporal and relational information across agents as well as between agents and road elements. Further, it includes a predictor to fuse different encodings and decode trajectories with probabilities. Finally, a refinement module assesses permitted meta-paths of trajectories and speed profiles to obtain the final predicted trajectories. Evaluation on the nuScenes benchmark demonstrates improved performance compared to several SOTA methods. In addition, we demonstrate that our knowledge graph can be easily added to two existing graph-based SOTA methods, namely VectorNet and Laformer, replacing their original homogeneous graphs. The evaluation results suggest that by adding our knowledge graph, the performance of the original methods is enhanced by 5% and 4%, respectively.
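As a reading aid only: the sketch below renders the three stages the abstract names (heterogeneous scene-graph encoding with attention, multimodal decoding with confidences, and meta-path-based refinement) as a minimal PyTorch example. All class names, tensor shapes, and the distance-based meta-path check are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SceneGraphEncoder(nn.Module):
    """Attention over a scene graph that mixes agent and map (road-element) nodes."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.agent_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.map_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, agent_feats, map_feats):
        # agent_feats: [B, A, D] agent nodes; map_feats: [B, M, D] road-element nodes
        a, _ = self.agent_attn(agent_feats, agent_feats, agent_feats)  # agent <-> agent
        a, _ = self.map_attn(a, map_feats, map_feats)                  # agent  -> map
        return a


class MultimodalDecoder(nn.Module):
    """Predicts K trajectory modes plus a confidence per mode for each agent."""

    def __init__(self, dim: int = 64, modes: int = 6, horizon: int = 12):
        super().__init__()
        self.modes, self.horizon = modes, horizon
        self.traj_head = nn.Linear(dim, modes * horizon * 2)
        self.conf_head = nn.Linear(dim, modes)

    def forward(self, agent_enc):
        B, A, _ = agent_enc.shape
        trajs = self.traj_head(agent_enc).view(B, A, self.modes, self.horizon, 2)
        confs = self.conf_head(agent_enc).softmax(dim=-1)
        return trajs, confs


def refine_with_meta_paths(trajs, confs, meta_paths, tol=3.0):
    """Down-weight modes whose endpoints lie far from every permitted meta-path.

    meta_paths: [P, L, 2] points sampled along allowed lane sequences, a crude
    stand-in for the knowledge-graph meta-paths the abstract describes.
    """
    endpoints = trajs[..., -1, :]                              # [B, A, K, 2]
    dists = torch.cdist(endpoints.flatten(0, 2),               # [B*A*K, 2]
                        meta_paths.flatten(0, 1))              # [P*L, 2]
    min_d = dists.min(dim=-1).values.view_as(confs)            # [B, A, K]
    keep = (min_d < tol).float()
    masked = confs * keep + 1e-6
    return masked / masked.sum(dim=-1, keepdim=True)


# Toy forward pass with random features.
enc = SceneGraphEncoder()
dec = MultimodalDecoder()
agent_enc = enc(torch.randn(1, 3, 64), torch.randn(1, 20, 64))
trajs, confs = dec(agent_enc)
confs = refine_with_meta_paths(trajs, confs, meta_paths=torch.randn(4, 30, 2))
print(trajs.shape, confs.shape)   # torch.Size([1, 3, 6, 12, 2]) torch.Size([1, 3, 6])
```

In the actual model the encoder is hierarchical and heterogeneous (different relation types are treated differently) and the refinement also scores speed profiles; the sketch collapses those details into the simplest shapes that still show the data flow.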
Related papers
- Towards Consistent and Explainable Motion Prediction using Heterogeneous Graph Attention [0.17476232824732776]
This paper introduces a new refinement module designed to project the predicted trajectories back onto the actual map.
We also propose a novel scene encoder that handles all relations between agents and their environment in a single unified graph attention network.
arXiv Detail & Related papers (2024-05-16T14:31:15Z)
- nuScenes Knowledge Graph -- A comprehensive semantic representation of traffic scenes for trajectory prediction [6.23221362105447]
Trajectory prediction in traffic scenes involves accurately forecasting the behaviour of surrounding vehicles.
It is crucial to consider contextual information, including the driving path of vehicles, road topology, lane dividers, and traffic rules.
This paper presents an approach that utilizes knowledge graphs to model the diverse entities and their semantic connections within traffic scenes.
arXiv Detail & Related papers (2023-12-15T10:40:34Z)
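As a rough illustration of what modeling entities and their semantic connections can look like, the toy snippet below stores a traffic scene as subject-predicate-object triples and answers a small successor-lane query. The entity and relation names are invented for the example and do not reflect the nuScenes Knowledge Graph schema.

```python
from collections import defaultdict

# Toy traffic-scene triples: (subject, predicate, object).
triples = [
    ("car_17",  "isOnLane",        "lane_4"),
    ("lane_4",  "hasSuccessor",    "lane_9"),
    ("lane_4",  "hasLeftNeighbor", "lane_5"),
    ("stop_1",  "appliesTo",       "lane_9"),
    ("ped_03",  "isNear",          "crosswalk_2"),
]

# Index outgoing edges per subject so simple path queries are easy.
out_edges = defaultdict(list)
for s, p, o in triples:
    out_edges[s].append((p, o))


def reachable_lanes(start, depth=2):
    """Lanes reachable from `start` via hasSuccessor edges (a crude stand-in for
    the 'driving path' context mentioned in the summary above)."""
    frontier, seen = [start], {start}
    for _ in range(depth):
        nxt = []
        for node in frontier:
            for p, o in out_edges[node]:
                if p == "hasSuccessor" and o not in seen:
                    seen.add(o)
                    nxt.append(o)
        frontier = nxt
    return seen - {start}


print(reachable_lanes("lane_4"))   # {'lane_9'}
```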
- Heterogeneous Graph-based Trajectory Prediction using Local Map Context and Social Interactions [47.091620047301305]
We present a novel approach for vector-based trajectory prediction that addresses shortcomings of prior methods by leveraging three crucial sources of information.
First, we model interactions between traffic agents with a semantic scene graph that accounts for the nature and important features of their relations.
Second, we extract agent-centric image-based map features to model the local map context.
arXiv Detail & Related papers (2023-11-30T13:46:05Z)
- Pre-training on Synthetic Driving Data for Trajectory Prediction [61.520225216107306]
We propose a pipeline-level solution to mitigate the issue of data scarcity in trajectory forecasting.
We adopt HD map augmentation and trajectory synthesis for generating driving data, and then we learn representations by pre-training on them.
We conduct extensive experiments to demonstrate the effectiveness of our data expansion and pre-training strategies.
arXiv Detail & Related papers (2023-09-18T19:49:22Z)
- GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting [121.42898228997538]
We propose an efficient shared encoding for all agents and the map without sacrificing accuracy or generalization.
We leverage pair-wise relative positional encodings to represent geometric relationships between the agents and the map elements in a heterogeneous spatial graph.
Our decoder is also viewpoint agnostic, predicting agent goals on the lane graph to enable diverse and context-aware multimodal prediction.
arXiv Detail & Related papers (2022-11-04T16:10:50Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning, showing that random augmentations naturally lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- Trajectory Prediction with Graph-based Dual-scale Context Fusion [43.51107329748957]
We present a graph-based trajectory prediction network named the Dual Scale Predictor (DSP).
It encodes both the static and the dynamic driving context in a hierarchical manner.
Thanks to the proposed dual-scale context fusion network, our DSP is able to generate accurate and human-like multi-modal trajectories.
arXiv Detail & Related papers (2021-11-02T13:42:16Z)
- Semantics-STGCNN: A Semantics-guided Spatial-Temporal Graph Convolutional Network for Multi-class Trajectory Prediction [9.238700679836855]
We introduce class information into a graph convolutional neural network to better predict the trajectory of an individual.
We propose new metrics, known as Average² Displacement Error (aADE) and Average Final Displacement Error (aFDE), both building on the standard displacement errors sketched after this entry.
It consistently shows superior performance over state-of-the-art methods on both the existing and the newly proposed metrics.
arXiv Detail & Related papers (2021-08-10T15:02:50Z)
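For context, the snippet below sketches the standard displacement-error metrics and one common reading of the sample-averaged variants named above (averaging the error over all K sampled trajectories rather than keeping only the best sample); the paper itself is the authority on the exact aADE/aFDE definitions.

```python
import numpy as np


def ade(pred, gt):
    """Average Displacement Error: mean L2 error over all time steps.
    pred, gt: [T, 2] arrays of (x, y) positions."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def fde(pred, gt):
    """Final Displacement Error: L2 error at the last time step."""
    return float(np.linalg.norm(pred[-1] - gt[-1]))


def a_ade(samples, gt):
    """Average the per-sample ADE over all K sampled trajectories ([K, T, 2])."""
    return float(np.mean([ade(s, gt) for s in samples]))


def a_fde(samples, gt):
    """Average the per-sample FDE over all K sampled trajectories."""
    return float(np.mean([fde(s, gt) for s in samples]))


gt = np.cumsum(np.ones((12, 2)) * 0.5, axis=0)              # straight ground truth
samples = gt[None] + np.random.normal(0, 0.3, (20, 12, 2))   # 20 noisy predictions
print(a_ade(samples, gt), a_fde(samples, gt))
```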
- SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction [64.16212996247943]
We present a Sparse Graph Convolution Network (SGCN) for pedestrian trajectory prediction.
Specifically, the SGCN explicitly models sparse directed interactions with a sparse directed spatial graph to capture adaptive interactions between pedestrians.
Visualizations indicate that our method can capture adaptive interactions between pedestrians and their effective motion tendencies.
arXiv Detail & Related papers (2021-04-04T03:17:42Z)
- Spatio-Temporal Graph Dual-Attention Network for Multi-Agent Prediction and Tracking [23.608125748229174]
We propose a generic generative neural system for multi-agent trajectory prediction involving heterogeneous agents.
The proposed system is evaluated on three public benchmark datasets for trajectory prediction.
arXiv Detail & Related papers (2021-02-18T02:25:35Z)
- VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation [74.56282712099274]
This paper introduces VectorNet, a hierarchical graph neural network that exploits the spatial locality of individual road components represented by vectors.
By operating on the vectorized high definition (HD) maps and agent trajectories, we avoid lossy rendering and computationally intensive ConvNet encoding steps.
We evaluate VectorNet on our in-house behavior prediction benchmark and the recently released Argoverse forecasting dataset.
arXiv Detail & Related papers (2020-05-08T19:07:03Z)
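To make the vectorized representation concrete, the toy snippet below turns a lane polyline into per-segment vectors and pools them into a single polyline feature. The feature layout and the shared-weight max-pooling are illustrative stand-ins rather than VectorNet's exact subgraph design.

```python
import numpy as np


def polyline_to_vectors(points, polyline_id, attribute=0.0):
    """points: [N, 2] ordered polyline -> [N-1, 6] segment vectors
    laid out as (start_x, start_y, end_x, end_y, attribute, polyline_id)."""
    starts, ends = points[:-1], points[1:]
    n = len(starts)
    attrs = np.full((n, 1), attribute)
    ids = np.full((n, 1), float(polyline_id))
    return np.hstack([starts, ends, attrs, ids])


def polyline_feature(vectors, weight):
    """Embed each segment with a shared linear map, then max-pool over the
    polyline (a crude stand-in for a polyline subgraph)."""
    embedded = np.maximum(vectors @ weight, 0.0)      # shared linear layer + ReLU
    return embedded.max(axis=0)                       # permutation-invariant pool


lane = np.array([[0.0, 0.0], [5.0, 0.1], [10.0, 0.4], [15.0, 1.0]])
vecs = polyline_to_vectors(lane, polyline_id=3, attribute=1.0)   # lane-type flag
rng = np.random.default_rng(0)
print(polyline_feature(vecs, rng.normal(size=(6, 16))).shape)    # (16,)
```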
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.