Graph-SIM: A Graph-based Spatiotemporal Interaction Modelling for
Pedestrian Action Prediction
- URL: http://arxiv.org/abs/2012.02148v3
- Date: Thu, 25 Mar 2021 14:12:10 GMT
- Title: Graph-SIM: A Graph-based Spatiotemporal Interaction Modelling for
Pedestrian Action Prediction
- Authors: Tiffany Yau, Saber Malekmohammadi, Amir Rasouli, Peter Lakner, Mohsen
Rohani, Jun Luo
- Abstract summary: We propose a novel graph-based model for predicting pedestrian crossing action.
We introduce a new dataset that provides 3D bounding box and pedestrian behavioural annotations for the existing nuScenes dataset.
Our approach achieves state-of-the-art performance by improving on various metrics by more than 15% in comparison to existing methods.
- Score: 10.580548257913843
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most crucial yet challenging tasks for autonomous vehicles in
urban environments is predicting the future behaviour of nearby pedestrians,
especially at points of crossing. Predicting behaviour depends on many social
and environmental factors, particularly interactions between road users.
Capturing such interactions requires a global view of the scene and dynamics of
the road users in three-dimensional space. This information, however, is
missing from the current pedestrian behaviour benchmark datasets. Motivated by
these challenges, we propose 1) a novel graph-based model for predicting
pedestrian crossing action, which models pedestrians' interactions with nearby
road users through clustering and relative importance weighting of interactions
using features obtained from the bird's-eye view, and 2) a new dataset that
provides 3D bounding box and pedestrian behavioural annotations for the
existing nuScenes dataset. On the new data, our approach
achieves state-of-the-art performance by improving on various metrics by more
than 15% in comparison to existing methods. The dataset is available at
https://github.com/huawei-noah/datasets/PePScenes.
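The abstract names the two interaction-modelling ingredients, clustering of nearby road users and relative importance weighting of interactions computed from bird's-eye-view (BEV) features, without giving implementation detail. The following is a minimal sketch of that idea under assumed choices (greedy distance-based clustering and an inverse-distance softmax for the weights); all function names and the scoring heuristic are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): cluster nearby road users in the
# BEV plane and weight each cluster's interaction with a target pedestrian
# by a softmax over an inverse-distance score. The clustering rule and the
# scoring heuristic below are illustrative assumptions.
import numpy as np


def cluster_by_distance(positions: np.ndarray, radius: float = 5.0) -> list[np.ndarray]:
    """Greedy single-link grouping of BEV positions (N x 2) within `radius` metres."""
    remaining = list(range(len(positions)))
    clusters = []
    while remaining:
        members = [remaining.pop(0)]
        i = 0
        while i < len(members):
            anchor = positions[members[i]]
            close = [j for j in remaining
                     if np.linalg.norm(positions[j] - anchor) <= radius]
            members.extend(close)
            remaining = [j for j in remaining if j not in close]
            i += 1
        clusters.append(np.array(members))
    return clusters


def interaction_weights(ped_xy: np.ndarray, positions: np.ndarray,
                        clusters: list[np.ndarray]) -> np.ndarray:
    """Relative importance of each cluster: softmax of negative mean distance to the pedestrian."""
    scores = np.array([-np.mean(np.linalg.norm(positions[c] - ped_xy, axis=1))
                       for c in clusters])
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    road_users = rng.uniform(-20, 20, size=(12, 2))   # BEV positions of nearby agents
    pedestrian = np.array([0.0, 0.0])                 # target pedestrian
    groups = cluster_by_distance(road_users)
    weights = interaction_weights(pedestrian, road_users, groups)
    for g, w in zip(groups, weights):
        print(f"cluster {g.tolist()} -> weight {w:.2f}")
```

In such a scheme, closer clusters receive larger weights, which is one plausible way to realise "relative importance weighting" of interactions; the actual Graph-SIM model learns these weights from data rather than using a fixed heuristic.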
Related papers
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- Pedestrian Stop and Go Forecasting with Hybrid Feature Fusion [87.77727495366702]
We introduce the new task of pedestrian stop and go forecasting.
Considering the lack of suitable existing datasets for it, we release TRANS, a benchmark for explicitly studying the stop and go behaviors of pedestrians in urban traffic.
We build it from several existing datasets annotated with pedestrians' walking motions, in order to have various scenarios and behaviors.
arXiv Detail & Related papers (2022-03-04T18:39:31Z)
- Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset [84.3946567650148]
With over 100,000 scenes, each 20 seconds long at 10 Hz, our new dataset contains more than 570 hours of unique data over 1750 km of roadways.
We use a high-accuracy 3D auto-labeling system to generate high quality 3D bounding boxes for each road agent.
We introduce a new set of metrics that provides a comprehensive evaluation of both single agent and joint agent interaction motion forecasting models.
arXiv Detail & Related papers (2021-04-20T17:19:05Z)
- SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction [64.16212996247943]
We present a Sparse Graph Convolution Network (SGCN) for pedestrian trajectory prediction.
Specifically, the SGCN explicitly models sparse directed interactions with a sparse directed spatial graph to capture adaptive interactions among pedestrians.
Visualizations indicate that our method captures adaptive interactions between pedestrians and their effective motion tendencies.
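The summary above states only the high-level idea. As an illustration of what a sparse directed interaction graph could look like (an assumed construction, not the SGCN paper's actual method), one can score pairwise influence and keep only the strongest incoming edges for each pedestrian:

```python
# Illustrative sketch only (assumed, not from the SGCN paper): build a sparse
# directed adjacency matrix by scoring pairwise influence with inverse distance
# and keeping the top-k strongest incoming edges per pedestrian.
import numpy as np


def sparse_directed_adjacency(positions: np.ndarray, k: int = 2) -> np.ndarray:
    """positions: (N, 2) BEV coordinates. Returns an (N, N) row-normalised
    adjacency where entry (i, j) is the influence of pedestrian j on pedestrian i."""
    n = len(positions)
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    score = 1.0 / (dist + 1e-6)
    np.fill_diagonal(score, 0.0)           # no self-edges
    adj = np.zeros_like(score)
    for i in range(n):
        keep = np.argsort(score[i])[-k:]   # top-k most influential neighbours of i
        adj[i, keep] = score[i, keep]
    row_sum = adj.sum(axis=1, keepdims=True)
    return adj / np.clip(row_sum, 1e-6, None)


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [1.0, 0.5], [5.0, 5.0], [5.5, 4.5]])
    print(sparse_directed_adjacency(pts, k=1))
```

Because edges are selected per row, the resulting graph can be asymmetric (j may influence i without i influencing j), which is the sense in which such interaction graphs are "directed".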
arXiv Detail & Related papers (2021-04-04T03:17:42Z)
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- Multi-Modal Hybrid Architecture for Pedestrian Action Prediction [14.032334569498968]
We propose a novel multi-modal prediction algorithm that incorporates different sources of information captured from the environment to predict future crossing actions of pedestrians.
Using the existing 2D pedestrian behavior benchmarks and a newly annotated 3D driving dataset, we show that our proposed model achieves state-of-the-art performance in pedestrian crossing prediction.
arXiv Detail & Related papers (2020-11-16T15:17:58Z)
- End-to-end Contextual Perception and Prediction with Interaction Transformer [79.14001602890417]
We tackle the problem of detecting objects in 3D and forecasting their future motion in the context of self-driving.
To capture their spatial-temporal dependencies, we propose a recurrent neural network with a novel Transformer architecture.
Our model can be trained end-to-end, and runs in real-time.
arXiv Detail & Related papers (2020-08-13T14:30:12Z)
- Graph2Kernel Grid-LSTM: A Multi-Cued Model for Pedestrian Trajectory Prediction by Learning Adaptive Neighborhoods [10.57164270098353]
We present a new perspective to interaction modeling by proposing that pedestrian neighborhoods can become adaptive in design.
Our model outperforms state-of-the-art approaches that collate similar features, evaluated on several publicly available surveillance video datasets.
arXiv Detail & Related papers (2020-07-03T19:05:48Z)
- Pedestrian Action Anticipation using Contextual Feature Fusion in Stacked RNNs [19.13270454742958]
We propose a solution for the problem of pedestrian action anticipation at the point of crossing.
Our approach uses a novel stacked RNN architecture in which information collected from various sources, both scene dynamics and visual features, is gradually fused into the network.
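The summary describes gradual fusion of feature sources into a stacked RNN but gives no architectural details. The sketch below is an assumed minimal PyTorch rendering of that pattern; the layer count, feature dimensions, and single-logit crossing head are placeholders, not the paper's configuration.

```python
# Minimal sketch (assumed layer ordering and sizes, not the paper's exact model):
# a stack of GRUs in which each level fuses one additional feature source with
# the hidden states produced by the level below.
import torch
import torch.nn as nn


class StackedFusionRNN(nn.Module):
    def __init__(self, feat_dims: list[int], hidden: int = 64):
        super().__init__()
        # One GRU per feature source; level i consumes [previous hidden, feature i].
        self.grus = nn.ModuleList()
        for i, d in enumerate(feat_dims):
            in_dim = d if i == 0 else d + hidden
            self.grus.append(nn.GRU(in_dim, hidden, batch_first=True))
        self.head = nn.Linear(hidden, 1)  # crossing / not-crossing logit

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        out = None
        for i, (gru, x) in enumerate(zip(self.grus, feats)):
            inp = x if i == 0 else torch.cat([out, x], dim=-1)
            out, _ = gru(inp)              # (B, T, hidden)
        return self.head(out[:, -1])       # predict from the last time step


if __name__ == "__main__":
    B, T = 2, 15
    sources = [torch.randn(B, T, d) for d in (16, 8, 4)]   # e.g. appearance, pose, speed
    model = StackedFusionRNN([16, 8, 4])
    print(model(sources).shape)   # torch.Size([2, 1])
```

Each GRU level receives the hidden sequence of the level below concatenated with one new feature source, so later levels see progressively richer fused context.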
arXiv Detail & Related papers (2020-05-13T20:59:37Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework that uncovers relationships among objects in the scene to reason about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)