Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction
- URL: http://arxiv.org/abs/2002.08945v1
- Date: Thu, 20 Feb 2020 18:50:44 GMT
- Title: Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction
- Authors: Bingbin Liu, Ehsan Adeli, Zhangjie Cao, Kuan-Hui Lee, Abhijeet Shenoi,
Adrien Gaidon, Juan Carlos Niebles
- Abstract summary: Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework to uncover relationships among different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not crossing the street, is a crucial piece of information for autonomous vehicles.
- Score: 57.56466850377598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reasoning over visual data is a desirable capability for robotics and
vision-based applications. Such reasoning enables forecasting of the next
events or actions in videos. In recent years, various models have been
developed based on convolution operations for prediction or forecasting, but
they lack the ability to reason over spatiotemporal data and infer the
relationships of different objects in the scene. In this paper, we present a
framework based on graph convolution to uncover the spatiotemporal
relationships in the scene for reasoning about pedestrian intent. A scene graph
is built on top of segmented object instances within and across video frames.
Pedestrian intent, defined as the future action of crossing or not-crossing the
street, is a very crucial piece of information for autonomous vehicles to
navigate safely and more smoothly. We approach the problem of intent prediction
from two different perspectives and anticipate the intention-to-cross within
both pedestrian-centric and location-centric scenarios. In addition, we
introduce a new dataset designed specifically for autonomous-driving scenarios
in areas with dense pedestrian populations: the Stanford-TRI Intent Prediction
(STIP) dataset. Our experiments on STIP and another benchmark dataset show that
our graph modeling framework is able to predict the intention-to-cross of the
pedestrians with an accuracy of 79.10% on STIP and 79.28% on the Joint
Attention for Autonomous Driving (JAAD) dataset up to one second earlier than
when the actual crossing happens. These results outperform the baseline and
previous work. Please refer to http://stip.stanford.edu/ for the dataset and
code.
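To make the graph-convolution idea in the abstract concrete, here is a minimal sketch of one generic graph-convolution step (symmetric-normalized adjacency, in the style of standard GCN layers) applied to a small pedestrian-centric graph. It is not the authors' implementation: the star-shaped graph, the function names (`normalize_adjacency`, `graph_conv`, `star_graph`), the feature sizes, and the random features are all assumptions made for illustration. The actual model and code are at the URL given in the abstract.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by standard GCN layers."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so each node keeps its own features
    deg = a_hat.sum(axis=1)              # degrees are >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(features, adj, weight):
    """One graph-convolution step: ReLU(A_norm @ X @ W)."""
    return np.maximum(normalize_adjacency(adj) @ features @ weight, 0.0)


def star_graph(num_objects):
    """Hypothetical pedestrian-centric graph: node 0 is the pedestrian,
    connected to every other object node (an assumption for this sketch)."""
    n = num_objects + 1
    adj = np.zeros((n, n))
    adj[0, 1:] = 1.0
    adj[1:, 0] = 1.0
    return adj


# Toy example: one pedestrian plus three scene objects, 8-dim features per node.
rng = np.random.default_rng(0)
adj = star_graph(num_objects=3)
features = rng.normal(size=(4, 8))       # placeholder per-node appearance features
weight = rng.normal(size=(8, 2))         # untrained layer weights
out = graph_conv(features, adj, weight)  # row 0 is the updated pedestrian node
```

In a pedestrian-centric setting, the updated row for node 0 would be the representation passed on to a crossing / not-crossing classifier. Temporal links across frames would add further edges to `adj`, which this sketch does not model.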
Related papers
- HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention [76.37139809114274]
HPNet is a novel dynamic trajectory forecasting method.
We propose a Historical Prediction Attention module to automatically encode the dynamic relationship between successive predictions.
Our code is available at https://github.com/XiaolongTang23/HPNet.
arXiv Detail & Related papers (2024-04-09T14:42:31Z)
- JRDB-Traj: A Dataset and Benchmark for Trajectory Forecasting in Crowds [79.00975648564483]
Trajectory forecasting models, employed in fields such as robotics, autonomous vehicles, and navigation, face challenges in real-world scenarios.
This dataset provides comprehensive data, including the locations of all agents, scene images, and point clouds, all from the robot's perspective.
The objective is to predict the future positions of agents relative to the robot using raw sensory input data.
arXiv Detail & Related papers (2023-11-05T18:59:31Z)
- Pedestrian Stop and Go Forecasting with Hybrid Feature Fusion [87.77727495366702]
We introduce the new task of pedestrian stop and go forecasting.
Considering the lack of suitable existing datasets for it, we release TRANS, a benchmark for explicitly studying the stop and go behaviors of pedestrians in urban traffic.
We build it from several existing datasets annotated with pedestrians' walking motions, in order to have various scenarios and behaviors.
arXiv Detail & Related papers (2022-03-04T18:39:31Z)
- LOKI: Long Term and Key Intentions for Trajectory Prediction [22.097307597204736]
Recent advances in trajectory prediction have shown that explicit reasoning about agents' intent is important to accurately forecast their motion.
We propose LOKI (LOng term and Key Intentions), a novel large-scale dataset that is designed to tackle joint trajectory and intention prediction.
We show our method outperforms state-of-the-art trajectory prediction methods by up to 27% and also provide a baseline for frame-wise intention estimation.
arXiv Detail & Related papers (2021-08-18T16:57:03Z)
- Is attention to bounding boxes all you need for pedestrian action prediction? [1.3999481573773074]
We present a framework based on multiple variations of the Transformer models to reason attentively about the dynamic evolution of the pedestrians' past trajectory.
We prove that using only bounding boxes as input to our model can outperform the previous state-of-the-art models.
Our model has similarly reached high accuracy (91%) and F1-score (0.91) on this dataset.
arXiv Detail & Related papers (2021-07-16T17:47:32Z)
- Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting [91.69900691029908]
We advocate for predicting both the individual motions as well as the scene occupancy map.
We propose a Scene-Actor Graph Neural Network (SA-GNN) which preserves the relative spatial information of pedestrians.
On two large-scale real-world datasets, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods.
arXiv Detail & Related papers (2021-01-07T06:08:21Z)
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- Graph-SIM: A Graph-based Spatiotemporal Interaction Modelling for Pedestrian Action Prediction [10.580548257913843]
We propose a novel graph-based model for predicting pedestrian crossing action.
We introduce a new dataset that provides 3D bounding box and pedestrian behavioural annotations for the existing nuScenes dataset.
Our approach achieves state-of-the-art performance by improving on various metrics by more than 15% in comparison to existing methods.
arXiv Detail & Related papers (2020-12-03T18:28:27Z)
- Pedestrian Intention Prediction: A Multi-task Perspective [83.7135926821794]
In order to be globally deployed, autonomous cars must guarantee the safety of pedestrians.
This work tries to solve this problem by jointly predicting the intention and visual states of pedestrians.
The method is a recurrent neural network in a multi-task learning approach.
arXiv Detail & Related papers (2020-10-20T13:42:31Z)
- STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction [24.855059537779294]
We present a novel end-to-end two-stage network: Spatio-Temporal-Interactive Network (STINet).
In addition to 3D geometry of pedestrians, we model temporal information for each of the pedestrians.
Our method predicts both current and past locations in the first stage, so that each pedestrian can be linked across frames.
arXiv Detail & Related papers (2020-05-08T18:43:01Z)