Pedestrian Action Anticipation using Contextual Feature Fusion in
Stacked RNNs
- URL: http://arxiv.org/abs/2005.06582v1
- Date: Wed, 13 May 2020 20:59:37 GMT
- Title: Pedestrian Action Anticipation using Contextual Feature Fusion in
Stacked RNNs
- Authors: Amir Rasouli, Iuliia Kotseruba, John K. Tsotsos
- Abstract summary: We propose a solution for the problem of pedestrian action anticipation at the point of crossing.
Our approach uses a novel stacked RNN architecture in which information collected from various sources, both scene dynamics and visual features, is gradually fused into the network.
- Score: 19.13270454742958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the major challenges for autonomous vehicles in urban environments is
to understand and predict other road users' actions, in particular, pedestrians
at the point of crossing. The common approach to solving this problem is to use
the motion history of the agents to predict their future trajectories. However,
pedestrians exhibit highly variable actions most of which cannot be understood
without visual observation of the pedestrians themselves and their
surroundings. To this end, we propose a solution for the problem of pedestrian
action anticipation at the point of crossing. Our approach uses a novel stacked
RNN architecture in which information collected from various sources, both
scene dynamics and visual features, is gradually fused into the network at
different levels of processing. We show, via extensive empirical evaluations,
that the proposed algorithm achieves a higher prediction accuracy compared to
alternative recurrent network architectures. We conduct experiments to
investigate the impact of the length of observation, time to event and types of
features on the performance of the proposed method. Finally, we demonstrate how
different data fusion strategies impact prediction accuracy.
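The gradual fusion described in the abstract can be pictured with a minimal NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: it uses a plain tanh RNN with random, untrained weights, and the names `rnn_layer` and `stacked_fusion`, the feature dimensions, and the number of frames are all hypothetical. Each level of the stack receives the hidden states of the level below, concatenated with one additional feature stream.

```python
import numpy as np

def rnn_layer(inputs, hidden_size, rng):
    """Run a plain tanh RNN over inputs of shape (T, D); return states of shape (T, H)."""
    T, D = inputs.shape
    W_x = rng.normal(scale=0.1, size=(D, hidden_size))
    W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    b = np.zeros(hidden_size)
    h = np.zeros(hidden_size)
    states = []
    for t in range(T):
        h = np.tanh(inputs[t] @ W_x + h @ W_h + b)
        states.append(h)
    return np.stack(states)

def stacked_fusion(feature_streams, hidden_size=16, seed=0):
    """Fuse one stream per level: level k sees [states of level k-1, stream k]."""
    rng = np.random.default_rng(seed)
    states = None
    for stream in feature_streams:
        if states is None:
            level_input = stream
        else:
            level_input = np.concatenate([states, stream], axis=1)
        states = rnn_layer(level_input, hidden_size, rng)
    # Sigmoid read-out on the last hidden state of the top level (untrained weights).
    w_out = rng.normal(scale=0.1, size=hidden_size)
    logit = states[-1] @ w_out
    return 1.0 / (1.0 + np.exp(-logit))

T = 15  # hypothetical number of observed frames
rng = np.random.default_rng(1)
streams = [
    rng.normal(size=(T, 32)),  # stand-in for visual features
    rng.normal(size=(T, 4)),   # stand-in for a scene-dynamics stream
    rng.normal(size=(T, 1)),   # stand-in for a second scene-dynamics stream
]
p_cross = stacked_fusion(streams)  # value in (0, 1); meaningless until trained
```

The order in which streams enter the stack is a design choice; the sketch fixes one order arbitrarily.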
Related papers
- Knowledge-aware Graph Transformer for Pedestrian Trajectory Prediction [15.454206825258169]
Predicting pedestrian motion trajectories is crucial for path planning and motion control of autonomous vehicles.
Recent deep learning-based prediction approaches mainly utilize information like trajectory history and interactions between pedestrians.
This paper proposes a graph transformer structure to improve prediction performance.
arXiv Detail & Related papers (2024-01-10T01:50:29Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- Behavioral Intention Prediction in Driving Scenes: A Survey [70.53285924851767]
Behavioral Intention Prediction (BIP) simulates a human consideration process and fulfills the early prediction of specific behaviors.
This work provides a comprehensive review of BIP from the available datasets, key factors and challenges, pedestrian-centric and vehicle-centric BIP approaches, and BIP-aware applications.
arXiv Detail & Related papers (2022-11-01T11:07:37Z)
- PedFormer: Pedestrian Behavior Prediction via Cross-Modal Attention Modulation and Gated Multitask Learning [10.812772606528172]
We propose a novel framework that relies on different data modalities to predict future trajectories and crossing actions of pedestrians from an ego-centric perspective.
We show that our model improves on the state of the art in trajectory and action prediction by up to 22% and 13%, respectively, on various metrics.
arXiv Detail & Related papers (2022-10-14T15:12:00Z)
- Pedestrian Stop and Go Forecasting with Hybrid Feature Fusion [87.77727495366702]
We introduce the new task of pedestrian stop and go forecasting.
Considering the lack of suitable existing datasets for it, we release TRANS, a benchmark for explicitly studying the stop and go behaviors of pedestrians in urban traffic.
We build it from several existing datasets annotated with pedestrians' walking motions, in order to have various scenarios and behaviors.
arXiv Detail & Related papers (2022-03-04T18:39:31Z)
- Context-Aware Scene Prediction Network (CASPNet) [3.390468002706074]
We jointly learn and predict the motion of all road users in a scene using a novel convolutional neural network (CNN) and recurrent neural network (RNN) based architecture.
Our approach reaches state-of-the-art results in the prediction benchmark.
arXiv Detail & Related papers (2022-01-18T12:52:01Z)
- Predicting Pedestrian Crossing Intention with Feature Fusion and Spatio-Temporal Attention [0.0]
Pedestrian crossing intention should be recognized in real-time for urban driving.
Recent works have shown the potential of using vision-based deep neural network models for this task.
This work introduces a neural network architecture to fuse inherently different spatio-temporal features for pedestrian crossing intention prediction.
arXiv Detail & Related papers (2021-04-12T14:10:25Z)
- Multi-Modal Hybrid Architecture for Pedestrian Action Prediction [14.032334569498968]
We propose a novel multi-modal prediction algorithm that incorporates different sources of information captured from the environment to predict future crossing actions of pedestrians.
Using the existing 2D pedestrian behavior benchmarks and a newly annotated 3D driving dataset, we show that our proposed model achieves state-of-the-art performance in pedestrian crossing prediction.
arXiv Detail & Related papers (2020-11-16T15:17:58Z)
- Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework to uncover relationships among different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not crossing the street, is a crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)