TrajFusionNet: Pedestrian Crossing Intention Prediction via Fusion of Sequential and Visual Trajectory Representations
- URL: http://arxiv.org/abs/2508.19866v1
- Date: Wed, 27 Aug 2025 13:29:15 GMT
- Title: TrajFusionNet: Pedestrian Crossing Intention Prediction via Fusion of Sequential and Visual Trajectory Representations
- Authors: François G. Landry, Moulay A. Akhloufi
- Abstract summary: TrajFusionNet is a transformer-based model for predicting pedestrian crossing intention. It learns from a sequential representation of the observed and predicted pedestrian trajectory and vehicle speed. It achieves state-of-the-art results across the three most commonly used datasets for pedestrian crossing intention prediction.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the introduction of vehicles with autonomous capabilities on public roads, predicting pedestrian crossing intention has emerged as an active area of research. The task of predicting pedestrian crossing intention involves determining whether pedestrians in the scene are likely to cross the road or not. In this work, we propose TrajFusionNet, a novel transformer-based model that combines future pedestrian trajectory and vehicle speed predictions as priors for predicting crossing intention. TrajFusionNet comprises two branches: a Sequence Attention Module (SAM) and a Visual Attention Module (VAM). The SAM branch learns from a sequential representation of the observed and predicted pedestrian trajectory and vehicle speed. Complementarily, the VAM branch enables learning from a visual representation of the predicted pedestrian trajectory by overlaying predicted pedestrian bounding boxes onto scene images. By utilizing a small number of lightweight modalities, TrajFusionNet achieves the lowest total inference time (including model runtime and data preprocessing) among current state-of-the-art approaches. In terms of performance, it achieves state-of-the-art results across the three most commonly used datasets for pedestrian crossing intention prediction.
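The abstract describes the VAM branch as learning from scene images with the predicted pedestrian bounding boxes overlaid on them. The snippet below is a minimal sketch of that overlay step only, not the authors' implementation. The function name `overlay_boxes`, the `(x1, y1, x2, y2)` pixel box format, and the outline color and thickness are all assumptions made for illustration.

```python
import numpy as np


def overlay_boxes(image, boxes, color=(255, 0, 0), thickness=2):
    """Return a copy of `image` with a rectangle outline drawn for each box.

    Hypothetical helper: `boxes` is assumed to hold (x1, y1, x2, y2)
    pixel coordinates, one entry per predicted future time step.
    """
    out = image.copy()
    h, w = out.shape[:2]
    t = thickness
    for x1, y1, x2, y2 in boxes:
        # Clip to the image so out-of-frame predictions do not raise errors.
        x1, x2 = sorted((int(np.clip(x1, 0, w - 1)), int(np.clip(x2, 0, w - 1))))
        y1, y2 = sorted((int(np.clip(y1, 0, h - 1)), int(np.clip(y2, 0, h - 1))))
        out[y1:y1 + t, x1:x2 + 1] = color                   # top edge
        out[max(y2 - t + 1, 0):y2 + 1, x1:x2 + 1] = color   # bottom edge
        out[y1:y2 + 1, x1:x1 + t] = color                   # left edge
        out[y1:y2 + 1, max(x2 - t + 1, 0):x2 + 1] = color   # right edge
    return out


# Usage with made-up data: a blank frame and three predicted boxes
# drifting to the right, as a pedestrian stepping toward the road might.
scene = np.zeros((120, 160, 3), dtype=np.uint8)
predicted = [(40, 30, 60, 80), (46, 32, 66, 82), (52, 34, 72, 84)]
overlaid = overlay_boxes(scene, predicted)
```

Drawing outlines rather than filled boxes keeps the underlying scene pixels visible, which seems in keeping with the idea of a visual trajectory representation. How the paper actually renders the boxes is not stated in the abstract.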
Related papers
- Multi-Vehicle Trajectory Prediction at Intersections using State and Intention Information [50.40632021583213]
Traditional approaches to prediction of future trajectory of road agents rely on knowing information about their past trajectory.
This work instead relies on having knowledge of the current state and intended direction to make predictions for multiple vehicles at intersections.
Message passing of this information between the vehicles provides each one of them a more holistic overview of the environment.
arXiv Detail & Related papers (2023-01-06T15:13:23Z)
- Pedestrian Stop and Go Forecasting with Hybrid Feature Fusion [87.77727495366702]
We introduce the new task of pedestrian stop and go forecasting.
Considering the lack of suitable existing datasets for it, we release TRANS, a benchmark for explicitly studying the stop and go behaviors of pedestrians in urban traffic.
We build it from several existing datasets annotated with pedestrians' walking motions, in order to have various scenarios and behaviors.
arXiv Detail & Related papers (2022-03-04T18:39:31Z)
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- Pedestrian Intention Prediction: A Multi-task Perspective [83.7135926821794]
In order to be globally deployed, autonomous cars must guarantee the safety of pedestrians.
This work tries to solve this problem by jointly predicting the intention and visual states of pedestrians.
The method is a recurrent neural network in a multi-task learning approach.
arXiv Detail & Related papers (2020-10-20T13:42:31Z)
- Vehicle Trajectory Prediction in Crowded Highway Scenarios Using Bird Eye View Representations and CNNs [0.0]
This paper describes a novel approach to perform vehicle trajectory predictions employing graphic representations.
The problem is framed as an image-to-image regression problem, training the network to learn the underlying relations between the traffic participants.
The model has been tested in highway scenarios with more than 30 vehicles simultaneously in two opposite traffic flow streams.
arXiv Detail & Related papers (2020-08-26T11:15:49Z)
- TNT: Target-driveN Trajectory Prediction [76.21200047185494]
We develop a target-driven trajectory prediction framework for moving agents.
We benchmark it on trajectory prediction of vehicles and pedestrians.
We outperform the state of the art on Argoverse Forecasting, INTERACTION, Stanford Drone, and an in-house Pedestrian-at-Intersection dataset.
arXiv Detail & Related papers (2020-08-19T06:52:46Z)
- Probabilistic Crowd GAN: Multimodal Pedestrian Trajectory Prediction using a Graph Vehicle-Pedestrian Attention Network [12.070251470948772]
We show how Probabilistic Crowd GAN can output probabilistic multimodal predictions.
We also propose the use of Graph Vehicle-Pedestrian Attention Network (GVAT), which models social interactions.
We demonstrate improvements over existing state-of-the-art methods for trajectory prediction and illustrate how the true multimodal and uncertain nature of crowd interactions can be directly modelled.
arXiv Detail & Related papers (2020-06-23T11:25:16Z)
- TPNet: Trajectory Proposal Network for Motion Prediction [81.28716372763128]
Trajectory Proposal Network (TPNet) is a novel two-stage motion prediction framework.
TPNet first generates a candidate set of future trajectories as hypothesis proposals, then makes the final predictions by classifying and refining the proposals.
Experiments on four large-scale trajectory prediction datasets show that TPNet achieves state-of-the-art results both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-04-26T00:01:49Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework that uncovers relationships among different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not crossing the street, is a crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)