CCF: Cross Correcting Framework for Pedestrian Trajectory Prediction
- URL: http://arxiv.org/abs/2406.00749v1
- Date: Sun, 2 Jun 2024 14:07:13 GMT
- Title: CCF: Cross Correcting Framework for Pedestrian Trajectory Prediction
- Authors: Pranav Singh Chib, Pravendra Singh
- Abstract summary: We propose a Cross-Correction Framework (CCF) to better learn representations of pedestrian trajectories.
CCF consists of two prediction models, which are trained with both cross-correction loss and trajectory prediction loss.
Each model uses a transformer-based encoder-decoder architecture to capture motion and social interaction among pedestrians.
- Score: 7.9449756510822915
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Accurately predicting future pedestrian trajectories is crucial across various domains. Due to the uncertainty in future pedestrian trajectories, it is important to learn complex spatio-temporal representations in multi-agent scenarios. To address this, we propose a novel Cross-Correction Framework (CCF) to learn spatio-temporal representations of pedestrian trajectories better. Our framework consists of two trajectory prediction models, known as subnets, which share the same architecture and are trained with both cross-correction loss and trajectory prediction loss. Cross-correction leverages the learning from both subnets and enables them to refine their underlying representations of trajectories through a mutual correction mechanism. Specifically, we use the cross-correction loss to learn how to correct each other through an inter-subnet interaction. To induce diverse learning among the subnets, we use the transformed observed trajectories produced by a neural network as input to one subnet and the original observed trajectories as input to the other subnet. We utilize transformer-based encoder-decoder architecture for each subnet to capture motion and social interaction among pedestrians. The encoder of the transformer captures motion patterns in trajectories, while the decoder focuses on pedestrian interactions with neighbors. Each subnet performs the primary task of predicting future trajectories (a regression task) along with the secondary task of classifying the predicted trajectories (a classification task). Extensive experiments on real-world benchmark datasets such as ETH-UCY and SDD demonstrate the efficacy of our proposed framework, CCF, in precisely predicting pedestrian future trajectories. We also conducted several ablation experiments to demonstrate the effectiveness of various modules and loss functions used in our approach.
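The abstract describes two subnets trained with a trajectory prediction loss plus a cross-correction loss, but it does not give the formula for either. The sketch below is only one plausible reading, under stated assumptions: the prediction loss is taken to be the mean Euclidean distance to the ground-truth trajectory, and the cross term is a simple disagreement penalty between the two subnets' predictions. The names `mean_l2`, `combined_loss`, and `cross_weight` are hypothetical, and the secondary classification task is omitted. It should not be read as the authors' actual cross-correction loss.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def mean_l2(a: Trajectory, b: Trajectory) -> float:
    """Average Euclidean distance between matching time steps."""
    if len(a) != len(b) or not a:
        raise ValueError("trajectories must be non-empty and equally long")
    total = 0.0
    for (ax, ay), (bx, by) in zip(a, b):
        total += ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
    return total / len(a)


def combined_loss(
    pred_a: Trajectory,
    pred_b: Trajectory,
    target: Trajectory,
    cross_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Sum of per-subnet prediction losses and an assumed cross term."""
    # Trajectory prediction loss: each subnet is compared with ground truth.
    prediction_term = mean_l2(pred_a, target) + mean_l2(pred_b, target)
    # Assumed stand-in for cross-correction: penalize subnet disagreement.
    cross_term = mean_l2(pred_a, pred_b)
    return prediction_term + cross_weight * cross_term


if __name__ == "__main__":
    target = [(0.0, 0.0), (1.0, 0.0)]
    pred_a = [(0.0, 0.0), (1.0, 0.0)]  # subnet fed the original trajectory
    pred_b = [(0.0, 3.0), (1.0, 4.0)]  # subnet fed the transformed trajectory
    print(combined_loss(pred_a, pred_b, target))
```

Setting `cross_weight` to 0 reduces the sketch to two independently trained predictors, which is a convenient baseline for seeing how much the coupling term contributes.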
Related papers
- Cross-Domain Transfer Learning using Attention Latent Features for Multi-Agent Trajectory Prediction [4.292918274985369]
We propose a novel spatial-temporal trajectory prediction framework that performs cross-domain adaptation on the attention representation of a Transformer-based model.
A graph convolutional network is also integrated to construct dynamic graph feature embeddings that accurately model the complex spatial-temporal interactions between the multi-agent vehicles.
arXiv Detail & Related papers (2024-11-09T06:39:44Z) - Dual-Path Adversarial Lifting for Domain Shift Correction in Online Test-time Adaptation [59.18151483767509]
We introduce dual-path token lifting for domain shift correction in test-time adaptation.
We then perform dual-path lifting with interleaved token prediction and update between the path of domain shift tokens and the path of class tokens.
Experimental results on the benchmark datasets demonstrate that our proposed method significantly improves the online fully test-time domain adaptation performance.
arXiv Detail & Related papers (2024-08-26T02:33:47Z) - Knowledge-aware Graph Transformer for Pedestrian Trajectory Prediction [15.454206825258169]
Predicting pedestrian motion trajectories is crucial for path planning and motion control of autonomous vehicles.
Recent deep learning-based prediction approaches mainly utilize information like trajectory history and interactions between pedestrians.
This paper proposes a graph transformer structure to improve prediction performance.
arXiv Detail & Related papers (2024-01-10T01:50:29Z) - PedFormer: Pedestrian Behavior Prediction via Cross-Modal Attention Modulation and Gated Multitask Learning [10.812772606528172]
We propose a novel framework that relies on different data modalities to predict future trajectories and crossing actions of pedestrians from an ego-centric perspective.
We show that our model improves on the state of the art in trajectory and action prediction by up to 22% and 13%, respectively, on various metrics.
arXiv Detail & Related papers (2022-10-14T15:12:00Z) - Adaptive Trajectory Prediction via Transferable GNN [74.09424229172781]
We propose a novel Transferable Graph Neural Network (T-GNN) framework, which jointly conducts trajectory prediction as well as domain alignment in a unified framework.
Specifically, a domain invariant GNN is proposed to explore the structural motion knowledge where the domain specific knowledge is reduced.
An attention-based adaptive knowledge learning module is further proposed to explore fine-grained individual-level feature representation for knowledge transfer.
arXiv Detail & Related papers (2022-03-09T21:08:47Z) - Pedestrian Trajectory Prediction via Spatial Interaction Transformer Network [7.150832716115448]
In traffic scenes, when encountering oncoming people, pedestrians may make sudden turns or stop immediately.
To predict such unpredictable trajectories, we can gain insights into the interaction between pedestrians.
We present a novel generative method named Spatial Interaction Transformer (SIT), which learns the correlation of pedestrian trajectories through attention mechanisms.
arXiv Detail & Related papers (2021-12-13T13:08:04Z) - SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction [64.16212996247943]
We present a Sparse Graph Convolution Network (SGCN) for pedestrian trajectory prediction.
Specifically, the SGCN explicitly models the sparse directed interaction with a sparse directed spatial graph to capture adaptive interactions between pedestrians.
Visualizations indicate that our method can capture adaptive interactions between pedestrians and their effective motion tendencies.
arXiv Detail & Related papers (2021-04-04T03:17:42Z) - Congestion-aware Multi-agent Trajectory Prediction for Collision Avoidance [110.63037190641414]
We propose to learn congestion patterns explicitly and devise a novel "Sense--Learn--Reason--Predict" framework.
By decomposing the learning phases into two stages, a "student" can learn contextual cues from a "teacher" while generating collision-free trajectories.
In experiments, we demonstrate that the proposed model is able to generate collision-free trajectory predictions in a synthetic dataset.
arXiv Detail & Related papers (2021-03-26T02:42:33Z) - Domain Adaptive Robotic Gesture Recognition with Unsupervised Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and inherent correlations across modalities for gesture recognition.
Results show that our approach recovers performance with large gains, up to 12.91% in accuracy (ACC) and 20.16% in F1 score, without using any annotations from the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z) - End-to-end Contextual Perception and Prediction with Interaction Transformer [79.14001602890417]
We tackle the problem of detecting objects in 3D and forecasting their future motion in the context of self-driving.
To capture their spatial-temporal dependencies, we propose a recurrent neural network with a novel Transformer architecture.
Our model can be trained end-to-end, and runs in real-time.
arXiv Detail & Related papers (2020-08-13T14:30:12Z) - AMENet: Attentive Maps Encoder Network for Trajectory Prediction [35.22312783822563]
Trajectory prediction is critical for applications of planning safe future movements.
We propose an end-to-end generative model named Attentive Maps Encoder Network (AMENet).
AMENet encodes the agent's motion and interaction information for accurate and realistic multi-path trajectory prediction.
arXiv Detail & Related papers (2020-06-15T10:00:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.