T4P: Test-Time Training of Trajectory Prediction via Masked Autoencoder and Actor-specific Token Memory
- URL: http://arxiv.org/abs/2403.10052v1
- Date: Fri, 15 Mar 2024 06:47:14 GMT
- Title: T4P: Test-Time Training of Trajectory Prediction via Masked Autoencoder and Actor-specific Token Memory
- Authors: Daehee Park, Jaeseok Jeong, Sung-Hoon Yoon, Jaewoo Jeong, Kuk-Jin Yoon
- Abstract summary: Trajectory prediction is a challenging problem that requires considering interactions among multiple actors.
Data-driven approaches have been used to address this complex problem, but they suffer from unreliable predictions under distribution shifts during test time.
Several online learning methods have been proposed that use regression loss from the ground truth of observed data.
Our method surpasses the performance of existing state-of-the-art online learning methods in terms of both prediction accuracy and computational efficiency.
- Score: 39.021321011792786
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Trajectory prediction is a challenging problem that requires considering interactions among multiple actors and the surrounding environment. While data-driven approaches have been used to address this complex problem, they suffer from unreliable predictions under distribution shifts during test time. Accordingly, several online learning methods have been proposed using regression loss from the ground truth of observed data leveraging the auto-labeling nature of trajectory prediction task. We mainly tackle the following two issues. First, previous works underfit and overfit as they only optimize the last layer of the motion decoder. To this end, we employ the masked autoencoder (MAE) for representation learning to encourage complex interaction modeling in shifted test distribution for updating deeper layers. Second, utilizing the sequential nature of driving data, we propose an actor-specific token memory that enables the test-time learning of actor-wise motion characteristics. Our proposed method has been validated across various challenging cross-dataset distribution shift scenarios including nuScenes, Lyft, Waymo, and Interaction. Our method surpasses the performance of existing state-of-the-art online learning methods in terms of both prediction accuracy and computational efficiency. The code is available at https://github.com/daeheepark/T4P.
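The "auto-labeling" idea mentioned in the abstract can be illustrated with a toy sketch: a prediction made at time t becomes checkable once the following steps have been observed, so those observations can serve as regression targets at test time. The example below is not the T4P method (it has no masked autoencoder and no token memory); the 1-D constant-velocity model, the scalar parameter `w`, and the names `predict` and `online_adapt` are all hypothetical and chosen only for illustration.

```python
from collections import deque


def predict(history, w, horizon):
    """Toy predictor: extrapolate the last observed velocity, scaled by w."""
    v = history[-1] - history[-2]
    return [history[-1] + w * v * (k + 1) for k in range(horizon)]


def online_adapt(stream, horizon=3, lr=0.05, w=1.0):
    """Adapt w at test time using delayed, auto-labeled regression targets.

    Assumption: `stream` is a sequence of 1-D positions of a single actor.
    """
    pending = deque()  # entries: (time of prediction, last position, velocity)
    observed = []
    for t, pos in enumerate(stream):
        observed.append(pos)
        # A prediction made `horizon` steps ago is now fully observed,
        # so the last `horizon` positions act as its ground truth.
        if pending and pending[0][0] + horizon == t:
            _, p0, v = pending.popleft()
            truth = observed[-horizon:]
            # Gradient of the mean squared error with respect to w.
            grad = 0.0
            for k, y in enumerate(truth):
                step = v * (k + 1)
                grad += 2.0 * (p0 + w * step - y) * step
            w -= lr * grad / horizon
        # Queue the inputs of the prediction made at the current time step.
        if len(observed) >= 2:
            pending.append((t, pos, observed[-1] - observed[-2]))
    return w


# Usage: a mis-calibrated model (w = 0) adapts on a constant-velocity stream.
stream = [float(i) for i in range(30)]
w_adapted = online_adapt(stream, w=0.0)
future = predict(stream, w_adapted, horizon=3)
```

In this sketch only a single scalar is updated; the abstract's point is that updating only the last decoder layer is limiting, which is why the paper uses a masked-autoencoder objective to update deeper layers.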
Related papers
- Dual-Path Adversarial Lifting for Domain Shift Correction in Online Test-time Adaptation [59.18151483767509]
We introduce a dual-path token lifting for domain shift correction in test time adaptation.
We then perform dual-path lifting with interleaved token prediction and update between the path of domain shift tokens and the path of class tokens.
Experimental results on the benchmark datasets demonstrate that our proposed method significantly improves the online fully test-time domain adaptation performance.
arXiv Detail & Related papers (2024-08-26T02:33:47Z)
- Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
- Traj-MAE: Masked Autoencoders for Trajectory Prediction [69.7885837428344]
Trajectory prediction has been a crucial task in building a reliable autonomous driving system by anticipating possible dangers.
We propose an efficient masked autoencoder for trajectory prediction (Traj-MAE) that better represents the complicated behaviors of agents in the driving environment.
Our experimental results in both multi-agent and single-agent settings demonstrate that Traj-MAE achieves competitive results with state-of-the-art methods.
arXiv Detail & Related papers (2023-03-12T16:23:27Z)
- PreTR: Spatio-Temporal Non-Autoregressive Trajectory Prediction Transformer [0.9786690381850356]
We introduce a model called PRediction Transformer (PReTR) that extracts features from the multi-agent scenes by employing a factorized-temporal attention module.
It has lower computational requirements than previously studied models while achieving empirically better results.
We leverage encoder-decoder Transformer networks to decode a set of learned object queries in parallel.
arXiv Detail & Related papers (2022-03-17T12:52:23Z)
- Adaptive Online Incremental Learning for Evolving Data Streams [4.3386084277869505]
The first major difficulty is concept drift, that is, the probability distribution of the streaming data changes as the data arrives.
The second major difficulty is catastrophic forgetting, that is, forgetting what we have learned before when learning new knowledge.
Our research builds on this observation and attempts to overcome these difficulties.
arXiv Detail & Related papers (2022-01-05T14:25:53Z)
- Injecting Knowledge in Data-driven Vehicle Trajectory Predictors [82.91398970736391]
Vehicle trajectory prediction tasks have been commonly tackled from two perspectives: knowledge-driven or data-driven.
In this paper, we propose to learn a "Realistic Residual Block" (RRB) which effectively connects these two perspectives.
Our proposed method outputs realistic predictions by confining the residual range and taking into account its uncertainty.
arXiv Detail & Related papers (2021-03-08T16:03:09Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
- Nonlinear Traffic Prediction as a Matrix Completion Problem with Ensemble Learning [1.8352113484137629]
This paper addresses the problem of short-term traffic prediction for signalized traffic operations management.
We focus on predicting sensor states in high-resolution (second-by-second).
Our contributions can be summarized as offering three insights.
arXiv Detail & Related papers (2020-01-08T13:10:40Z)