Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with
Masked Autoencoders
- URL: http://arxiv.org/abs/2308.09882v1
- Date: Sat, 19 Aug 2023 02:27:51 GMT
- Title: Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with
Masked Autoencoders
- Authors: Jie Cheng, Xiaodong Mei and Ming Liu
- Abstract summary: This study explores the application of self-supervised learning to the task of motion forecasting.
Forecast-MAE is an extension of the masked autoencoders framework that is specifically designed for self-supervised learning of the motion forecasting task.
- Score: 7.133110402648305
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study explores the application of self-supervised learning (SSL) to the
task of motion forecasting, an area that has not yet been extensively
investigated despite the widespread success of SSL in computer vision and
natural language processing. To address this gap, we introduce Forecast-MAE, an
extension of the masked autoencoders framework that is specifically designed for
self-supervised learning of the motion forecasting task. Our approach includes
a novel masking strategy that leverages the strong interconnections between
agents' trajectories and road networks, involving complementary masking of
agents' future or history trajectories and random masking of lane segments. Our
experiments on the challenging Argoverse 2 motion forecasting benchmark show
that Forecast-MAE, which utilizes standard Transformer blocks with minimal
inductive bias, achieves competitive performance compared to state-of-the-art
methods that rely on supervised learning and sophisticated designs. Moreover,
it outperforms the previous self-supervised learning method by a significant
margin. Code is available at https://github.com/jchengai/forecast-mae.
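The masking strategy described in the abstract lends itself to a short sketch. The snippet below is a minimal illustration only, not the authors' code (see the linked repository for the actual implementation); the function names, the 50/50 future/history split, and the lane mask ratio are illustrative assumptions.

```python
import torch

def complementary_agent_mask(num_agents: int, p_future: float = 0.5):
    """Complementary masking: each agent has EITHER its future OR its
    history trajectory hidden from the encoder (True = masked)."""
    mask_future = torch.rand(num_agents) < p_future
    mask_history = ~mask_future  # complement: exactly one side is masked
    return mask_history, mask_future

def random_lane_mask(num_lanes: int, mask_ratio: float = 0.5):
    """Random masking of lane-segment tokens at a fixed ratio."""
    num_masked = int(num_lanes * mask_ratio)
    mask = torch.zeros(num_lanes, dtype=torch.bool)
    mask[torch.randperm(num_lanes)[:num_masked]] = True
    return mask

# During pre-training, masked tokens are dropped from the encoder input
# and a lightweight decoder reconstructs them, as in standard MAE.
hist_mask, fut_mask = complementary_agent_mask(num_agents=32)
lane_mask = random_lane_mask(num_lanes=128, mask_ratio=0.5)
```

Because exactly one of the two trajectory masks is active per agent, the model must use the visible half of each trajectory, together with the partially masked map, to reconstruct the hidden half, which is how the pretext task exploits the interconnection between trajectories and road networks.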
Related papers
- Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models [12.687494201105066]
This paper proposes Traj-LLM, the first to investigate the potential of using Large Language Models (LLMs) to generate future motion from agents' past/observed trajectories and scene semantics.
LLMs' powerful comprehension abilities capture a spectrum of high-level scene knowledge and interactive information.
Emulating the human-like lane-focus cognitive function, Traj-LLM introduces lane-aware probabilistic learning powered by the Mamba module.
arXiv Detail & Related papers (2024-05-08T09:28:04Z) - GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks [24.323017830938394]
This work aims to address challenges by introducing a pre-training framework that seamlessly integrates with baselines and enhances their performance.
The framework is built upon two key designs: (i) a spatio-temporal mask autoencoder serves as a pre-training model for learning spatio-temporal dependencies.
Its modules are specifically designed to capture customized spatio-temporal representations and intra- and inter-cluster semantic relationships.
arXiv Detail & Related papers (2023-11-07T02:36:24Z) - Understanding Masked Autoencoders From a Local Contrastive Perspective [80.57196495601826]
Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies.
We introduce a new empirical framework, called Local Contrastive MAE, to analyze both reconstructive and contrastive aspects of MAE.
arXiv Detail & Related papers (2023-10-03T12:08:15Z) - Traj-MAE: Masked Autoencoders for Trajectory Prediction [69.7885837428344]
Trajectory prediction is a crucial task in building a reliable autonomous driving system, as it anticipates possible dangers.
We propose an efficient masked autoencoder for trajectory prediction (Traj-MAE) that better represents the complicated behaviors of agents in the driving environment.
Our experimental results in both multi-agent and single-agent settings demonstrate that Traj-MAE achieves competitive results with state-of-the-art methods.
arXiv Detail & Related papers (2023-03-12T16:23:27Z) - Masked Autoencoding for Scalable and Generalizable Decision Making [93.84855114717062]
MaskDP is a simple and scalable self-supervised pretraining method for reinforcement learning and behavioral cloning.
We find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single and multiple goal reaching.
arXiv Detail & Related papers (2022-11-23T07:04:41Z) - Exploring The Role of Mean Teachers in Self-supervised Masked
Auto-Encoders [64.03000385267339]
Masked image modeling (MIM) has become a popular strategy for self-supervised learning (SSL) of visual representations with Vision Transformers.
We present a simple SSL method, the Reconstruction-Consistent Masked Auto-Encoder (RC-MAE), which adds an EMA teacher to MAE (see the EMA-update sketch after this list).
RC-MAE converges faster and requires less memory usage than state-of-the-art self-distillation methods during pre-training.
arXiv Detail & Related papers (2022-10-05T08:08:55Z) - A Survey on Masked Autoencoder for Self-supervised Learning in Vision
and Beyond [64.85076239939336]
Self-supervised learning (SSL) in vision may follow a trajectory similar to that in NLP, where generative pretext tasks with masked prediction (e.g., BERT) have become a de facto standard SSL practice.
The success of masked image modeling has revived the masked autoencoder.
arXiv Detail & Related papers (2022-07-30T09:59:28Z) - SSL-Lanes: Self-Supervised Learning for Motion Forecasting in Autonomous
Driving [9.702784248870522]
Self-supervised learning (SSL) is an emerging technique for training convolutional neural networks (CNNs) and graph neural networks (GNNs).
In this study, we report the first systematic exploration of incorporating self-supervision into motion forecasting.
arXiv Detail & Related papers (2022-06-28T16:23:25Z) - Bootstrap Motion Forecasting With Self-Consistent Constraints [52.88100002373369]
We present a novel framework to bootstrap Motion forecasting with Self-consistent Constraints.
The motion forecasting task aims at predicting future trajectories of vehicles by incorporating spatial and temporal information from the past.
We show that our proposed scheme consistently improves the prediction performance of several existing methods.
arXiv Detail & Related papers (2022-04-12T14:59:48Z)
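As a companion to the RC-MAE entry above, the following is a generic exponential-moving-average (EMA) teacher update, the mechanism that entry refers to. This is a minimal sketch of the standard EMA rule, not RC-MAE's code; the momentum value and the training-loop placement are assumptions.

```python
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               momentum: float = 0.999):
    """Move each teacher parameter a small step toward the student:
    teacher <- momentum * teacher + (1 - momentum) * student."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)

# Typical usage (the teacher begins as a frozen copy of the student,
# e.g. via copy.deepcopy, and is nudged toward it after each step):
#   optimizer.step()
#   ema_update(teacher, student)
```

The teacher receives no gradients; it is a slowly moving average of the student, which is what makes its reconstructions a stable consistency target during pre-training.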
This list is automatically generated from the titles and abstracts of the papers in this site.