MUSE-VAE: Multi-Scale VAE for Environment-Aware Long Term Trajectory
Prediction
- URL: http://arxiv.org/abs/2201.07189v1
- Date: Tue, 18 Jan 2022 18:40:03 GMT
- Title: MUSE-VAE: Multi-Scale VAE for Environment-Aware Long Term Trajectory
Prediction
- Authors: Mihee Lee, Samuel S. Sohn, Seonghyeon Moon, Sejong Yoon, Mubbasir
Kapadia, Vladimir Pavlovic
- Abstract summary: MUSE offers diverse and simultaneously more accurate predictions compared to the current state-of-the-art.
We demonstrate these assertions through a comprehensive set of experiments on nuScenes and SDD benchmarks as well as PFSD, a new synthetic dataset.
- Score: 28.438787700968703
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurate long-term trajectory prediction in complex scenes, where multiple
agents (e.g., pedestrians or vehicles) interact with each other and the
environment while attempting to accomplish diverse and often unknown goals, is
a challenging stochastic forecasting problem. In this work, we propose MUSE, a
new probabilistic modeling framework based on a cascade of Conditional VAEs,
which tackles the long-term, uncertain trajectory prediction task using a
coarse-to-fine multi-factor forecasting architecture. In its Macro stage, the
model learns a joint pixel-space representation of two key factors, the
underlying environment and the agent movements, to predict the long and
short-term motion goals. Conditioned on them, the Micro stage learns a
fine-grained spatio-temporal representation for the prediction of individual
agent trajectories. The VAE backbones across the two stages make it possible to
naturally account for the joint uncertainty at both levels of granularity. As a
result, MUSE offers diverse and simultaneously more accurate predictions
compared to the current state-of-the-art. We demonstrate these assertions
through a comprehensive set of experiments on nuScenes and SDD benchmarks as
well as PFSD, a new synthetic dataset, which challenges the forecasting ability
of models on complex agent-environment interaction scenarios.
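The coarse-to-fine cascade described in the abstract can be illustrated with a toy sketch. This is not the authors' model: the learned Conditional VAEs are replaced by a hand-made pixel-space score map (standing in for the Macro stage) and Gaussian noise (standing in for the Micro-stage latent variable). The function names `sample_goal`, `sample_trajectory`, and `predict` are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_goal(heatmap, rng):
    """Macro-stage stand-in: draw one (row, col) goal from a pixel-space score map."""
    cells = [(r, c) for r, row in enumerate(heatmap) for c, _ in enumerate(row)]
    weights = [heatmap[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]


def sample_trajectory(start, goal, steps, rng, noise_scale=0.5):
    """Micro-stage stand-in: a path from start to goal, perturbed by latent noise."""
    path = []
    for i in range(1, steps + 1):
        t = i / steps
        # The noise envelope vanishes at the last step so the path ends on the goal.
        envelope = math.sin(math.pi * t) if i < steps else 0.0
        point = tuple(
            s + t * (g - s) + noise_scale * envelope * rng.gauss(0.0, 1.0)
            for s, g in zip(start, goal)
        )
        path.append(point)
    return path


def predict(start, heatmap, steps=12, k=5, seed=0):
    """Draw k trajectories: sample a goal first, then a path conditioned on it."""
    rng = random.Random(seed)
    return [
        sample_trajectory(start, sample_goal(heatmap, rng), steps, rng)
        for _ in range(k)
    ]


if __name__ == "__main__":
    # A two-mode score map: the agent may head to either of two goal cells.
    heatmap = [[0.0] * 8 for _ in range(8)]
    heatmap[1][6] = 0.6
    heatmap[7][2] = 0.4
    for traj in predict((4.0, 4.0), heatmap):
        print("endpoint:", traj[-1])
```

Sampling the goal before the path is what lets the set of predictions cover several distinct destinations, while the per-step noise adds variation among paths that share a goal.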
Related papers
- Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning [41.09061877498741]
We propose an interaction-aware trajectory-conditioned long-term multi-agent human pose forecasting model.
Our model effectively handles the multi-modality of human motion and the complexity of long-term multi-agent interactions.
arXiv Detail & Related papers (2024-04-08T06:15:13Z)
- AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving [59.94343412438211]
We introduce GPT-style next-token prediction into motion prediction.
Unlike language data, which is composed of homogeneous units (words), the elements in a driving scene can have complex spatial-temporal and semantic relations.
We propose to adopt three factorized attention modules with different neighbors for information aggregation and different position encoding styles to capture their relations.
arXiv Detail & Related papers (2024-03-20T06:22:37Z)
- A Hierarchical Hybrid Learning Framework for Multi-agent Trajectory Prediction [4.181632607997678]
We propose a hierarchical hybrid framework of deep learning (DL) and reinforcement learning (RL) for multi-agent trajectory prediction.
In the DL stage, the traffic scene is divided into multiple intermediate-scale heterogeneous graphs, based on which Transformer-style GNNs are adopted to encode heterogeneous interactions.
In the RL stage, we divide the traffic scene into local sub-scenes utilizing the key future points predicted in the DL stage.
arXiv Detail & Related papers (2023-03-22T02:47:42Z)
- Multi-modal anticipation of stochastic trajectories in a dynamic environment with Conditional Variational Autoencoders [0.12183405753834559]
Short-term motion of nearby vehicles is not strictly limited to a set of single trajectories.
We propose to account for the multi-modality of the problem with the use of a Conditional Variational Autoencoder (C-VAE) conditioned on an agent's past motion as well as a scene encoded with a Capsule Network (CapsNet).
In addition, we demonstrate the advantages of employing the Minimum over N loss, which minimises the loss with respect to the closest of N generated samples, effectively leading to more diverse predictions.
arXiv Detail & Related papers (2021-03-05T19:38:26Z)
- From Goals, Waypoints & Paths To Long Term Human Trajectory Forecasting [54.273455592965355]
Uncertainty in future trajectories stems from two sources: (a) sources known to the agent but unknown to the model, such as long term goals, and (b) sources that are unknown to both the agent & the model, such as intent of other agents & irreducible randomness in decisions.
We model the epistemic uncertainty through multimodality in long term goals and the aleatoric uncertainty through multimodality in waypoints & paths.
To exemplify this dichotomy, we also propose a novel long term trajectory forecasting setting, with prediction horizons up to a minute, an order of magnitude longer than prior works.
arXiv Detail & Related papers (2020-12-02T21:01:29Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
- SSP: Single Shot Future Trajectory Prediction [26.18589883075203]
We propose a robust solution for future trajectory forecasting that is practically applicable to autonomous agents in highly crowded environments.
First, we use composite fields to predict future locations of all road agents in a single shot, which results in constant time.
Second, interactions between agents are modeled as a non-local response, enabling spatial relationships between different locations to be captured temporally.
Third, the semantic context of the scene is modeled to take into account the environmental constraints that potentially influence future motion.
arXiv Detail & Related papers (2020-04-13T09:56:38Z)
- A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we construct a joint feature sequence based on the sequence and instant state information so that the generated trajectories preserve spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
- Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.