Multi-modal anticipation of stochastic trajectories in a dynamic
environment with Conditional Variational Autoencoders
- URL: http://arxiv.org/abs/2103.03912v1
- Date: Fri, 5 Mar 2021 19:38:26 GMT
- Title: Multi-modal anticipation of stochastic trajectories in a dynamic
environment with Conditional Variational Autoencoders
- Authors: Albert Dulian, John C. Murray
- Abstract summary: Short-term motion of nearby vehicles is not strictly limited to a set of single trajectories.
We propose to account for the multi-modality of the problem with the use of a Conditional Variational Autoencoder (C-VAE) conditioned on an agent's past motion as well as a scene encoded with a Capsule Network (CapsNet).
In addition, we demonstrate advantages of employing the Minimum over N (MoN) cost function, which measures the distance between ground truth and N generated samples and tries to minimise the loss with respect to the closest sample, effectively leading to more diverse predictions.
- Score: 0.12183405753834559
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Forecasting short-term motion of nearby vehicles presents an inherently
challenging issue as the space of their possible future movements is not
strictly limited to a set of single trajectories. Recently proposed techniques
that demonstrate plausible results concentrate primarily on forecasting a fixed
number of deterministic predictions, or on classifying over a wide variety of
trajectories that were previously generated using, e.g., a dynamic model. This
paper focuses on addressing the uncertainty associated with the discussed task
by utilising the stochastic nature of generative models in order to produce a
diverse set of plausible paths with regards to tracked vehicles. More
specifically, we propose to account for the multi-modality of the problem with
use of Conditional Variational Autoencoder (C-VAE) conditioned on an agent's
past motion as well as a rasterised scene context encoded with Capsule Network
(CapsNet). In addition, we demonstrate advantages of employing the Minimum over
N (MoN) cost function which measures the distance between ground truth and N
generated samples and tries to minimise the loss with respect to the closest
sample, effectively leading to more diverse predictions. We examine our network
on a publicly available dataset against recent state-of-the-art methods and
show that our approach outperforms these techniques in numerous scenarios
whilst significantly reducing the number of trainable parameters as well as
allowing an arbitrary number of diverse trajectories to be sampled.
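The Minimum over N (MoN) idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance is used, so the average displacement between matching waypoints is an assumption here, and the names `average_displacement` and `min_over_n` are hypothetical, chosen only for this illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]


def average_displacement(pred: Trajectory, truth: Trajectory) -> float:
    """Mean Euclidean distance between matching waypoints (assumed metric)."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def min_over_n(samples: Sequence[Trajectory], truth: Trajectory) -> Tuple[float, int]:
    """Return the error of the sample closest to the ground truth and its index.

    Only the closest of the N samples contributes to the loss, so the
    remaining samples are not penalised for covering other plausible modes.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    errors = [average_displacement(s, truth) for s in samples]
    best = min(range(len(errors)), key=errors.__getitem__)
    return errors[best], best


# Toy example: three sampled futures for a vehicle that actually drives straight.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],    # drifts to one side
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)],    # nearly straight
    [(0.0, 0.0), (1.0, -1.0), (2.0, -2.0)],  # drifts to the other side
]
loss, index = min_over_n(samples, truth)
print(f"MoN loss {loss:.4f} comes from sample {index}")
```

In a training loop, the samples would come from the generative model and the selected error would be backpropagated. This sketch only shows the selection step on fixed toy data.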
Related papers
- Motion Forecasting via Model-Based Risk Minimization [8.766024024417316]
We propose a novel sampling method applicable to trajectory prediction based on the predictions of multiple models.
We first show that conventional sampling based on predicted probabilities can degrade performance due to missing alignment between models.
By using state-of-the-art models as base learners, our approach constructs diverse and effective ensembles for optimal trajectory sampling.
arXiv Detail & Related papers (2024-09-16T09:03:28Z)
- Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting [11.106812447960186]
We introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT).
CDT integrates information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories.
To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left.
arXiv Detail & Related papers (2024-02-06T13:16:54Z)
- DICE: Diverse Diffusion Model with Scoring for Trajectory Prediction [7.346307332191997]
We present a novel framework that leverages diffusion models for predicting future trajectories in a computationally efficient manner.
We employ an efficient sampling mechanism that allows us to maximize the number of sampled trajectories for improved accuracy.
We show the effectiveness of our approach by conducting empirical evaluations on common pedestrian (UCY/ETH) and autonomous driving (nuScenes) benchmark datasets.
arXiv Detail & Related papers (2023-10-23T05:04:23Z)
- DeNoising-MOT: Towards Multiple Object Tracking with Severe Occlusions [52.63323657077447]
We propose DNMOT, an end-to-end trainable DeNoising Transformer for multiple object tracking.
Specifically, we augment the trajectory with noises during training and make our model learn the denoising process in an encoder-decoder architecture.
We conduct extensive experiments on the MOT17, MOT20, and DanceTrack datasets, and the experimental results show that our method outperforms previous state-of-the-art methods by a clear margin.
arXiv Detail & Related papers (2023-09-09T04:40:01Z)
- Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion [88.45326906116165]
We present a new framework to formulate the trajectory prediction task as a reverse process of motion indeterminacy diffusion (MID).
We encode the history behavior information and the social interactions as a state embedding and devise a Transformer-based diffusion model to capture the temporal dependencies of trajectories.
Experiments on the human trajectory prediction benchmarks including the Stanford Drone and ETH/UCY datasets demonstrate the superiority of our method.
arXiv Detail & Related papers (2022-03-25T16:59:08Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
- Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding [46.52703817997932]
Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of the surrounding vehicles and pedestrians.
We propose a model that synthesizes multiple input signals from the multimodal world.
We show a significant performance improvement over previous state-of-the-art methods.
arXiv Detail & Related papers (2020-03-06T13:59:39Z)
- A Multi-Channel Neural Graphical Event Model with Negative Evidence [76.51278722190607]
Event datasets are sequences of events of various types occurring irregularly over the time-line.
We propose a non-parametric deep neural network approach in order to estimate the underlying intensity functions.
arXiv Detail & Related papers (2020-02-21T23:10:50Z)
- MCENET: Multi-Context Encoder Network for Homogeneous Agent Trajectory Prediction in Mixed Traffic [35.22312783822563]
Trajectory prediction in urban mixed-traffic zones is critical for many intelligent transportation systems.
We propose an approach named Multi-Context Encoder Network (MCENET) that is trained by encoding both past and future scene context.
At inference time, we combine the past context and motion information of the target agent with samplings of the latent variables to predict multiple realistic trajectories.
arXiv Detail & Related papers (2020-02-14T11:04:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.