Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent Codes
- URL: http://arxiv.org/abs/2405.20743v2
- Date: Thu, 29 Aug 2024 15:31:58 GMT
- Title: Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent Codes
- Authors: Riccardo Benaglia, Angelo Porrello, Pietro Buzzega, Simone Calderara, Rita Cucchiara
- Abstract summary: Trajectory forecasting is crucial for video surveillance analytics, as it enables the anticipation of future movements for a set of agents.
We leverage Vector Quantized Variational Autoencoders (VQ-VAEs), which utilize a discrete latent space to tackle the issue of posterior collapse.
We show that such a two-fold framework, augmented with instance-level discretization, leads to accurate and diverse forecasts.
- Score: 36.12653178844828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trajectory forecasting is crucial for video surveillance analytics, as it enables the anticipation of future movements for a set of agents, e.g. basketball players engaged in intricate interactions with long-term intentions. Deep generative models offer a natural learning approach for trajectory forecasting, yet they encounter difficulties in achieving an optimal balance between sampling fidelity and diversity. We address this challenge by leveraging Vector Quantized Variational Autoencoders (VQ-VAEs), which utilize a discrete latent space to tackle the issue of posterior collapse. Specifically, we introduce an instance-based codebook that allows tailored latent representations for each example. In a nutshell, the rows of the codebook are dynamically adjusted to reflect contextual information (i.e., past motion patterns extracted from the observed trajectories). In this way, the discretization process gains flexibility, leading to improved reconstructions. Notably, instance-level dynamics are injected into the codebook through low-rank updates, which restrict the customization of the codebook to a lower dimension space. The resulting discrete space serves as the basis of the subsequent step, which regards the training of a diffusion-based predictive model. We show that such a two-fold framework, augmented with instance-level discretization, leads to accurate and diverse forecasts, yielding state-of-the-art performance on three established benchmarks.
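A minimal PyTorch sketch of the instance-level, low-rank codebook adaptation the abstract describes; the class name, shapes, and the context-encoder interface are illustrative assumptions, not the authors' code.
```python
import torch
import torch.nn as nn

class LowRankAdaptedCodebook(nn.Module):
    """Sketch: a shared VQ codebook whose rows are shifted per instance by a
    low-rank update predicted from an encoding of the observed past motion."""

    def __init__(self, num_codes=128, dim=64, rank=4, ctx_dim=32):
        super().__init__()
        self.base = nn.Parameter(torch.randn(num_codes, dim))  # shared codebook C (K x D)
        self.to_u = nn.Linear(ctx_dim, num_codes * rank)       # context -> U (K x r)
        self.to_v = nn.Linear(ctx_dim, rank * dim)             # context -> V (r x D)
        self.rank = rank

    def forward(self, z, ctx):
        # z: (B, T, D) continuous latents; ctx: (B, ctx_dim) past-motion encoding
        B, _, D = z.shape
        U = self.to_u(ctx).view(B, -1, self.rank)              # (B, K, r)
        V = self.to_v(ctx).view(B, self.rank, D)               # (B, r, D)
        codebook = self.base.unsqueeze(0) + U @ V              # per-instance codebook C + UV
        idx = torch.cdist(z, codebook).argmin(-1)              # nearest code per timestep
        z_q = torch.gather(codebook, 1, idx.unsqueeze(-1).expand(-1, -1, D))
        return z + (z_q - z).detach(), idx                     # straight-through estimator
```
The rank r caps the dimension of the per-instance customization, matching the abstract's point that low-rank updates restrict codebook adaptation to a lower-dimensional space.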
Related papers
- Gaussian Mixture Vector Quantization with Aggregated Categorical Posterior [5.862123282894087]
The Vector Quantized Variational Autoencoder (VQ-VAE) is a type of variational autoencoder that uses discrete embeddings as its latent representation.
We show that GM-VQ improves codebook utilization and reduces information loss without relying on handcrafted heuristics.
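A toy sketch of Gaussian-mixture-style quantization with an aggregated categorical posterior; the variable names and the entropy term are illustrative assumptions, not the paper's exact objective.
```python
import torch

def gm_vq_quantize(z, means, log_var):
    # z: (N, D) latents; means: (K, D) mixture means; log_var: (K,) per-component
    # log-variance -- all names/shapes are illustrative assumptions.
    sq = torch.cdist(z, means).pow(2)                      # (N, K) squared distances
    logits = -0.5 * sq / log_var.exp() - 0.5 * log_var * means.size(1)
    post = logits.softmax(-1)                              # categorical posterior q(k | z)
    z_soft = post @ means                                  # differentiable soft code
    usage = post.mean(0)                                   # aggregated posterior over codes
    entropy = -(usage * (usage + 1e-8).log()).sum()        # high entropy = good utilization
    return z_soft, entropy
```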
arXiv Detail & Related papers (2024-10-14T05:58:11Z) - Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion [61.03681839276652]
Diffusion Forcing is a new training paradigm where a diffusion model is trained to denoise a set of tokens with independent per-token noise levels.
We apply Diffusion Forcing to sequence generative modeling by training a causal next-token prediction model to generate one or several future tokens.
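A hedged sketch of the training step described above, with independent per-token noise levels fed to a causal denoiser; the model signature and the linear schedule are assumptions.
```python
import torch

def diffusion_forcing_loss(model, x, num_levels=1000):
    # x: (B, T, D) token sequence; model: causal denoiser taking (x_noisy, k).
    B, T, _ = x.shape
    k = torch.randint(0, num_levels, (B, T))               # independent per-token level
    alpha = 1.0 - k.float() / num_levels                   # toy linear schedule
    eps = torch.randn_like(x)
    x_noisy = alpha.sqrt().unsqueeze(-1) * x + (1 - alpha).sqrt().unsqueeze(-1) * eps
    eps_hat = model(x_noisy, k)                            # denoise all tokens at once
    return ((eps_hat - eps) ** 2).mean()                   # epsilon-prediction loss
```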
arXiv Detail & Related papers (2024-07-01T15:43:25Z) - Variational quantization for state space models [3.9762742923544456]
Forecasting with large datasets gathering thousands of heterogeneous time series is a crucial statistical problem in numerous sectors.
We propose a new forecasting model that combines discrete state space hidden Markov models with recent neural network architectures and training procedures inspired by vector quantized variational autoencoders.
We assess the performance of the proposed method using several datasets and show that it outperforms other state-of-the-art solutions.
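One assumption-level reading of combining discrete state space (HMM-style) dynamics with VQ-style latents, sketched in PyTorch; module names and sizes are hypothetical.
```python
import torch
import torch.nn as nn

class QuantizedStateSpace(nn.Module):
    """Sketch: HMM-style discrete states whose embeddings live in a learned
    codebook, with neural transition and emission heads."""

    def __init__(self, num_states=32, dim=64, obs_dim=1):
        super().__init__()
        self.codebook = nn.Embedding(num_states, dim)      # one embedding per state
        self.transition = nn.Linear(dim, num_states)       # logits of p(s_t | s_{t-1})
        self.emission = nn.Linear(dim, obs_dim)            # mean of p(y_t | s_t)

    def step(self, state_idx):
        e = self.codebook(state_idx)                       # (B, dim)
        nxt = torch.distributions.Categorical(logits=self.transition(e)).sample()
        return nxt, self.emission(self.codebook(nxt))      # next state, predicted obs
```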
arXiv Detail & Related papers (2024-04-17T07:01:41Z) - Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs), each have known weaknesses: GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
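A minimal sketch of the three named components (sequence encoder, cross-attentive denoising decoder, step-wise conditioning); layer choices such as the GRU encoder are illustrative assumptions.
```python
import torch
import torch.nn as nn

class CrossAttentiveDenoiser(nn.Module):
    """Sketch of the three named pieces: sequence encoder, cross-attentive
    denoising decoder, and step-wise (timestep) conditioning."""

    def __init__(self, dim=64, heads=4, max_steps=1000):
        super().__init__()
        self.encoder = nn.GRU(dim, dim, batch_first=True)             # sequence encoder
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t_embed = nn.Embedding(max_steps, dim)                   # diffusion step
        self.out = nn.Linear(dim, dim)

    def forward(self, x_noisy, history, t):
        # x_noisy: (B, 1, dim) noised target; history: (B, T, dim); t: (B,)
        ctx, _ = self.encoder(history)                                # encode interactions
        q = x_noisy + self.t_embed(t).unsqueeze(1)                    # step-wise condition
        h, _ = self.cross_attn(q, ctx, ctx)                           # attend over history
        return self.out(h)                                            # predicted noise
```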
arXiv Detail & Related papers (2023-04-22T15:32:59Z) - Regularized Vector Quantization for Tokenized Image Synthesis [126.96880843754066]
Quantizing images into discrete representations has been a fundamental problem in unified generative modeling.
Deterministic quantization suffers from severe codebook collapse and misalignment with the inference stage, while stochastic quantization suffers from low codebook utilization and a perturbed reconstruction objective.
This paper presents a regularized vector quantization framework that mitigates the above issues effectively by applying regularization from two perspectives.
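One illustrative way to regularize stochastic quantization (Gumbel-softmax assignment plus a uniform-usage penalty); a sketch of the general idea, not the paper's exact losses.
```python
import torch
import torch.nn.functional as F

def regularized_quantize(z, codebook, tau=1.0, weight=0.1):
    # z: (N, D), codebook: (K, D). Stochastic assignment avoids the deterministic
    # collapse failure mode; the usage penalty fights low codebook utilization.
    logits = -torch.cdist(z, codebook).pow(2)              # affinity to each code
    assign = F.gumbel_softmax(logits, tau=tau, hard=True)  # stochastic one-hot choice
    z_q = assign @ codebook
    usage = assign.mean(0)                                 # marginal code usage
    uniform = torch.full_like(usage, 1.0 / usage.numel())
    reg = F.kl_div((usage + 1e-8).log(), uniform, reduction="sum")
    return z_q, weight * reg
```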
arXiv Detail & Related papers (2023-03-11T15:20:54Z) - Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization [76.68866368409216]
We propose learning to dynamically select discretization tightness conditioned on inputs.
We show that dynamically varying tightness in communication bottlenecks can improve model performance on visual reasoning and reinforcement learning tasks.
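A sketch of input-conditioned discretization tightness via gating among codebooks of different sizes; an illustrative reading, not the paper's architecture.
```python
import torch
import torch.nn as nn

class DynamicVQ(nn.Module):
    """Sketch: choose discretization tightness per input by gating among
    codebooks of different sizes (coarse to fine)."""

    def __init__(self, dim=64, sizes=(8, 32, 128)):
        super().__init__()
        self.books = nn.ParameterList([nn.Parameter(torch.randn(k, dim)) for k in sizes])
        self.gate = nn.Linear(dim, len(sizes))             # input-conditioned selection

    def forward(self, z):                                  # z: (N, dim)
        choice = self.gate(z.mean(0)).argmax().item()      # pick a codebook granularity
        book = self.books[choice]
        idx = torch.cdist(z, book).argmin(-1)              # nearest code in that book
        z_q = book[idx]
        return z + (z_q - z).detach(), idx, choice         # straight-through
```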
arXiv Detail & Related papers (2022-02-02T23:54:26Z) - Differentiable Generalised Predictive Coding [2.868176771215219]
This paper deals with differentiable dynamical models congruent with neural process theories that cast brain function as the hierarchical refinement of an internal generative model explaining observations.
Our work extends existing implementations of gradient-based predictive coding and makes it possible to integrate deep neural networks for non-linear state parameterization.
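The core loop of gradient-based predictive coding, sketched under the assumption of an arbitrary differentiable decoder: the latent state is iteratively refined to minimize prediction error.
```python
import torch

def predictive_coding_infer(decoder, obs, latent_dim=16, steps=50, lr=0.1):
    # decoder: any differentiable module mapping latent -> predicted observation.
    mu = torch.zeros(obs.size(0), latent_dim, requires_grad=True)  # initial belief
    opt = torch.optim.SGD([mu], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        err = (decoder(mu) - obs).pow(2).sum()             # prediction error
        err.backward()
        opt.step()                                         # refine the internal state
    return mu.detach()
```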
arXiv Detail & Related papers (2021-12-02T22:02:56Z) - Contrastive Self-supervised Sequential Recommendation with Robust Augmentation [101.25762166231904]
Sequential recommendation describes a set of techniques to model dynamic user behavior in order to predict future interactions in sequential user data.
Old and new issues remain, including data sparsity and noisy data.
We propose Contrastive Self-Supervised Learning for sequential Recommendation (CoSeRec).
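The standard contrastive (InfoNCE) objective between two augmented views of the same sequence, which is the general mechanism behind contrastive sequential recommenders; the specific robust augmentations are out of scope here.
```python
import torch
import torch.nn.functional as F

def info_nce(h1, h2, temperature=0.1):
    # h1, h2: (B, dim) encodings of two augmented views of the same sequences.
    z1, z2 = F.normalize(h1, dim=-1), F.normalize(h2, dim=-1)
    logits = z1 @ z2.t() / temperature                     # (B, B) similarities
    labels = torch.arange(z1.size(0), device=z1.device)    # positives on the diagonal
    return F.cross_entropy(logits, labels)
```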
arXiv Detail & Related papers (2021-08-14T07:15:25Z) - Learning Consistent Deep Generative Models from Sparse Data via Prediction Constraints [16.48824312904122]
We develop a new framework for learning variational autoencoders and other deep generative models.
We show that these two contributions -- prediction constraints and consistency constraints -- lead to promising image classification performance.
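A hedged sketch of a VAE loss augmented with a prediction constraint; the encoder/decoder/classifier interfaces and the weight lam are assumptions for illustration.
```python
import torch
import torch.nn.functional as F

def pc_vae_loss(x, y, encoder, decoder, classifier, lam=5.0):
    # encoder(x) -> (mu, log_var); decoder and classifier act on the latent z.
    mu, log_var = encoder(x)
    z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterization trick
    recon = F.mse_loss(decoder(z), x)                      # ELBO reconstruction term
    kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    pred = F.cross_entropy(classifier(z), y)               # prediction constraint
    return recon + kl + lam * pred                         # lam weights the constraint
```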
arXiv Detail & Related papers (2020-12-12T04:18:50Z) - Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z) - Elastic Consistency: A General Consistency Model for Distributed Stochastic Gradient Descent [28.006781039853575]
A key element behind the progress of machine learning in recent years has been the ability to train machine learning models in large-scale distributed-memory environments.
In this paper, we introduce a general consistency condition covering the distributed SGD variants used in practice to train large-scale machine learning models.
Our framework, called elastic consistency, enables us to derive convergence bounds for a variety of distributed SGD methods.
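A toy simulation of SGD under a bounded-inconsistency ("elastic") condition on the parameter view each gradient is taken at; grad_fn and the bound are hypothetical.
```python
import torch

def elastic_sgd(grad_fn, x0, steps=100, lr=0.1, bound=0.01):
    # grad_fn: gradient oracle; each step evaluates it at a perturbed "view"
    # whose distance from the true iterate stays within `bound`.
    x = x0.clone()
    for _ in range(steps):
        noise = torch.randn_like(x)
        view = x + bound * noise / noise.norm().clamp(min=1e-12)  # bounded inconsistency
        x = x - lr * grad_fn(view)                                # descend on stale view
    return x

# e.g., elastic_sgd(lambda v: 2 * v, torch.ones(3)) minimizes f(x) = ||x||^2
```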
arXiv Detail & Related papers (2020-01-16T16:10:58Z)