Related papers: A Mixture of Experts Approach to 3D Human Motion Prediction

A Mixture of Experts Approach to 3D Human Motion Prediction

URL: http://arxiv.org/abs/2405.06088v1
Date: Thu, 9 May 2024 20:26:58 GMT
Title: A Mixture of Experts Approach to 3D Human Motion Prediction
Authors: Edmund Shieh, Joshua Lee Franco, Kang Min Bae, Tej Lalvani,
Abstract summary: This project addresses the challenge of human motion prediction, a critical area for applications such as au- tonomous vehicle movement detection. Our primary objective is to critically evaluate existing model ar-tectures, identifying their advantages and opportunities for improvement. The particular variation that is used is Soft MoE, a fully-differentiable sparse Transformer that has shown promising ability to enable larger model capacity at lower inference cost.
Score: 1.4974445469089412
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This project addresses the challenge of human motion prediction, a critical area for applications such as au- tonomous vehicle movement detection. Previous works have emphasized the need for low inference times to provide real time performance for applications like these. Our primary objective is to critically evaluate existing model ar- chitectures, identifying their advantages and opportunities for improvement by replicating the state-of-the-art (SOTA) Spatio-Temporal Transformer model as best as possible given computational con- straints. These models have surpassed the limitations of RNN-based models and have demonstrated the ability to generate plausible motion sequences over both short and long term horizons through the use of spatio-temporal rep- resentations. We also propose a novel architecture to ad- dress challenges of real time inference speed by incorpo- rating a Mixture of Experts (MoE) block within the Spatial- Temporal (ST) attention layer. The particular variation that is used is Soft MoE, a fully-differentiable sparse Transformer that has shown promising ability to enable larger model capacity at lower inference cost. We make out code publicly available at https://github.com/edshieh/motionprediction

Related papers

Robust Traffic Forecasting against Spatial Shift over Years [11.208740750755025]
We investigate state-temporal-the-art models using newly proposed traffic OOD benchmarks. We find that these models experience significant decline in performance. We propose a novel of Mixture Experts framework, which learns a set of graph generators during training and combines them to generate new graphs. Our method is both parsimonious and efficacious, and can be seamlessly integrated into anytemporal model.
arXiv Detail & Related papers (2024-10-01T03:49:29Z)
MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T Tracking [51.28485682954006]
We propose a pure Mamba-based framework (MambaVT) to fully exploit intrinsic-temporal contextual modeling for robust visible-thermal tracking. Specifically, we devise the long-range cross-frame integration component to globally adapt to target appearance variations. Experiments show the significant potential of vision Mamba for RGB-T tracking, with MambaVT achieving state-of-the-art performance on four mainstream benchmarks.
arXiv Detail & Related papers (2024-08-15T02:29:00Z)
Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs. Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction. We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
A conditional latent autoregressive recurrent model for generation and forecasting of beam dynamics in particle accelerators [46.348283638884425]
We propose a two-step unsupervised deep learning framework named as Latent Autoregressive Recurrent Model (CLARM) for learning dynamics of charged particles in accelerators. The CLARM can generate projections at various accelerator sampling modules by capturing and decoding the latent space representation. The results demonstrate that the generative and forecasting ability of the proposed approach is promising when tested against a variety of evaluation metrics.
arXiv Detail & Related papers (2024-03-19T22:05:17Z)
Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies. Our approach has led to significant improvements in forecasting precision, culminating in our model securing textit1st place in the transfer learning leaderboard of the textitWeather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z)
Koopman Invertible Autoencoder: Leveraging Forward and Backward Dynamics for Temporal Modeling [13.38194491846739]
We propose a novel machine learning model based on Koopman operator theory, which we call Koopman Invertible Autoencoders (KIA) KIA captures the inherent characteristic of the system by modeling both forward and backward dynamics in the infinite-dimensional Hilbert space. This enables us to efficiently learn low-dimensional representations, resulting in more accurate predictions of long-term system behavior.
arXiv Detail & Related papers (2023-09-19T03:42:55Z)
OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models. We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectory, human motion, driving scenes, traffic flow and forecasting weather. We find that recurrent-free models achieve a good balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction [60.60223171143206]
Trajectory prediction is a crucial undertaking in understanding entity movement or human behavior from observed sequences. Current methods often assume that the observed sequences are complete while ignoring the potential for missing values. This paper presents a unified framework, the Graph-based Conditional Variational Recurrent Neural Network (GC-VRNN), which can perform trajectory imputation and prediction simultaneously.
arXiv Detail & Related papers (2023-03-28T14:27:27Z)
SPOTR: Spatio-temporal Pose Transformers for Human Motion Prediction [12.248428883804763]
3D human motion prediction is a research area computation of high significance and a challenge in computer vision. Traditionally, autogregressive models have been used to predict human motion. We present a non-autoregressive model for human motion prediction.
arXiv Detail & Related papers (2023-03-11T01:44:29Z)
Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series. Our model parameterizes mean and variance for each time-stamp with flexible neural networks. We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
Motion Prediction Using Temporal Inception Module [96.76721173517895]
We propose a Temporal Inception Module (TIM) to encode human motion. Our framework produces input embeddings using convolutional layers, by using different kernel sizes for different input lengths. The experimental results on standard motion prediction benchmark datasets Human3.6M and CMU motion capture dataset show that our approach consistently outperforms the state of the art methods.
arXiv Detail & Related papers (2020-10-06T20:26:01Z)
A Spatio-temporal Transformer for 3D Human Motion Prediction [39.31212055504893]
We propose a Transformer-based architecture for the task of generative modelling of 3D human motion. We empirically show that this effectively learns the underlying motion dynamics and reduces error accumulation over time observed in auto-gressive models.
arXiv Detail & Related papers (2020-04-18T19:49:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.