MUGL: Large Scale Multi Person Conditional Action Generation with
Locomotion
- URL: http://arxiv.org/abs/2110.11460v1
- Date: Thu, 21 Oct 2021 20:11:53 GMT
- Title: MUGL: Large Scale Multi Person Conditional Action Generation with
Locomotion
- Authors: Shubh Maheshwari, Debtanu Gupta, Ravi Kiran Sarvadevabhatla
- Abstract summary: MUGL is a novel deep neural model for large-scale, diverse generation of single and multi-person pose-based action sequences with locomotion.
Our controllable approach enables variable-length generations customizable by action category, across more than 100 categories.
- Score: 9.30315673109153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce MUGL, a novel deep neural model for large-scale, diverse
generation of single and multi-person pose-based action sequences with
locomotion. Our controllable approach enables variable-length generations
customizable by action category, across more than 100 categories. To enable
intra/inter-category diversity, we model the latent generative space using a
Conditional Gaussian Mixture Variational Autoencoder. To enable realistic
generation of actions involving locomotion, we decouple local pose and global
trajectory components of the action sequence. We incorporate duration-aware
feature representations to enable variable-length sequence generation. We use a
hybrid pose sequence representation with 3D pose sequences sourced from videos
and 3D Kinect-based sequences of NTU-RGBD-120. To enable principled comparison
of generation quality, we employ suitably modified strong baselines during
evaluation. Although smaller and simpler than the baselines, MUGL produces
better-quality generations, paving the way for practical and controllable
large-scale human action generation.
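The abstract describes two core design choices: a Conditional Gaussian Mixture VAE latent space conditioned on the action category, and a decoder that separates local pose from global root trajectory. The paper's own code is not reproduced here; the snippet below is a minimal PyTorch sketch of how such a conditional mixture prior and a decoupled pose/trajectory decoder could be wired together. All module names, layer sizes, joint counts, and the sampling procedure are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a conditional Gaussian-mixture latent prior with
# decoupled pose/trajectory decoders, loosely following the abstract.
# Shapes, layer sizes, and module names are assumptions.
import torch
import torch.nn as nn

NUM_ACTIONS = 120      # assumed: the paper reports >100 action categories
LATENT_DIM = 32        # assumed latent size
NUM_MIX = 10           # assumed number of Gaussian mixture components

class ConditionalGMPrior(nn.Module):
    """Maps an action-category label to mixture weights and per-component
    Gaussian parameters over the latent space."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(NUM_ACTIONS, 64)
        self.to_logits = nn.Linear(64, NUM_MIX)
        self.to_mu = nn.Linear(64, NUM_MIX * LATENT_DIM)
        self.to_logvar = nn.Linear(64, NUM_MIX * LATENT_DIM)

    def sample(self, action_ids):
        h = self.embed(action_ids)                              # (B, 64)
        weights = torch.softmax(self.to_logits(h), dim=-1)      # (B, K)
        mu = self.to_mu(h).view(-1, NUM_MIX, LATENT_DIM)
        std = (0.5 * self.to_logvar(h)).exp().view(-1, NUM_MIX, LATENT_DIM)
        comp = torch.multinomial(weights, 1).squeeze(-1)        # pick one component per sample
        idx = comp[:, None, None].expand(-1, 1, LATENT_DIM)
        mu_c = mu.gather(1, idx).squeeze(1)
        std_c = std.gather(1, idx).squeeze(1)
        return mu_c + std_c * torch.randn_like(std_c)           # reparameterized draw

class DecoupledDecoder(nn.Module):
    """Decodes a latent code into local per-joint poses and a separate
    global root trajectory, then composes them into a world-space sequence."""
    def __init__(self, num_joints=24, seq_len=64):
        super().__init__()
        self.seq_len = seq_len
        self.num_joints = num_joints
        self.pose_head = nn.Linear(LATENT_DIM, seq_len * num_joints * 3)
        self.traj_head = nn.Linear(LATENT_DIM, seq_len * 3)

    def forward(self, z):
        local_pose = self.pose_head(z).view(-1, self.seq_len, self.num_joints, 3)
        trajectory = self.traj_head(z).view(-1, self.seq_len, 1, 3)
        return local_pose + trajectory                          # add root translation back

# Usage: sample sequences for two (arbitrary) action categories.
prior, decoder = ConditionalGMPrior(), DecoupledDecoder()
actions = torch.tensor([7, 42])
z = prior.sample(actions)
sequence = decoder(z)                                           # (2, 64, 24, 3) joint positions
```

In this sketch the action label selects a mixture component, giving intra-category diversity through the per-component Gaussian and inter-category separation through the component weights; adding the root trajectory back onto the local poses mirrors the pose/trajectory decoupling described in the abstract. Duration-aware features for variable-length generation are omitted here for brevity.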
Related papers
- Diffusion Transformer Policy [48.50988753948537]
Diffusion Transformer Policy pretrained on diverse robot data can generalize to different embodiments.
The proposed approach achieves state-of-the-art performance with only a single third-view camera stream in the Calvin novel task setting.
arXiv Detail & Related papers (2024-10-21T12:43:54Z) - Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z) - Hierarchical Generation of Human-Object Interactions with Diffusion
Probabilistic Models [71.64318025625833]
This paper presents a novel approach to generating the 3D motion of a human interacting with a target object.
Our framework first generates a set of milestones and then synthesizes the motion along them.
The experiments on the NSM, COUCH, and SAMP datasets show that our approach outperforms previous methods by a large margin in both quality and diversity.
arXiv Detail & Related papers (2023-10-03T17:50:23Z) - Example-based Motion Synthesis via Generative Motion Matching [44.20519633463265]
We present GenMM, a generative model that "mines" as many diverse motions as possible from a single or few example sequences.
GenMM inherits the training-free nature and the superior quality of the well-known Motion Matching method.
arXiv Detail & Related papers (2023-06-01T06:19:33Z) - Action-conditioned On-demand Motion Generation [11.45641608124365]
We propose a novel framework, On-Demand MOtion Generation (ODMO), for generating realistic and diverse long-term 3D human motion sequences.
ODMO shows improvements over SOTA approaches on all traditional motion evaluation metrics when evaluated on three public datasets.
arXiv Detail & Related papers (2022-07-17T13:04:44Z) - HiT-DVAE: Human Motion Generation via Hierarchical Transformer Dynamical
VAE [37.23381308240617]
We propose Hierarchical Transformer Dynamical Variational Autoencoder, HiT-DVAE, which implements auto-regressive generation with transformer-like attention mechanisms.
We evaluate the proposed method on HumanEva-I and Human3.6M with various evaluation methods, and outperform the state-of-the-art methods on most of the metrics.
arXiv Detail & Related papers (2022-04-04T15:12:34Z) - ActFormer: A GAN Transformer Framework towards General
Action-Conditioned 3D Human Motion Generation [16.1094669439815]
We present a GAN Transformer framework for general action-conditioned 3D human motion generation.
Our approach consists of a powerful Action-conditioned transFormer (ActFormer) under a GAN training scheme.
ActFormer can be naturally extended to multi-person motions by alternately modeling temporal correlations and human interactions with Transformer encoders.
arXiv Detail & Related papers (2022-03-15T07:50:12Z) - AniFormer: Data-driven 3D Animation with Transformer [95.45760189583181]
We present a novel task, i.e., animating a target 3D object through the motion of a raw driving sequence.
AniFormer generates animated 3D sequences by directly taking the raw driving sequences and arbitrary same-type target meshes as inputs.
Our AniFormer achieves high-fidelity, realistic, temporally coherent animated results and outperforms the compared state-of-the-art methods on benchmarks of diverse categories.
arXiv Detail & Related papers (2021-10-20T12:36:55Z) - Generating Smooth Pose Sequences for Diverse Human Motion Prediction [90.45823619796674]
We introduce a unified deep generative network for both diverse and controllable motion prediction.
Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy.
arXiv Detail & Related papers (2021-08-19T00:58:00Z) - STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters, and is fast in both training and inference.
arXiv Detail & Related papers (2021-07-15T02:53:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.