AAMDM: Accelerated Auto-regressive Motion Diffusion Model
- URL: http://arxiv.org/abs/2401.06146v1
- Date: Sat, 2 Dec 2023 23:52:21 GMT
- Title: AAMDM: Accelerated Auto-regressive Motion Diffusion Model
- Authors: Tianyu Li, Calvin Qiao, Guanqiao Ren, KangKang Yin, Sehoon Ha
- Abstract summary: This paper introduces the Accelerated Auto-regressive Motion Diffusion Model (AAMDM)
AAMDM is a novel motion synthesis framework designed to achieve quality, diversity, and efficiency all together.
We show that AAMDM outperforms existing methods in motion quality, diversity, and runtime efficiency.
- Score: 10.94879097495769
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Interactive motion synthesis is essential in creating immersive experiences
in entertainment applications, such as video games and virtual reality.
However, generating animations that are both high-quality and contextually
responsive remains a challenge. Traditional techniques in the game industry can
produce high-fidelity animations but suffer from high computational costs and
poor scalability. Trained neural network models alleviate the memory and speed
issues, yet fall short on generating diverse motions. Diffusion models offer
diverse motion synthesis with low memory usage, but require expensive reverse
diffusion processes. This paper introduces the Accelerated Auto-regressive
Motion Diffusion Model (AAMDM), a novel motion synthesis framework designed to
achieve quality, diversity, and efficiency all together. AAMDM integrates
Denoising Diffusion GANs as a fast Generation Module, and an Auto-regressive
Diffusion Model as a Polishing Module. Furthermore, AAMDM operates in a
lower-dimensional embedded space rather than the full-dimensional pose space,
which reduces the training complexity as well as further improves the
performance. We show that AAMDM outperforms existing methods in motion quality,
diversity, and runtime efficiency, through comprehensive quantitative analyses
and visual comparisons. We also demonstrate the effectiveness of each
algorithmic component through ablation studies.
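As a rough illustration of the two-module pipeline the abstract describes, here is a minimal sketch of one possible sampling loop. The module interfaces (`generation_module`, `polishing_module`, `encoder`, `decoder`), the latent size, and the step counts are hypothetical assumptions for illustration, not the authors' implementation.
```python
import torch

# Hypothetical sketch of an AAMDM-style rollout: a few-step denoising-
# diffusion-GAN drafts the next latent frame, and an auto-regressive
# diffusion model polishes it, all in a learned embedded space.
LATENT_DIM = 64          # assumed size of the motion embedding

@torch.no_grad()
def synthesize(generation_module, polishing_module, encoder, decoder,
               init_pose, num_frames, polish_steps=2):
    """Auto-regressively roll out motion, one latent frame at a time."""
    z_prev = encoder(init_pose)                  # embed the starting pose
    frames = [init_pose]
    for _ in range(num_frames - 1):
        # Generation Module: draw the next latent in very few large
        # denoising jumps (here a single call, conditioned on the past).
        noise = torch.randn(1, LATENT_DIM)
        z_next = generation_module(noise, cond=z_prev)
        # Polishing Module: refine the draft over the last few, small
        # noise levels with an auto-regressive diffusion model.
        for t in reversed(range(polish_steps)):
            z_next = polishing_module.denoise_step(z_next, t, cond=z_prev)
        frames.append(decoder(z_next))           # map back to pose space
        z_prev = z_next
    return frames
```
Working in the embedded space keeps both modules small, which is where the claimed training and runtime savings would come from.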
Related papers
- Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling [10.914612535745789]
This paper introduces Motion-oriented Compositional Neural Radiance Fields (MoCo-NeRF)
MoCo-NeRF is a framework designed to perform free-viewpoint rendering of monocular human videos.
arXiv Detail & Related papers (2024-07-16T17:59:01Z)
- StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework [58.31279280316741]

We present StableMoFusion, a robust and efficient framework for human motion generation.
We tailor each component for efficient high-quality human motion generation.
We identify foot-ground contact and correct foot motions along the denoising process.
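A common heuristic behind foot-ground handling of this kind is to flag a foot as in contact when it is both low and nearly stationary, then pin it to suppress sliding. The thresholds, joint layout, and correction below are generic assumptions for illustration, not StableMoFusion's actual procedure.
```python
import torch

# Illustrative foot-contact heuristic; thresholds and the y-up convention
# are assumptions, not taken from the StableMoFusion implementation.
def detect_foot_contact(foot_pos, height_thresh=0.05, vel_thresh=0.01):
    """foot_pos: (T, 3) world-space foot positions over T frames."""
    vel = foot_pos[1:] - foot_pos[:-1]            # per-frame displacement
    speed = vel.norm(dim=-1)                      # (T-1,)
    low = foot_pos[1:, 1] < height_thresh         # height test (y-up)
    return low & (speed < vel_thresh)             # stationary and low

def remove_foot_skate(foot_pos, contact):
    """Pin the foot in place on contact frames to suppress sliding."""
    out = foot_pos.clone()
    for t in range(1, foot_pos.shape[0]):
        if contact[t - 1]:                        # contact between t-1 and t
            out[t] = out[t - 1]                   # hold previous position
    return out
```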
arXiv Detail & Related papers (2024-05-09T11:41:27Z)
- Spectral Motion Alignment for Video Motion Transfer using Diffusion Models [54.32923808964701]
Spectral Motion Alignment (SMA) is a framework that refines and aligns motion vectors using Fourier and wavelet transforms.
SMA learns motion patterns by incorporating frequency-domain regularization, facilitating the learning of whole-frame global motion dynamics.
Extensive experiments demonstrate SMA's efficacy in improving motion transfer while maintaining computational efficiency and compatibility across various video customization frameworks.
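To picture the frequency-domain regularization described above, here is a toy spectral loss: transform motion vectors with a temporal FFT and penalize the spectral gap to a reference motion. The loss form and tensor layout are assumptions; SMA's actual objective also uses wavelet transforms and more structure.
```python
import torch

# Toy spectral motion loss; illustrates frequency-domain regularization
# of motion vectors, not SMA's exact objective.
def spectral_motion_loss(pred_motion, ref_motion):
    """pred_motion, ref_motion: (T, H, W, 2) dense motion vectors."""
    # An FFT over the temporal axis captures global motion dynamics.
    pred_spec = torch.fft.rfft(pred_motion, dim=0)
    ref_spec = torch.fft.rfft(ref_motion, dim=0)
    # Penalize magnitude differences, which is insensitive to phase.
    return (pred_spec.abs() - ref_spec.abs()).pow(2).mean()
```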
arXiv Detail & Related papers (2024-03-22T14:47:18Z)
- Motion Flow Matching for Human Motion Synthesis and Editing [75.13665467944314]
We propose Motion Flow Matching, a novel generative model for human motion generation featuring efficient sampling and effectiveness in motion editing applications.
Our method reduces the sampling complexity from a thousand steps in previous diffusion models to just ten steps, while achieving comparable performance on text-to-motion and action-to-motion generation benchmarks.
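The thousand-to-ten reduction comes from flow matching replacing the long stochastic reverse diffusion with a deterministic ODE integrated in a handful of steps. Below is a generic Euler sampler for a learned velocity field; the network interface is an assumption, not this paper's API.
```python
import torch

# Generic flow-matching sampler: integrate dx/dt = v(x, t) from noise
# (t=0) to data (t=1). The velocity_net signature is assumed.
@torch.no_grad()
def sample_flow_matching(velocity_net, shape, num_steps=10, cond=None):
    x = torch.randn(shape)                      # start from Gaussian noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((shape[0],), i * dt)     # current integration time
        x = x + dt * velocity_net(x, t, cond)   # one Euler step
    return x                                    # approximate data sample
```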
arXiv Detail & Related papers (2023-12-14T12:57:35Z)
- EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation [57.539634387672656]
Current state-of-the-art generative diffusion models have produced impressive results but struggle to achieve fast generation without sacrificing quality.
We introduce Efficient Motion Diffusion Model (EMDM) for fast and high-quality human motion generation.
arXiv Detail & Related papers (2023-12-04T18:58:38Z)
- Interactive Character Control with Auto-Regressive Motion Diffusion Models [18.727066177880708]
We propose A-MDM (Auto-regressive Motion Diffusion Model) for real-time motion synthesis.
Our conditional diffusion model takes an initial pose as input and auto-regressively generates successive motion frames conditioned on the previous frame.
We introduce a suite of techniques for incorporating interactive controls into A-MDM, such as task-oriented sampling, in-painting, and hierarchical reinforcement learning.
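The frame-by-frame scheme described above can be sketched as running a full reverse-diffusion chain per new frame, with every denoising step conditioned on the previous frame. Function names and the step count below are illustrative assumptions, not the A-MDM codebase.
```python
import torch

# Illustrative A-MDM-style rollout; the denoiser interface is assumed.
@torch.no_grad()
def rollout(denoiser, init_pose, num_frames, diffusion_steps=50):
    poses = [init_pose]
    for _ in range(num_frames - 1):
        x = torch.randn_like(init_pose)          # fresh noise per frame
        for t in reversed(range(diffusion_steps)):
            # Conditioning every denoising step on the previous frame is
            # what makes the model auto-regressive.
            x = denoiser(x, t, prev_frame=poses[-1])
        poses.append(x)
    return poses
```
Running the whole chain per frame is what AAMDM's few-step Generation Module is designed to avoid.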
arXiv Detail & Related papers (2023-06-01T07:48:34Z)
- Single Motion Diffusion [33.81898532874481]
We present SinMDM, a model designed to learn the internal motifs of a single motion sequence with arbitrary topology and synthesize motions of arbitrary length that are faithful to them.
SinMDM can be applied in various contexts, including spatial and temporal in-betweening, motion expansion, style transfer, and crowd animation.
Our results show that SinMDM outperforms existing methods both in quality and time-space efficiency.
arXiv Detail & Related papers (2023-02-12T13:02:19Z)
- MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis [73.52948992990191]
MoFusion is a new denoising-diffusion-based framework for high-quality conditional human motion synthesis.
We present ways to introduce well-known kinematic losses for motion plausibility within the motion diffusion framework.
We demonstrate the effectiveness of MoFusion compared to the state of the art on established benchmarks in the literature.
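A concrete instance of the well-known kinematic losses mentioned above is a bone-length consistency term on the denoised skeleton. The five-joint chain below is a made-up toy layout, not MoFusion's skeleton, and serves only to illustrate the idea.
```python
import torch

# Toy bone-length loss for motion plausibility; the parent array is an
# assumed 5-joint kinematic chain, not MoFusion's rig.
PARENTS = [-1, 0, 1, 2, 3]   # each joint's parent index (-1 = root)

def bone_length_loss(pred_joints, ref_lengths):
    """pred_joints: (T, J, 3); ref_lengths: (J-1,) reference bone lengths."""
    losses = []
    for j, p in enumerate(PARENTS):
        if p < 0:
            continue                              # skip the root joint
        bone = (pred_joints[:, j] - pred_joints[:, p]).norm(dim=-1)  # (T,)
        losses.append((bone - ref_lengths[j - 1]).pow(2).mean())
    return torch.stack(losses).mean()
```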
arXiv Detail & Related papers (2022-12-08T18:59:48Z)
- Towards Lightweight Neural Animation: Exploration of Neural Network Pruning in Mixture of Experts-based Animation Models [3.1733862899654652]
We apply pruning algorithms to compress a neural network in the context of interactive character animation.
With the same number of experts and parameters, the pruned model produces fewer motion artifacts than the dense model.
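Magnitude pruning of this kind can be reproduced with PyTorch's built-in pruning utilities; the layer size and 80% sparsity below are arbitrary example choices, not the paper's settings.
```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Example: L1 magnitude pruning of a single expert layer.
expert = nn.Linear(256, 256)
prune.l1_unstructured(expert, name="weight", amount=0.8)  # mask 80% of weights
prune.remove(expert, "weight")      # bake the mask into the weight tensor
sparsity = (expert.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.0%}")
```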
arXiv Detail & Related papers (2022-01-11T16:39:32Z)
- Enhanced Quadratic Video Interpolation [56.54662568085176]
We propose an enhanced quadratic video interpolation (EQVI) model to handle more complicated scenes and motion patterns.
To further boost the performance, we devise a novel multi-scale fusion network (MS-Fusion) which can be regarded as a learnable augmentation process.
The proposed EQVI model won the first place in the AIM 2020 Video Temporal Super-Resolution Challenge.
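The "quadratic" here refers to modeling per-pixel motion with velocity and acceleration estimated from optical flows to neighboring frames, instead of assuming linear motion. Below is a minimal sketch of that flow extrapolation in the standard QVI formulation, which EQVI builds on and refines; inputs are assumed to be array-like flow fields.
```python
# Quadratic flow extrapolation, QVI-style: given flows from frame 0 to
# frames 1 and -1, recover per-pixel velocity and acceleration, then
# predict the flow from frame 0 to an intermediate time t in (0, 1).
def quadratic_flow(flow_0_to_1, flow_0_to_m1, t):
    """flow_*: (H, W, 2) optical flow fields; t: float time in (0, 1)."""
    # Displacement model d(t) = v*t + 0.5*a*t^2 with unit frame spacing:
    # d(1) = f(0->1) and d(-1) = f(0->-1), so
    accel = flow_0_to_1 + flow_0_to_m1       # a = f(0->1) + f(0->-1)
    vel = (flow_0_to_1 - flow_0_to_m1) / 2   # v = (f(0->1) - f(0->-1)) / 2
    return vel * t + 0.5 * accel * t ** 2    # flow from frame 0 to time t
```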
arXiv Detail & Related papers (2020-09-10T02:31:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.