Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion
Probabilistic Models
- URL: http://arxiv.org/abs/2301.03949v2
- Date: Tue, 28 Mar 2023 08:26:30 GMT
- Title: Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion
Probabilistic Models
- Authors: Mengyi Zhao, Mengyuan Liu, Bin Ren, Shuling Dai, and Nicu Sebe
- Abstract summary: We propose a conditional paradigm that benefits from the denoising diffusion probabilistic model (DDPM) to tackle the problem of realistic and diverse action-conditioned 3D skeleton-based motion generation.
Ours is a pioneering attempt to use DDPM to synthesize a variable number of motion sequences conditioned on a categorical action.
- Score: 58.357180353368896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based generative models have recently emerged as powerful solutions
for high-quality synthesis in multiple domains. Leveraging the bidirectional
Markov chains, diffusion probabilistic models generate samples by inferring the
reversed Markov chain based on the distribution mapping learned during the
forward diffusion process. In this work, we propose Modiff, a conditional paradigm that
benefits from the denoising diffusion probabilistic model (DDPM) to tackle the
problem of realistic and diverse action-conditioned 3D skeleton-based motion
generation. Ours is a pioneering attempt to use DDPM to synthesize a variable
number of motion sequences conditioned on a categorical action. We evaluate our
approach on the large-scale NTU RGB+D dataset and show improvements over
state-of-the-art motion generation methods.
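The forward/reverse Markov-chain mechanics described in the abstract can be sketched in a few lines. The following NumPy toy is illustrative only: the linear beta schedule, the 25-joint skeleton shape, and using the true noise in place of a trained noise predictor are all assumptions, not the authors' implementation.

```python
import numpy as np

# Linear beta schedule over T diffusion steps (illustrative values).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def q_sample(x0, t, rng):
    """Forward process: sample x_t ~ q(x_t | x_0) in closed form."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return xt, noise

def p_sample(xt, t, eps_hat, rng):
    """One reverse step: estimate x_{t-1} from x_t and predicted noise eps_hat."""
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (xt - coef * eps_hat) / np.sqrt(alphas[t])
    if t > 0:
        mean = mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)
    return mean

rng = np.random.default_rng(0)
x0 = rng.standard_normal((25, 3))      # e.g. one 25-joint 3D skeleton frame
xt, eps = q_sample(x0, 500, rng)       # noised sample at step t=500
x_prev = p_sample(xt, 500, eps, rng)   # one denoising step toward x0
```

In the actual method, `eps_hat` would come from a network conditioned on the action label; here the true noise stands in so the snippet is self-contained.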
Related papers
- Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced
Hierarchical Diffusion Model [60.27825196999742]
We propose a novel Basic-to-Advanced Hierarchical Diffusion Model, named B2A-HDM, to collaboratively exploit low-dimensional and high-dimensional diffusion models for detailed motion synthesis.
Specifically, the basic diffusion model in low-dimensional latent space provides the intermediate denoising result that is consistent with the textual description.
The advanced diffusion model in high-dimensional latent space focuses on the following detail-enhancing denoising process.
arXiv Detail & Related papers (2023-12-18T06:30:39Z)
- DiffFlow: A Unified SDE Framework for Score-Based Diffusion Models and Generative Adversarial Networks [41.451880167535776]
We propose a unified theoretical framework for score-based diffusion models (SDMs) and generative adversarial networks (GANs).
Under this unified framework, we introduce several instantiations of DiffFlow that provide new algorithms beyond GANs and SDMs with exact likelihood inference.
arXiv Detail & Related papers (2023-07-05T10:00:53Z)
- Semi-Implicit Denoising Diffusion Models (SIDDMs) [50.30163684539586]
Existing models such as Denoising Diffusion Probabilistic Models (DDPM) deliver high-quality, diverse samples but are slowed by an inherently high number of iterative steps.
We introduce a novel approach that tackles the problem by matching implicit and explicit factors.
We demonstrate that our proposed method obtains comparable generative performance to diffusion-based models and vastly superior results to models with a small number of sampling steps.
arXiv Detail & Related papers (2023-06-21T18:49:22Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
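Mean-shift itself is easy to state: each update moves a point to the kernel-weighted mean of the data around it, which is why it seeks density modes. A minimal sketch, where the Gaussian kernel, bandwidth, and synthetic cluster are assumptions for illustration:

```python
import numpy as np

def mean_shift_step(x, data, bandwidth=1.0):
    """One mean-shift update: move x to the kernel-weighted mean of the data."""
    w = np.exp(-((data - x) ** 2).sum(-1) / (2.0 * bandwidth ** 2))
    return (w[:, None] * data).sum(0) / w.sum()

rng = np.random.default_rng(0)
data = rng.standard_normal((500, 2)) + np.array([3.0, 3.0])  # cluster at (3, 3)

x = np.zeros(2)                      # start away from the mode
for _ in range(50):
    x = mean_shift_step(x, data)
# x climbs the kernel density estimate and settles near the cluster mode.
```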
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion
Models [72.93652777646233]
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings.
We propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models.
Our method, dubbed CamoDiffusion, employs the denoising process of diffusion models to iteratively reduce the noise of the mask.
arXiv Detail & Related papers (2023-05-29T07:49:44Z) - ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion
Trajectories [144.03939123870416]
We propose a novel conditional diffusion model by introducing conditions into the forward process.
We use extra latent space to allocate an exclusive diffusion trajectory for each condition based on some shifting rules.
We formulate our method, which we call ShiftDDPMs, and provide a unified point of view on existing related methods.
arXiv Detail & Related papers (2023-02-05T12:48:21Z) - Fast Inference in Denoising Diffusion Models via MMD Finetuning [23.779985842891705]
We present MMD-DDM, a novel method for fast sampling of diffusion models.
Our approach is based on the idea of using the Maximum Mean Discrepancy (MMD) to finetune the learned distribution with a given budget of timesteps.
Our findings show that the proposed method is able to produce high-quality samples in a fraction of the time required by widely-used diffusion models.
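The discrepancy minimized here, the Maximum Mean Discrepancy, has a simple sample-based estimate. A minimal sketch with an RBF kernel (the kernel choice, bandwidth, and toy data are assumptions for illustration, not the paper's finetuning setup):

```python
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel matrix between sample sets a (n,d) and b (m,d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between two sample sets."""
    kxx = rbf_kernel(x, x, sigma).mean()
    kyy = rbf_kernel(y, y, sigma).mean()
    kxy = rbf_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(0)
same = mmd2(rng.standard_normal((200, 2)), rng.standard_normal((200, 2)))
shifted = mmd2(rng.standard_normal((200, 2)),
               rng.standard_normal((200, 2)) + 3.0)
# MMD is near zero when the two distributions match and grows with mismatch,
# which is what makes it usable as a finetuning objective.
```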
arXiv Detail & Related papers (2023-01-19T09:48:07Z)
- Structured Denoising Diffusion Models in Discrete State-Spaces [15.488176444698404]
We introduce Discrete Denoising Diffusion Probabilistic Models (D3PMs) for discrete data.
The choice of transition matrix is an important design decision that leads to improved results in image and text domains.
For text, this model class achieves strong results on character-level text generation while scaling to large vocabularies on LM1B.
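The transition-matrix design choice can be illustrated with the simplest instance, a uniform transition matrix; the vocabulary size and noise rate below are assumptions for illustration, not values from the paper:

```python
import numpy as np

def uniform_transition(K, beta):
    """D3PM-style uniform transition matrix: with probability beta a token is
    resampled uniformly over K categories; otherwise it is kept."""
    return (1.0 - beta) * np.eye(K) + beta * np.ones((K, K)) / K

K = 5
Q = uniform_transition(K, beta=0.1)

# Each row is a valid categorical distribution over the next state.
assert np.allclose(Q.sum(axis=1), 1.0)

# One forward noising step: push a one-hot token through Q.
x0 = np.zeros(K)
x0[2] = 1.0
x1_probs = x0 @ Q   # distribution over categories after one step
```

Other transition families (e.g. absorbing-state or locality-aware matrices) follow the same pattern, only the matrix changes.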
arXiv Detail & Related papers (2021-07-07T04:11:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.