StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
- URL: http://arxiv.org/abs/2405.05691v1
- Date: Thu, 9 May 2024 11:41:27 GMT
- Title: StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
- Authors: Yiheng Huang, Hui Yang, Chuanchen Luo, Yuxi Wang, Shibiao Xu, Zhaoxiang Zhang, Man Zhang, Junran Peng
- Abstract summary: We present StableMoFusion, a robust and efficient framework for human motion generation.
We tailor each component for efficient high-quality human motion generation.
We identify foot-ground contact and correct foot motions along the denoising process.
- Score: 58.31279280316741
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Thanks to the powerful generative capacity of diffusion models, recent years have witnessed rapid progress in human motion generation. Existing diffusion-based methods employ disparate network architectures and training strategies. The effect of the design of each component is still unclear. In addition, the iterative denoising process incurs considerable computational overhead, which is prohibitive for real-time scenarios such as virtual characters and humanoid robots. For this reason, we first conduct a comprehensive investigation into network architectures, training strategies, and inference processes. Based on this analysis, we tailor each component for efficient, high-quality human motion generation. Despite the promising performance, the tailored model still suffers from foot skating, a ubiquitous issue in diffusion-based solutions. To eliminate foot skating, we identify foot-ground contact and correct foot motions along the denoising process. By organically combining these well-designed components, we present StableMoFusion, a robust and efficient framework for human motion generation. Extensive experimental results show that our StableMoFusion performs favorably against current state-of-the-art methods. Project page: https://h-y1heng.github.io/StableMoFusion-page/
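The abstract leaves the footskate cleanup at a high level: detect frames where a foot is in contact with the ground and keep that foot from sliding while correcting the motion along the denoising process. Below is a minimal sketch of that general idea; the joint indices, thresholds, and y-up convention are illustrative assumptions rather than values from the paper.
```python
import torch

# Minimal sketch of footskate cleanup: detect foot-ground contact from low
# height and low velocity, then pin the contacting foot joints in place.
# Joint indices and thresholds below are illustrative placeholders.
FOOT_JOINTS = [7, 8, 10, 11]     # e.g. ankles and toes in a SMPL-like skeleton
HEIGHT_THRESH = 0.05             # metres above the ground plane (y-up assumed)
VEL_THRESH = 0.10                # metres per frame

def detect_contact(joints):
    """joints: (T, J, 3) positions -> (T, len(FOOT_JOINTS)) boolean contact mask."""
    feet = joints[:, FOOT_JOINTS, :]                 # (T, F, 3)
    vel = torch.zeros_like(feet)
    vel[1:] = feet[1:] - feet[:-1]                   # per-frame displacement
    low = feet[..., 1] < HEIGHT_THRESH               # close to the ground
    slow = vel.norm(dim=-1) < VEL_THRESH             # nearly static
    return low & slow

def correct_footskate(joints):
    """Freeze horizontal drift of foot joints while they are in contact."""
    joints = joints.clone()
    contact = detect_contact(joints)
    for f, j in enumerate(FOOT_JOINTS):
        for t in range(1, joints.shape[0]):
            if contact[t, f] and contact[t - 1, f]:
                joints[t, j, 0] = joints[t - 1, j, 0]    # keep x fixed
                joints[t, j, 2] = joints[t - 1, j, 2]    # keep z fixed
    return joints
```
In practice such a correction would be applied to the intermediate motion estimate at selected denoising steps rather than only once after sampling.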
Related papers
- MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation [19.999239668765885]
MotionMix is a weakly-supervised diffusion model that leverages both noisy and unannotated motion sequences.
Our framework consistently achieves state-of-the-art performance on text-to-motion, action-to-motion, and music-to-dance tasks.
arXiv Detail & Related papers (2024-01-20T04:58:06Z)
- Motion Flow Matching for Human Motion Synthesis and Editing [75.13665467944314]
We propose Motion Flow Matching, a novel generative model for human motion generation featuring efficient sampling and effectiveness in motion editing applications.
Our method reduces the sampling complexity from a thousand steps in previous diffusion models to just ten steps, while achieving comparable performance on text-to-motion and action-to-motion generation benchmarks.
arXiv Detail & Related papers (2023-12-14T12:57:35Z)
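Flow matching trades iterative denoising for the integration of a learned velocity field, which is what makes a ten-step sampler feasible. The sketch below shows a plain Euler sampler under that assumption; `velocity_net` and the uniform time grid are placeholders, not the paper's exact sampler.
```python
import torch

def sample_flow_matching(velocity_net, shape, num_steps=10, device="cpu"):
    """Integrate a learned velocity field v(x, t) from noise (t=0) to data (t=1)
    with a fixed number of Euler steps. `velocity_net` stands in for the trained
    model and is an assumption, not the paper's interface."""
    x = torch.randn(shape, device=device)                      # start from Gaussian noise
    ts = torch.linspace(0.0, 1.0, num_steps + 1, device=device)
    for i in range(num_steps):
        t = torch.full((shape[0],), float(ts[i]), device=device)  # per-sample time stamp
        dt = ts[i + 1] - ts[i]
        x = x + velocity_net(x, t) * dt                         # one Euler step
    return x
```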
- AAMDM: Accelerated Auto-regressive Motion Diffusion Model [10.94879097495769]
This paper introduces the Accelerated Auto-regressive Motion Diffusion Model (AAMDM).
AAMDM is a novel motion synthesis framework designed to achieve quality, diversity, and efficiency all together.
We show that AAMDM outperforms existing methods in motion quality, diversity, and runtime efficiency.
arXiv Detail & Related papers (2023-12-02T23:52:21Z)
- DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models [102.13968267347553]
We present DiffuseBot, a physics-augmented diffusion model that generates soft robot morphologies capable of excelling in a wide spectrum of tasks.
We showcase a range of simulated and fabricated robots along with their capabilities.
arXiv Detail & Related papers (2023-11-28T18:58:48Z)
- VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation [88.49030739715701]
This work presents a decomposed diffusion process that resolves the per-frame noise into a base noise shared among all frames and a residual noise that varies along the time axis.
Experiments on various datasets confirm that our approach, termed VideoFusion, surpasses both GAN-based and diffusion-based alternatives in high-quality video generation.
arXiv Detail & Related papers (2023-03-15T02:16:39Z)
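The decomposition itself is simple to write down: each frame's noise is a weighted mix of a base component shared by all frames and a per-frame residual, with the weights chosen so the result stays unit variance. The snippet below is a minimal illustration; `lam`, the share of the base component, is a placeholder hyper-parameter rather than the paper's setting.
```python
import torch

def decomposed_video_noise(batch, frames, channels, height, width, lam=0.5):
    """Split per-frame noise into a base component shared across all frames and
    a per-frame residual, keeping overall unit variance."""
    base = torch.randn(batch, 1, channels, height, width)           # shared over time
    residual = torch.randn(batch, frames, channels, height, width)  # varies per frame
    noise = (lam ** 0.5) * base + ((1.0 - lam) ** 0.5) * residual
    return noise  # (batch, frames, channels, height, width), ~N(0, 1) per element
```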
- Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models [9.789705536694665]
Generating realistic motions for digital humans is a core but challenging part of computer animation and games.
We propose a denoising diffusion model solution for styled motion synthesis.
We design a multi-task diffusion-model architecture that strategically generates aspects of human motion for local guidance.
arXiv Detail & Related papers (2022-12-16T15:15:34Z)
- MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis [73.52948992990191]
MoFusion is a new denoising-diffusion-based framework for high-quality conditional human motion synthesis.
We present ways to introduce well-known kinematic losses for motion plausibility within the motion diffusion framework.
We demonstrate the effectiveness of MoFusion compared to the state of the art on established benchmarks in the literature.
arXiv Detail & Related papers (2022-12-08T18:59:48Z)
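The summary does not name the kinematic losses, but two widely used choices are bone-length consistency and acceleration smoothness. The sketch below illustrates those two; they are assumptions about typical practice, not necessarily the losses used in MoFusion.
```python
import torch

def bone_length_loss(joints, parents):
    """Penalise bone-length variation over time.
    joints: (T, J, 3) joint positions; parents: parent index per joint (-1 for root)."""
    pairs = [(j, p) for j, p in enumerate(parents) if p >= 0]
    child = joints[:, [j for j, _ in pairs], :]
    parent = joints[:, [p for _, p in pairs], :]
    lengths = (child - parent).norm(dim=-1)        # (T, num_bones)
    return lengths.var(dim=0).mean()               # bones should keep constant length

def smoothness_loss(joints):
    """Penalise large joint accelerations (second-order finite differences)."""
    accel = joints[2:] - 2.0 * joints[1:-1] + joints[:-2]
    return accel.pow(2).mean()
```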
- Pretrained Diffusion Models for Unified Human Motion Synthesis [33.41816844381057]
MoFusion is a framework for unified motion synthesis.
It employs a Transformer backbone to ease the inclusion of diverse control signals.
It also supports multi-granularity synthesis ranging from motion completion of a body part to whole-body motion generation.
arXiv Detail & Related papers (2022-12-06T09:19:21Z)
- PhysDiff: Physics-Guided Human Motion Diffusion Model [101.1823574561535]
Existing motion diffusion models largely disregard the laws of physics in the diffusion process.
PhysDiff incorporates physical constraints into the diffusion process.
Our approach achieves state-of-the-art motion quality and improves physical plausibility drastically.
arXiv Detail & Related papers (2022-12-05T18:59:52Z)
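The summary leaves the guidance mechanism abstract. One common way to realize physics guidance is to project the predicted clean motion onto physically valid motion (for example, via a simulator) at selected steps of the sampling loop. The sketch below assumes hypothetical `denoiser` and `physics_project` callables and a deterministic DDIM update; it illustrates the idea rather than the paper's implementation.
```python
import torch

def physics_guided_ddim(denoiser, physics_project, x_T, alphas_cumprod, guide_every=5):
    """Interleave a physics-based projection with DDIM-style denoising.
    `denoiser(x, t)` predicts the clean motion x0; `physics_project(x0)` stands in
    for a simulator-based projection enforcing contacts and dynamics. Both
    callables and `guide_every` are illustrative assumptions."""
    x = x_T
    num_steps = len(alphas_cumprod)
    for i, t in enumerate(reversed(range(num_steps))):            # t = T-1, ..., 0
        a_t = alphas_cumprod[t]
        a_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
        x0_hat = denoiser(x, t)                                   # predicted clean motion
        if i % guide_every == 0:
            x0_hat = physics_project(x0_hat)                      # pull toward physically valid motion
        eps = (x - a_t.sqrt() * x0_hat) / (1.0 - a_t).sqrt()      # implied noise estimate
        x = a_prev.sqrt() * x0_hat + (1.0 - a_prev).sqrt() * eps  # deterministic DDIM update
    return x
```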