Related papers: MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model

URL: http://arxiv.org/abs/2404.19759v3
Date: Mon, 30 Dec 2024 08:43:06 GMT
Title: MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
Authors: Wenxun Dai, Ling-Hao Chen, Jingbo Wang, Jinpeng Liu, Bo Dai, Yansong Tang
Abstract summary: This work introduces MotionLCM, extending controllable motion generation to a real-time level. We first propose the motion latent consistency model (MotionLCM) for motion generation, building on the motion latent diffusion model. By adopting one-step (or few-step) inference, we further improve the runtime efficiency of the motion latent diffusion model for motion generation.
Score: 29.93359157128045
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This work introduces MotionLCM, extending controllable motion generation to a real-time level. Existing methods for spatial-temporal control in text-conditioned motion generation suffer from significant runtime inefficiency. To address this issue, we first propose the motion latent consistency model (MotionLCM) for motion generation, building on the motion latent diffusion model. By adopting one-step (or few-step) inference, we further improve the runtime efficiency of the motion latent diffusion model for motion generation. To ensure effective controllability, we incorporate a motion ControlNet within the latent space of MotionLCM and enable explicit control signals (i.e., initial motions) in the vanilla motion space to further provide supervision for the training process. By employing these techniques, our approach can generate human motions with text and control signals in real-time. Experimental results demonstrate the remarkable generation and controlling capabilities of MotionLCM while maintaining real-time runtime efficiency.

Related papers

MotionVerse: A Unified Multimodal Framework for Motion Comprehension, Generation and Editing [53.98607267063729]
MotionVerse is a framework to comprehend, generate, and edit human motion in both single-person and multi-person scenarios. We employ a motion tokenizer with residual quantization, which converts continuous motion sequences into multi-stream discrete tokens. We also introduce a Delay Parallel Modeling strategy, which temporally staggers the encoding of residual token streams.
arXiv Detail & Related papers (2025-09-28T04:20:56Z)
Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control [16.319698848279966]
We present Diffuse-CLoC, a guided diffusion framework for physics-based look-ahead control. It enables intuitive, steerable, and physically realistic motion generation.
arXiv Detail & Related papers (2025-03-14T18:42:29Z)
Mojito: Motion Trajectory and Intensity Control for Video Generation [79.85687620761186]
This paper introduces Mojito, a diffusion model that incorporates both motion trajectory and intensity control for text-to-video generation. Experiments demonstrate Mojito's effectiveness in achieving precise trajectory and intensity control with high computational efficiency.
arXiv Detail & Related papers (2024-12-12T05:26:43Z)
MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding [76.30210465222218]
MotionGPT-2 is a unified Large Motion-Language Model (LMLM). It supports multimodal control conditions through pre-trained Large Language Models (LLMs). It is highly adaptable to the challenging 3D holistic motion generation task.
arXiv Detail & Related papers (2024-10-29T05:25:34Z)
ControlMM: Controllable Masked Motion Generation [38.16884934336603]
We propose ControlMM, a novel approach incorporating spatial control signals into the generative masked motion model. ControlMM achieves real-time, high-fidelity, and high-precision controllable motion generation simultaneously. ControlMM generates motions 20 times faster than diffusion-based methods.
arXiv Detail & Related papers (2024-10-14T17:50:27Z)
DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control [12.465927271402442]
Text-conditioned human motion generation allows for user interaction through natural language. DART is a Diffusion-based Autoregressive motion primitive model for Real-time Text-driven motion control. We present effective algorithms for both approaches, demonstrating our model's versatility and superior performance in various motion synthesis tasks.
arXiv Detail & Related papers (2024-10-07T17:58:22Z)
Real-time Diverse Motion In-betweening with Space-time Control [4.910937238451485]
In this work, we present a data-driven framework for generating diverse in-betweening motions for kinematic characters. We demonstrate that our in-betweening approach can synthesize both locomotion and unstructured motions, enabling rich, versatile, and high-quality animation generation.
arXiv Detail & Related papers (2024-09-30T22:45:53Z)
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation [52.87672306545577]
Existing motion generation methods primarily focus on the direct synthesis of global motions. We propose the local action-guided motion diffusion model, which facilitates global motion generation by utilizing local actions as fine-grained control signals. Our method provides flexibility in seamlessly combining various local actions and continuous guiding weight adjustment.
arXiv Detail & Related papers (2024-07-15T08:35:00Z)
Generalizable Implicit Motion Modeling for Video Frame Interpolation [51.966062283735596]
Motion is critical in flow-based Video Frame Interpolation (VFI). We introduce Generalizable Implicit Motion Modeling (GIMM), a novel and effective approach to motion modeling for VFI. Our GIMM can be easily integrated with existing flow-based VFI works by supplying accurately modeled motion.
arXiv Detail & Related papers (2024-07-11T17:13:15Z)
FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis [65.85686550683806]
This paper reconsiders motion generation and proposes to unify single-person and multi-person motion through a conditional motion distribution. Within this framework, current single-person spatial control methods can be seamlessly integrated, achieving precise control of multi-person motion.
arXiv Detail & Related papers (2024-05-24T17:57:57Z)
FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning [19.491968038335944]
We introduce a self-supervised, structured representation and generation method that extracts spatial-temporal relationships in periodic or quasi-periodic motions. Our work opens new possibilities for future advancements in general motion representation and learning algorithms.
arXiv Detail & Related papers (2024-02-21T13:59:21Z)
Motion Flow Matching for Human Motion Synthesis and Editing [75.13665467944314]
We propose Motion Flow Matching, a novel generative model for human motion generation featuring efficient sampling and effectiveness in motion editing applications. Our method reduces the sampling complexity from a thousand steps in previous diffusion models to just ten steps, while achieving comparable performance in text-to-motion and action-to-motion generation benchmarks.
arXiv Detail & Related papers (2023-12-14T12:57:35Z)
EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation [57.539634387672656]
Current state-of-the-art generative diffusion models have produced impressive results but struggle to achieve fast generation without sacrificing quality. We introduce Efficient Motion Diffusion Model (EMDM) for fast and high-quality human motion generation.
arXiv Detail & Related papers (2023-12-04T18:58:38Z)
Guided Motion Diffusion for Controllable Human Motion Synthesis [18.660523853430497]
We propose Guided Motion Diffusion (GMD), a method that incorporates spatial constraints into the motion generation process. Specifically, we propose an effective feature projection scheme that manipulates motion representation to enhance the coherency between spatial information and local poses. Our experiments justify the development of GMD, which achieves a significant improvement over state-of-the-art methods in text-based motion generation.
arXiv Detail & Related papers (2023-05-21T21:54:31Z)
