Taming Diffusion Probabilistic Models for Character Control
- URL: http://arxiv.org/abs/2404.15121v1
- Date: Tue, 23 Apr 2024 15:20:17 GMT
- Title: Taming Diffusion Probabilistic Models for Character Control
- Authors: Rui Chen, Mingyi Shi, Shaoli Huang, Ping Tan, Taku Komura, Xuelin Chen
- Abstract summary: We present a novel character control framework that responds in real-time to a variety of user-supplied control signals.
At the heart of our method lies a transformer-based Conditional Autoregressive Motion Diffusion Model.
Our work represents the first model that enables real-time generation of high-quality, diverse character animations.
- Score: 46.52584236101806
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a novel character control framework that effectively utilizes motion diffusion probabilistic models to generate high-quality and diverse character animations, responding in real-time to a variety of dynamic user-supplied control signals. At the heart of our method lies a transformer-based Conditional Autoregressive Motion Diffusion Model (CAMDM), which takes as input the character's historical motion and can generate a range of diverse potential future motions conditioned on high-level, coarse user control. To meet the demands for diversity, controllability, and computational efficiency required by a real-time controller, we incorporate several key algorithmic designs. These include separate condition tokenization, classifier-free guidance on past motion, and heuristic future trajectory extension, all designed to address the challenges associated with taming motion diffusion probabilistic models for character control. As a result, our work represents the first model that enables real-time generation of high-quality, diverse character animations based on user interactive control, supporting animating the character in multiple styles with a single unified model. We evaluate our method on a diverse set of locomotion skills, demonstrating the merits of our method over existing character controllers. Project page and source codes: https://aiganimation.github.io/CAMDM/
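One of the key designs named in the abstract, classifier-free guidance on past motion, can be sketched as follows. This is a minimal illustration of the general classifier-free guidance recipe applied to a past-motion condition; the denoiser interface, argument names, and guidance scale are assumptions for illustration, not CAMDM's actual API:

```python
import numpy as np

def cfg_denoise(model, x_t, t, past_motion, control, guidance_scale=2.5):
    """Classifier-free guidance over the past-motion condition.

    `model` is assumed to predict the denoised motion given the noisy
    sample, the timestep, and (optionally dropped) conditions; during
    training the past-motion condition would be randomly dropped so the
    model also learns the unconditional branch.
    """
    # Conditional prediction: the model sees the character's history.
    cond = model(x_t, t, past=past_motion, control=control)
    # Unconditional prediction: the history is dropped (null condition).
    uncond = model(x_t, t, past=None, control=control)
    # The guided estimate pushes the sample toward the conditional branch.
    return uncond + guidance_scale * (cond - uncond)
```

A scale above 1 strengthens adherence to the past motion; a scale of 1 reduces to ordinary conditional denoising.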
Related papers
- Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs [16.41735119504929]
This work focuses on generating realistic, physically-based human behaviors from multi-modal inputs, which may only partially specify the desired motion.
The input may come from a VR controller providing arm motion and body velocity, partial key-point animation, computer vision applied to videos, or even higher-level motion goals.
We introduce the Masked Humanoid Controller (MHC), a novel approach that applies multi-objective imitation learning on augmented and selectively masked motion demonstrations.
arXiv Detail & Related papers (2025-02-08T17:02:11Z)
- InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint [67.6297384588837]
We introduce InterControl, a novel controllable motion generation method that encourages synthesized motions to maintain desired distances between joint pairs.
We demonstrate that the distance between joint pairs for human-wise interactions can be generated using an off-the-shelf Large Language Model.
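A joint-pair distance constraint of the kind described above can be expressed as a simple penalty term that a sampler could be steered by. The function below is an illustrative sketch, not InterControl's actual formulation; the names and the quadratic form are assumptions:

```python
import numpy as np

def joint_distance_penalty(joints, pairs, targets):
    """Quadratic penalty on deviations of joint-pair distances from
    desired values.

    joints  -- array of shape (n_joints, 3) with 3D joint positions
    pairs   -- list of (i, j) joint index pairs to constrain
    targets -- desired distance for each pair
    """
    loss = 0.0
    for (i, j), d in zip(pairs, targets):
        dist = np.linalg.norm(joints[i] - joints[j])
        loss += (dist - d) ** 2
    return loss
```

Minimizing such a term (e.g. via its gradient during sampling) pulls the constrained joints toward the target distances.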
arXiv Detail & Related papers (2023-11-27T14:32:33Z)
- Real-time Animation Generation and Control on Rigged Models via Large Language Models [50.034712575541434]
We introduce a novel method for real-time animation control and generation on rigged models using natural language input.
We embed a large language model (LLM) in Unity to output structured texts that can be parsed into diverse and realistic animations.
arXiv Detail & Related papers (2023-10-27T01:36:35Z)
- RSMT: Real-time Stylized Motion Transition for Characters [15.856276818061891]
We propose a Real-time Stylized Motion Transition method (RSMT) for characters.
Our method consists of two critical, independent components: a general motion manifold model and a style motion sampler.
Our method proves to be fast, high-quality, versatile, and controllable.
arXiv Detail & Related papers (2023-06-21T01:50:04Z)
- Interactive Character Control with Auto-Regressive Motion Diffusion Models [18.727066177880708]
We propose A-MDM (Auto-regressive Motion Diffusion Model) for real-time motion synthesis.
Our conditional diffusion model takes an initial pose as input and auto-regressively generates successive motion frames, each conditioned on the previous frame.
We introduce a suite of techniques for incorporating interactive controls into A-MDM, such as task-oriented sampling, in-painting, and hierarchical reinforcement learning.
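The auto-regressive control flow described above, where each new frame is produced by a short reverse-diffusion chain conditioned on the previous frame, can be sketched as follows. This only illustrates the generation loop; `denoise_step` stands in for the learned denoiser and is an assumption, not A-MDM's actual interface:

```python
import numpy as np

def rollout(denoise_step, init_pose, n_frames, n_diffusion_steps=8, rng=None):
    """Auto-regressive motion synthesis: each frame is refined from noise
    by a short reverse-diffusion chain conditioned on the previous frame.

    denoise_step(x, t, prev) -- stand-in for a learned denoiser that maps
    a noisy frame x at diffusion step t, given the previous frame, to a
    less noisy estimate.
    """
    rng = rng or np.random.default_rng(0)
    frames = [np.asarray(init_pose, dtype=float)]
    for _ in range(n_frames):
        x = rng.standard_normal(frames[-1].shape)  # start each frame from noise
        for t in reversed(range(n_diffusion_steps)):
            x = denoise_step(x, t, frames[-1])     # condition on previous frame
        frames.append(x)
    return np.stack(frames)
```

Because each frame depends only on its predecessor, interactive controls can be injected between frames, which is what makes this structure attractive for real-time use.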
arXiv Detail & Related papers (2023-06-01T07:48:34Z)
- CALM: Conditional Adversarial Latent Models for Directable Virtual Characters [71.66218592749448]
We present Conditional Adversarial Latent Models (CALM), an approach for generating diverse and directable behaviors for user-controlled interactive virtual characters.
Using imitation learning, CALM learns a representation of movement that captures the complexity of human motion, and enables direct control over character movements.
arXiv Detail & Related papers (2023-05-02T09:01:44Z)
- Executing your Commands via Motion Diffusion in Latent Space [51.64652463205012]
We propose a Motion Latent-based Diffusion model (MLD) to produce vivid motion sequences conforming to the given conditional inputs.
Our MLD achieves significant improvements over state-of-the-art methods across a wide range of human motion generation tasks.
arXiv Detail & Related papers (2022-12-08T03:07:00Z)
- Pretrained Diffusion Models for Unified Human Motion Synthesis [33.41816844381057]
MoFusion is a framework for unified motion synthesis.
It employs a Transformer backbone to ease the inclusion of diverse control signals.
It also supports multi-granularity synthesis ranging from motion completion of a body part to whole-body motion generation.
arXiv Detail & Related papers (2022-12-06T09:19:21Z)
- TEMOS: Generating diverse human motions from textual descriptions [53.85978336198444]
We address the problem of generating diverse 3D human motions from textual descriptions.
We propose TEMOS, a text-conditioned generative model leveraging variational autoencoder (VAE) training with human motion data.
We show that the TEMOS framework can produce skeleton-based animations, as in prior work, as well as more expressive SMPL body motions.
arXiv Detail & Related papers (2022-04-25T14:53:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information shown and is not responsible for any consequences of its use.