Generative Human Motion Stylization in Latent Space
- URL: http://arxiv.org/abs/2401.13505v2
- Date: Sat, 24 Feb 2024 03:18:17 GMT
- Title: Generative Human Motion Stylization in Latent Space
- Authors: Chuan Guo, Yuxuan Mu, Xinxin Zuo, Peng Dai, Youliang Yan, Juwei Lu, Li
Cheng
- Abstract summary: We present a novel generative model that produces diverse stylization results of a single motion (latent) code.
In inference, users can opt to stylize a motion using style cues from a reference motion or a label.
Experimental results show that our proposed stylization models, despite their lightweight design, outperform the state-of-the-art in style reenactment, content preservation, and generalization.
- Score: 42.831468727082694
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Human motion stylization aims to revise the style of an input motion while
keeping its content unaltered. Unlike existing works that operate directly in
pose space, we leverage the latent space of pretrained autoencoders as a more
expressive and robust representation for motion extraction and infusion.
Building upon this, we present a novel generative model that produces diverse
stylization results of a single motion (latent) code. During training, a motion
code is decomposed into two coding components: a deterministic content code,
and a probabilistic style code adhering to a prior distribution; then a
generator massages the random combination of content and style codes to
reconstruct the corresponding motion codes. Our approach is versatile, allowing
the learning of probabilistic style space from either style labeled or
unlabeled motions, providing notable flexibility in stylization as well. In
inference, users can opt to stylize a motion using style cues from a reference
motion or a label. Even in the absence of explicit style input, our model
facilitates novel re-stylization by sampling from the unconditional style prior
distribution. Experimental results show that our proposed stylization models,
despite their lightweight design, outperform the state-of-the-art in style
reenactment, content preservation, and generalization across various
applications and settings. Project Page: https://murrol.github.io/GenMoStyle
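The decomposition described in the abstract lends itself to a compact sketch. The PyTorch snippet below is a minimal illustration only, assuming a 256-dimensional motion code from the pretrained autoencoder; the class name, layer sizes, and sampling choices are assumptions for illustration and are not taken from the authors' released code.

```python
# Hypothetical sketch of latent-space stylization: a motion code is split into a
# deterministic content code and a probabilistic style code, and a generator
# recombines them into a motion code. Names and sizes are illustrative only.
import torch
import torch.nn as nn

class LatentStylizer(nn.Module):
    def __init__(self, code_dim=256, content_dim=128, style_dim=64):
        super().__init__()
        # Deterministic content branch
        self.content_enc = nn.Sequential(
            nn.Linear(code_dim, content_dim), nn.ReLU(),
            nn.Linear(content_dim, content_dim),
        )
        # Probabilistic style branch: predicts mean and log-variance of the style code
        self.style_enc = nn.Linear(code_dim, 2 * style_dim)
        # Generator maps a (content, style) pair back to a motion code
        self.generator = nn.Sequential(
            nn.Linear(content_dim + style_dim, code_dim), nn.ReLU(),
            nn.Linear(code_dim, code_dim),
        )
        self.style_dim = style_dim

    def encode(self, z):
        c = self.content_enc(z)
        mu, logvar = self.style_enc(z).chunk(2, dim=-1)
        s = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return c, s, mu, logvar

    def forward(self, z_content, z_style=None):
        c, _, _, _ = self.encode(z_content)
        if z_style is None:
            # No explicit style input: sample from the unconditional prior N(0, I)
            s = torch.randn(z_content.size(0), self.style_dim, device=z_content.device)
        else:
            _, s, _, _ = self.encode(z_style)  # style cue from a reference motion code
        return self.generator(torch.cat([c, s], dim=-1))

# Example usage with random stand-ins for autoencoder motion codes
stylizer = LatentStylizer()
z_content = torch.randn(8, 256)
z_reference = torch.randn(8, 256)
stylized = stylizer(z_content, z_reference)  # style taken from a reference motion
resampled = stylizer(z_content)              # novel re-stylization from the prior
```

A full training setup would presumably add a reconstruction loss on the output motion codes and a KL term pulling the predicted style distribution toward the prior, drawing random content/style pairings as the abstract describes; label-conditioned stylization would likewise condition the style prior on the label. Those details are omitted from this sketch.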
Related papers
- SMooDi: Stylized Motion Diffusion Model [46.293854851116215]
We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style sequences.
Our proposed framework outperforms existing methods in stylized motion generation.
arXiv Detail & Related papers (2024-07-17T17:59:42Z)
- DPStyler: Dynamic PromptStyler for Source-Free Domain Generalization [43.67213274161226]
Source-Free Domain Generalization (SFDG) aims to develop a model that works for unseen target domains without relying on any source domain.
Research in SFDG primarily builds upon the existing knowledge of large-scale vision-language models.
We introduce Dynamic PromptStyler (DPStyler), comprising Style Generation and Style Removal modules.
arXiv Detail & Related papers (2024-03-25T12:31:01Z)
- Say Anything with Any Style [9.50806457742173]
Say Anything with Any Style queries the discrete style representation via a generative model with a learned style codebook.
Our approach surpasses state-of-the-art methods in terms of both lip-synchronization and stylized expression.
arXiv Detail & Related papers (2024-03-11T01:20:03Z)
- MotionCrafter: One-Shot Motion Customization of Diffusion Models [66.44642854791807]
We introduce MotionCrafter, a one-shot instance-guided motion customization method.
MotionCrafter employs a parallel spatial-temporal architecture that injects the reference motion into the temporal component of the base model.
During training, a frozen base model provides appearance normalization, effectively separating appearance from motion.
arXiv Detail & Related papers (2023-12-08T16:31:04Z)
- Customizing Motion in Text-to-Video Diffusion Models [79.4121510826141]
We introduce an approach for augmenting text-to-video generation models with customized motions.
By leveraging a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios.
arXiv Detail & Related papers (2023-12-07T18:59:03Z)
- ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech [6.8527462303619195]
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example.
Our model uses a Variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or blending and scaling of style embeddings.
In a user study, we show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal.
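The latent-space manipulation mentioned in this summary (blending and scaling of style embeddings) reduces to simple tensor arithmetic; the snippet below is a generic sketch under that assumption, not the ZeroEGGS interface, and its function names and shapes are hypothetical.

```python
# Generic sketch of style-embedding blending and scaling; names and shapes are
# assumptions for illustration, not part of the ZeroEGGS codebase.
import torch

def blend_styles(style_a: torch.Tensor, style_b: torch.Tensor, alpha: float) -> torch.Tensor:
    """Linearly interpolate between two style embeddings (alpha in [0, 1])."""
    return (1.0 - alpha) * style_a + alpha * style_b

def scale_style(style: torch.Tensor, gain: float) -> torch.Tensor:
    """Exaggerate (gain > 1) or attenuate (gain < 1) a style embedding."""
    return gain * style
```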
arXiv Detail & Related papers (2022-09-15T18:34:30Z)
- MoDi: Unconditional Motion Synthesis from Diverse Data [51.676055380546494]
We present MoDi, an unconditional generative model that synthesizes diverse motions.
Our model is trained in a completely unsupervised setting from a diverse, unstructured and unlabeled motion dataset.
We show that despite the lack of any structure in the dataset, the latent space can be semantically clustered.
arXiv Detail & Related papers (2022-06-16T09:06:25Z)
- Unpaired Motion Style Transfer from Video to Animation [74.15550388701833]
Transferring the motion style from one animation clip to another, while preserving the motion content of the latter, has been a long-standing problem in character animation.
We present a novel data-driven framework for motion style transfer, which learns from an unpaired collection of motions with style labels.
Our framework is able to extract motion styles directly from videos, bypassing 3D reconstruction, and apply them to the 3D input motion.
arXiv Detail & Related papers (2020-05-12T13:21:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.