Generative Motion Stylization of Cross-structure Characters within Canonical Motion Space
- URL: http://arxiv.org/abs/2403.11469v2
- Date: Tue, 23 Jul 2024 11:43:31 GMT
- Title: Generative Motion Stylization of Cross-structure Characters within Canonical Motion Space
- Authors: Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang Tu
- Abstract summary: We propose a generative motion stylization pipeline, named MotionS, for diverse and stylized motion on cross-structure characters.
Our key insight is to embed motion style into a cross-modality latent space, allowing for motion stylization within a canonical motion space.
- Score: 28.628241993271647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stylized motion breathes life into characters. However, the fixed skeleton structure and style representation hinder existing data-driven motion synthesis methods from generating stylized motion for various characters. In this work, we propose a generative motion stylization pipeline, named MotionS, for synthesizing diverse and stylized motion on cross-structure characters using cross-modality style prompts. Our key insight is to embed motion style into a cross-modality latent space and perceive the cross-structure skeleton topologies, allowing for motion stylization within a canonical motion space. Specifically, the large-scale Contrastive Language-Image Pre-training (CLIP) model is leveraged to construct the cross-modality latent space, enabling flexible style representation within it. Additionally, two topology-encoded tokens are learned to capture the canonical and specific skeleton topologies, facilitating cross-structure topology shifting. Subsequently, the topology-shifted stylization diffusion is designed to generate motion content for the particular skeleton and stylize it in the shifted canonical motion space using multi-modality style descriptions. Through an extensive set of examples, we demonstrate the flexibility and generalizability of our pipeline across various characters and style descriptions. Qualitative and quantitative comparisons show the superiority of our pipeline over state-of-the-art methods, consistently delivering high-quality stylized motion across a broad spectrum of skeletal structures.
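The idea of conditioning motion on a style prompt embedded in a shared latent space can be illustrated with a minimal sketch. This is not the MotionS code: `clip_embed` is a deterministic stand-in for a frozen CLIP text encoder, the AdaIN-style modulation replaces the paper's diffusion stylization, and all names and dimensions are hypothetical.

```python
import hashlib
import numpy as np

def clip_embed(prompt: str, dim: int = 8) -> np.ndarray:
    """Deterministic pseudo-embedding standing in for CLIP's text encoder."""
    seed = int.from_bytes(hashlib.sha256(prompt.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)  # CLIP-style unit-norm embedding

def stylize(content: np.ndarray, style: np.ndarray) -> np.ndarray:
    """AdaIN-like modulation: normalize the content features per channel,
    then re-scale and shift them with parameters derived from the style code."""
    mu = content.mean(axis=0)
    sigma = content.std(axis=0) + 1e-6
    normalized = (content - mu) / sigma
    gamma, beta = 1.0 + style, style  # style-conditioned scale and shift
    return normalized * gamma + beta

# Toy "motion features": 16 frames x 8 channels.
motion = np.random.default_rng(1).standard_normal((16, 8))
style_code = clip_embed("an old man walking proudly")
stylized = stylize(motion, style_code)
print(stylized.shape)  # (16, 8)
```

Because the style code lives in the same space regardless of which character's skeleton produced the content features, the same prompt can, in principle, stylize motion for any skeleton structure.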
Related papers
- PALUM: Part-based Attention Learning for Unified Motion Retargeting [53.17113525688095]
Motion retargeting between characters with different skeleton structures is a fundamental challenge in computer animation.
We present a novel approach that learns common motion representations across diverse skeleton topologies.
Experiments demonstrate superior performance in handling diverse skeletal structures while maintaining motion realism and semantic fidelity.
arXiv Detail & Related papers (2026-01-12T07:29:44Z) - Topology-Agnostic Animal Motion Generation from Text Prompt [16.557163253248817]
We introduce OmniZoo, a large-scale animal motion dataset spanning 140 species and 32,979 sequences.
We propose a generalized autoregressive motion generation framework capable of producing text-driven motions for arbitrary skeletal topologies.
arXiv Detail & Related papers (2025-12-11T07:08:29Z) - ClusterStyle: Modeling Intra-Style Diversity with Prototypical Clustering for Stylized Motion Generation [33.75564496181951]
We propose a clustering-based framework, ClusterStyle, to address intra-style diversity.
We leverage a set of prototypes to model diverse style patterns across motions belonging to the same style category.
Our approach outperforms existing state-of-the-art models in stylized motion generation and motion style transfer.
arXiv Detail & Related papers (2025-12-02T06:24:14Z) - StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion [14.213279927964903]
StyleMotif is a novel Stylized Motion Latent Diffusion model.
It generates motion conditioned on both content and style from multiple modalities.
arXiv Detail & Related papers (2025-03-27T17:59:46Z) - AnyTop: Character Animation Diffusion with Any Topology [54.07731933876742]
We introduce AnyTop, a diffusion model that generates motions for diverse characters with distinct motion dynamics.
Our work features a transformer-based denoising network, tailored for arbitrary skeleton learning.
Our evaluation demonstrates that AnyTop generalizes well, even with as few as three training examples per topology, and can produce motions for unseen skeletons as well.
arXiv Detail & Related papers (2025-02-24T17:00:36Z) - SMooDi: Stylized Motion Diffusion Model [46.293854851116215]
We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style sequences.
Our proposed framework outperforms existing methods in stylized motion generation.
arXiv Detail & Related papers (2024-07-17T17:59:42Z) - Infinite Motion: Extended Motion Generation via Long Text Instructions [51.61117351997808]
"Infinite Motion" is a novel approach that leverages long text instructions for extended motion generation.
The key innovation of our model is its ability to accept arbitrary lengths of text as input.
We incorporate a timestamp design for text, which allows precise editing of local segments within the generated sequences.
arXiv Detail & Related papers (2024-07-11T12:33:56Z) - WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds [23.884105024013714]
We present a new approach for understanding the periodicity structure and semantics of motion datasets.
We learn a shared phase manifold for multiple characters, such as a human and a dog, without any supervision.
In combination with an improved motion matching framework, we demonstrate the manifold's capability of timing and semantics alignment in several applications.
arXiv Detail & Related papers (2024-07-11T09:31:05Z) - MotionCrafter: One-Shot Motion Customization of Diffusion Models [66.44642854791807]
We introduce MotionCrafter, a one-shot instance-guided motion customization method.
MotionCrafter employs a parallel spatial-temporal architecture that injects the reference motion into the temporal component of the base model.
During training, a frozen base model provides appearance normalization, effectively separating appearance from motion.
arXiv Detail & Related papers (2023-12-08T16:31:04Z) - DiffusionPhase: Motion Diffusion in Frequency Domain [69.811762407278]
We introduce a learning-based method for generating high-quality human motion sequences from text descriptions.
Existing techniques struggle with motion diversity and smooth transitions in generating arbitrary-length motion sequences.
We develop a network encoder that converts the motion space into a compact yet expressive parameterized phase space.
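The frequency-domain idea can be sketched in a minimal form. This is not the paper's network: here a periodic joint trajectory is summarized by the amplitude, frequency, and phase of its dominant Fourier component, a compact parameterized phase representation from which the motion can be resynthesized. All names and the toy signal are illustrative.

```python
import numpy as np

def phase_params(signal: np.ndarray, fps: float = 30.0):
    """Extract (amplitude, frequency, phase) of the dominant periodic component."""
    n = len(signal)
    spectrum = np.fft.rfft(signal - signal.mean())
    k = int(np.argmax(np.abs(spectrum[1:])) + 1)  # dominant bin, skipping DC
    amp = 2.0 * np.abs(spectrum[k]) / n
    freq = k * fps / n
    phase = float(np.angle(spectrum[k]))
    return amp, freq, phase

def resynthesize(amp, freq, phase, mean, n, fps=30.0):
    """Rebuild the trajectory from its compact phase-space parameters."""
    t = np.arange(n) / fps
    return mean + amp * np.cos(2.0 * np.pi * freq * t + phase)

# A toy joint-angle trajectory: a 2 Hz oscillation over 60 frames at 30 fps.
t = np.arange(60) / 30.0
signal = 1.0 + 0.5 * np.cos(2.0 * np.pi * 2.0 * t + 0.3)
amp, freq, phase = phase_params(signal)
print(round(freq, 3), round(amp, 3))  # 2.0 0.5
```

Three scalars per joint suffice to regenerate the periodic part of the motion at any length, which is what makes such a parameterization attractive for arbitrary-length sequence generation.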
arXiv Detail & Related papers (2023-12-07T04:39:22Z) - MoDi: Unconditional Motion Synthesis from Diverse Data [51.676055380546494]
We present MoDi, an unconditional generative model that synthesizes diverse motions.
Our model is trained in a completely unsupervised setting from a diverse, unstructured and unlabeled motion dataset.
We show that despite the lack of any structure in the dataset, the latent space can be semantically clustered.
arXiv Detail & Related papers (2022-06-16T09:06:25Z) - Neural Marionette: Unsupervised Learning of Motion Skeleton and Latent Dynamics from Volumetric Video [5.456297943378056]
We present Neural Marionette, an unsupervised approach that discovers the skeletal structure from a dynamic sequence.
We demonstrate that the discovered structure is even comparable to hand-labeled ground-truth skeletons in representing a 4D sequence of motion.
arXiv Detail & Related papers (2022-02-17T02:44:16Z) - Motion Puzzle: Arbitrary Motion Style Transfer by Body Part [6.206196935093063]
Motion Puzzle is a novel motion style transfer network that advances the state-of-the-art in several important respects.
Our framework extracts style features from multiple style motions for different body parts and transfers them locally to the target body parts.
It can capture styles exhibited by dynamic movements, such as flapping and staggering, significantly better than previous work.
arXiv Detail & Related papers (2022-02-10T19:56:46Z) - Hierarchical Style-based Networks for Motion Synthesis [150.226137503563]
We propose a self-supervised method for generating long-range, diverse and plausible behaviors to achieve a specific goal location.
Our proposed method learns to model human motion by decomposing a long-range generation task in a hierarchical manner.
On a large-scale skeleton dataset, we show that the proposed method is able to synthesize long-range, diverse and plausible motion.
arXiv Detail & Related papers (2020-08-24T02:11:02Z) - Euclideanizing Flows: Diffeomorphic Reduction for Learning Stable Dynamical Systems [74.80320120264459]
We present an approach to learn such motions from a limited number of human demonstrations.
The complex motions are encoded as rollouts of a stable dynamical system.
The efficacy of this approach is demonstrated through validation on an established benchmark as well as demonstrations collected on a real-world robotic system.
arXiv Detail & Related papers (2020-05-27T03:51:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.