MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks
- URL: http://arxiv.org/abs/2112.10082v2
- Date: Tue, 21 Dec 2021 09:16:15 GMT
- Title: MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks
- Authors: Wentao Zhu, Zhuoqian Yang, Ziang Di, Wayne Wu, Yizhou Wang, Chen Change Loy
- Abstract summary: We present a novel framework that brings the 3D motion task from controlled environments to in-the-wild scenarios.
It is capable of retargeting body motion from a character in a 2D monocular video to a 3D character without using any motion capture system or 3D reconstruction procedure.
- Score: 77.56526918859345
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present a novel framework that brings the 3D motion retargeting task from
controlled environments to in-the-wild scenarios. In particular, our method is
capable of retargeting body motion from a character in a 2D monocular video to
a 3D character without using any motion capture system or 3D reconstruction
procedure. It is designed to leverage massive online videos for unsupervised
training, without the need for 3D annotations or motion-body pairing information. The
proposed method is built upon two novel canonicalization operations, structure
canonicalization and view canonicalization. Trained with the canonicalization
operations and the derived regularizations, our method learns to factorize a
skeleton sequence into three independent semantic subspaces, i.e., motion,
structure, and view angle. The disentangled representation enables motion
retargeting from 2D to 3D with high precision. Our method achieves superior
performance on motion transfer benchmarks with large body variations and
challenging actions. Notably, the canonicalized skeleton sequence could serve
as a disentangled and interpretable representation of human motion that
benefits action analysis and motion retrieval.
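As a quick illustration of the factorization described in the abstract, the sketch below lays out three parallel encoders that map a 2D skeleton sequence to motion, structure, and view-angle codes, plus a decoder that recombines them into a 3D skeleton sequence. It is a minimal, hypothetical PyTorch sketch: the module names, layer choices, and dimensions are assumptions, not the released MoCaNet architecture, and the canonicalization operations and their losses are omitted.

```python
import torch
import torch.nn as nn

class DisentangledSkeletonEncoder(nn.Module):
    """Hypothetical sketch: factorize a 2D skeleton sequence into motion,
    structure (body), and view-angle codes, then decode a 3D skeleton
    sequence from the recombined codes. All sizes are illustrative."""

    def __init__(self, n_joints=15, d_motion=128, d_structure=64, d_view=8):
        super().__init__()
        in_ch = n_joints * 2  # (x, y) per joint, flattened per frame
        self.motion_enc = nn.Sequential(  # time-varying code
            nn.Conv1d(in_ch, d_motion, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(d_motion, d_motion, kernel_size=7, padding=3))
        self.structure_enc = nn.Sequential(  # roughly time-invariant: pool over frames
            nn.Conv1d(in_ch, d_structure, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.view_enc = nn.Sequential(
            nn.Conv1d(in_ch, d_view, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.decoder = nn.Conv1d(d_motion + d_structure + d_view,
                                 n_joints * 3, kernel_size=7, padding=3)

    def forward(self, seq_2d):
        # seq_2d: (batch, n_joints * 2, n_frames)
        m = self.motion_enc(seq_2d)
        s = self.structure_enc(seq_2d).expand(-1, -1, m.shape[-1])
        v = self.view_enc(seq_2d).expand(-1, -1, m.shape[-1])
        return self.decoder(torch.cat([m, s, v], dim=1)), (m, s, v)

# Retargeting then amounts to decoding the motion code of one clip together
# with the structure and view codes of another; the structure/view
# canonicalization losses that keep the codes disentangled are not shown.
```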
Related papers
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
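As a loose illustration of the idea, the snippet below sketches per-object occupancy over a voxel grid; the slot-wise softmax and the grid shapes are assumptions for illustration, not DynaVol-S's actual parameterization.

```python
import torch

# Minimal sketch (assumed shapes): one grid of logits per object slot; a
# softmax across slots yields per-object occupancy probabilities at every
# spatial location of the voxel grid.
n_objects, res = 4, 32
object_logits = torch.randn(n_objects, res, res, res)    # learnable in practice
occupancy = torch.softmax(object_logits, dim=0)           # per-voxel distribution over objects

# Attaching per-slot semantic features gives multiple disentangled voxel
# grids that together represent the scene.
semantic_dim = 16
semantic_grids = torch.randn(n_objects, semantic_dim, res, res, res)
scene_features = (occupancy.unsqueeze(1) * semantic_grids).sum(dim=0)  # (16, 32, 32, 32)
print(scene_features.shape)
```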
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- Shape of Motion: 4D Reconstruction from a Single Video [51.04575075620677]
We introduce a method capable of reconstructing generic dynamic scenes, featuring explicit, full-sequence-long 3D motion.
We exploit the low-dimensional structure of 3D motion by representing scene motion with a compact set of SE3 motion bases.
Our method achieves state-of-the-art performance for both long-range 3D/2D motion estimation and novel view synthesis on dynamic scenes.
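To make the "compact set of SE3 motion bases" concrete, here is a small Python sketch in the spirit of linear blend skinning: each point's trajectory is a per-point weighted blend of a few per-frame rigid transforms. The blending scheme and tensor shapes are assumptions, not the paper's exact formulation.

```python
import torch

def blend_se3_bases(points, basis_transforms, weights):
    """Sketch: scene motion from a compact set of rigid (SE(3)) bases, in the
    spirit of linear blend skinning; the blending scheme is an assumption.

    points:           (N, 3)       canonical 3D points
    basis_transforms: (T, K, 4, 4) one rigid transform per basis per frame
    weights:          (N, K)       per-point mixing weights (rows sum to 1)
    returns:          (T, N, 3)    point trajectories over T frames
    """
    pts_h = torch.cat([points, torch.ones(points.shape[0], 1)], dim=1)    # (N, 4)
    blended = torch.einsum("nk,tkij->tnij", weights, basis_transforms)    # (T, N, 4, 4)
    moved = torch.einsum("tnij,nj->tni", blended, pts_h)                  # (T, N, 4)
    return moved[..., :3]

# Toy usage: 2 bases, 3 frames, 5 points; the second basis translates along x.
T, K, N = 3, 2, 5
bases = torch.eye(4).repeat(T, K, 1, 1)
bases[:, 1, 0, 3] = torch.linspace(0.0, 1.0, T)
w = torch.rand(N, K)
w = w / w.sum(dim=1, keepdim=True)
print(blend_se3_bases(torch.rand(N, 3), bases, w).shape)  # torch.Size([3, 5, 3])
```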
arXiv Detail & Related papers (2024-07-18T17:59:08Z)
- Generating Continual Human Motion in Diverse 3D Scenes [56.70255926954609]
We introduce a method to synthesize animator-guided human motion across 3D scenes.
We decompose the continual motion synthesis problem into walking along paths and transitioning in and out of the actions specified by the keypoints.
Our model can generate long sequences of diverse actions such as grabbing, sitting and leaning chained together.
arXiv Detail & Related papers (2023-04-04T18:24:22Z)
- MotionBERT: A Unified Perspective on Learning Human Motion Representations [46.67364057245364]
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.
We propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations.
We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network.
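To make the pretraining objective concrete, the snippet below sketches one training step that recovers 3D motion from masked, noise-corrupted 2D keypoints. The corruption ratios and the tiny GRU stand-in for DSTformer are assumptions for illustration only.

```python
import torch
import torch.nn as nn

def corrupt_2d(keypoints_2d, mask_ratio=0.15, noise_std=0.02):
    """Hypothetical corruption for the 2D-to-3D pretraining objective:
    random joint masking plus Gaussian noise (ratios are assumed)."""
    noisy = keypoints_2d + noise_std * torch.randn_like(keypoints_2d)
    mask = (torch.rand(keypoints_2d.shape[:-1]) < mask_ratio).unsqueeze(-1)
    return noisy.masked_fill(mask, 0.0), mask

class TinyMotionEncoder(nn.Module):
    """Stand-in for the paper's DSTformer: any sequence model mapping
    (batch, frames, joints, 2) -> (batch, frames, joints, 3)."""
    def __init__(self, n_joints=17, d_model=64):
        super().__init__()
        self.inp = nn.Linear(n_joints * 2, d_model)
        self.gru = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, n_joints * 3)

    def forward(self, x2d):
        b, t, j, _ = x2d.shape
        h, _ = self.gru(self.inp(x2d.reshape(b, t, j * 2)))
        return self.out(h).reshape(b, t, j, 3)

# One pretraining step (sketch): recover 3D motion from corrupted 2D input.
enc = TinyMotionEncoder()
x2d = torch.randn(4, 30, 17, 2)   # fake 2D keypoint sequences
x3d = torch.randn(4, 30, 17, 3)   # fake 3D targets (e.g. from mocap)
corrupted, _ = corrupt_2d(x2d)
loss = nn.functional.mse_loss(enc(corrupted), x3d)
loss.backward()
```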
arXiv Detail & Related papers (2022-10-12T19:46:25Z)
- Action2video: Generating Videos of Human 3D Actions [31.665831044217363]
We aim to tackle the interesting yet challenging problem of generating videos of diverse and natural human motions from prescribed action categories.
The key issue lies in the ability to synthesize multiple distinct motion sequences that are realistic in their visual appearances.
Action2motion generates plausible 3D pose sequences of a prescribed action category, which are processed and rendered by motion2video to form 2D videos.
arXiv Detail & Related papers (2021-11-12T20:20:37Z)
- Neural Monocular 3D Human Motion Capture with Physical Awareness [76.55971509794598]
We present a new trainable system for physically plausible markerless 3D human motion capture.
Unlike most neural methods for human motion capture, our approach is aware of physical and environmental constraints.
It produces smooth and physically principled 3D motions at an interactive frame rate in a wide variety of challenging scenes.
arXiv Detail & Related papers (2021-05-03T17:57:07Z)
- Learning monocular 3D reconstruction of articulated categories from motion [39.811816510186475]
Video self-supervision enforces the consistency of consecutive 3D reconstructions via a motion-based cycle loss.
We introduce an interpretable model of 3D template deformations that controls a 3D surface through the displacement of a small number of local, learnable handles.
We obtain state-of-the-art reconstructions with diverse shapes, viewpoints and textures for multiple articulated object categories.
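One plausible (assumed) instantiation of such a motion-based cycle loss, written for per-vertex 3D motion estimated between consecutive frames, could look like the following; the paper's exact formulation may differ.

```python
import torch

def motion_cycle_loss(verts_t, verts_t1, flow_fwd, flow_bwd):
    """Assumed sketch of motion-based cycle consistency between consecutive
    3D reconstructions: moving frame t forward by the estimated motion should
    land on frame t+1, and composing forward and backward motion should
    return to the start.

    verts_t, verts_t1:  (V, 3) reconstructed surface points at frames t and t+1
    flow_fwd, flow_bwd: (V, 3) estimated per-vertex 3D motion t->t+1 and t+1->t
    """
    forward_consistency = torch.nn.functional.l1_loss(verts_t + flow_fwd, verts_t1)
    cycle = (flow_fwd + flow_bwd).abs().mean()   # the round trip should cancel out
    return forward_consistency + cycle

# Toy usage with random stand-ins for two consecutive reconstructions.
v_t, v_t1 = torch.randn(100, 3), torch.randn(100, 3)
f_fwd = torch.randn(100, 3, requires_grad=True)
f_bwd = torch.randn(100, 3, requires_grad=True)
motion_cycle_loss(v_t, v_t1, f_fwd, f_bwd).backward()
```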
arXiv Detail & Related papers (2021-03-30T13:50:27Z)
- Motion Guided 3D Pose Estimation from Videos [81.14443206968444]
We propose a new loss function, called motion loss, for the problem of monocular 3D human pose estimation from 2D poses.
In computing the motion loss, a simple yet effective representation for keypoint motion, called pairwise motion encoding, is introduced.
We design a new graph convolutional network architecture, U-shaped GCN (UGCN), which captures both short-term and long-term motion information.
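For intuition, here is a hedged sketch of what a pairwise motion encoding and the accompanying motion loss could look like: pairwise joint offsets differentiated over time, compared between predicted and ground-truth sequences. The exact encoding in the paper may differ.

```python
import torch

def pairwise_motion_encoding(poses):
    """Sketch of a pairwise motion encoding (details are an assumption):
    offsets between every pair of keypoints, differentiated over time, so
    the code describes how relative joint positions move.

    poses:   (batch, frames, joints, dims) 2D or 3D keypoints
    returns: (batch, frames - 1, joints, joints, dims)
    """
    offsets = poses.unsqueeze(3) - poses.unsqueeze(2)   # pairwise joint offsets
    return offsets[:, 1:] - offsets[:, :-1]             # temporal difference

def motion_loss(pred_3d, gt_3d):
    """Penalize mismatch between the motion encodings of predicted and
    ground-truth sequences (used alongside a per-frame position loss)."""
    return torch.nn.functional.l1_loss(
        pairwise_motion_encoding(pred_3d), pairwise_motion_encoding(gt_3d))

# Toy usage with random tensors standing in for predicted / ground-truth 3D poses.
pred = torch.randn(2, 16, 17, 3, requires_grad=True)
gt = torch.randn(2, 16, 17, 3)
motion_loss(pred, gt).backward()
```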
arXiv Detail & Related papers (2020-04-29T06:59:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all content) and is not responsible for any consequences of its use.