Grasping the Arrow of Time from the Singularity: Decoding Micromotion in
Low-dimensional Latent Spaces from StyleGAN
- URL: http://arxiv.org/abs/2204.12696v1
- Date: Wed, 27 Apr 2022 04:38:39 GMT
- Title: Grasping the Arrow of Time from the Singularity: Decoding Micromotion in
Low-dimensional Latent Spaces from StyleGAN
- Authors: Qiucheng Wu, Yifan Jiang, Junru Wu, Kai Wang, Gong Zhang, Humphrey
Shi, Zhangyang Wang, Shiyu Chang
- Abstract summary: We show that "micromotion" can be represented in low-rank spaces extracted from the latent space of a StyleGAN-v2 model for face generation.
A micromotion can be represented as simply as an affine transformation over the latent feature.
This demonstrates that the local feature geometry corresponding to one type of micromotion is aligned across different face subjects.
- Score: 105.99762358450633
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The disentanglement of StyleGAN latent space has paved the way for realistic
and controllable image editing, but does StyleGAN know anything about temporal
motion, as it was only trained on static images? To study the motion features
in the latent space of StyleGAN, in this paper, we hypothesize and demonstrate
that a series of meaningful, natural, and versatile small, local movements
(referred to as "micromotion", such as expression, head movement, and aging
effect) can be represented in low-rank spaces extracted from the latent space
of a conventionally pre-trained StyleGAN-v2 model for face generation, with the
guidance of proper "anchors" in the form of either short text or video clips.
Starting from one target face image, with the editing direction decoded from
the low-rank space, its micromotion features can be represented as simply as an
affine transformation over its latent feature. Perhaps more surprisingly, such a
micromotion subspace, even when learned from just a single target face, can be
painlessly transferred to other unseen face images, even those from vastly
different domains (such as oil painting, cartoon, and sculpture faces). This
demonstrates that the local feature geometry corresponding to one type of
micromotion is aligned across different face subjects, and hence that
StyleGAN-v2 is indeed "secretly" aware of the subject-disentangled feature
variations caused by that micromotion. We present various successful examples
of applying our low-dimensional micromotion subspace technique to directly and
effortlessly manipulate faces, showing high robustness, low computational
overhead, and impressive domain transferability. Our codes are available at
https://github.com/wuqiuche/micromotion-StyleGAN.
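The snippet below is a minimal illustrative sketch of this idea, not the authors' released code (see the repository above for that): it fits a low-rank direction to hypothetical anchor latents of one subject via SVD, then applies the affine edit to an unseen face's latent. All array shapes, the anchor latents, and the omitted inversion/rendering steps are assumptions.

```python
import numpy as np

def micromotion_direction(anchor_codes: np.ndarray, rank: int = 1) -> np.ndarray:
    """Fit a rank-`rank` micromotion subspace to centered anchor latents via SVD."""
    mean = anchor_codes.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(anchor_codes - mean, full_matrices=False)
    return vt[:rank]                        # (rank, 512) principal directions

def apply_micromotion(w: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """The affine edit: move a latent along the decoded micromotion direction."""
    return w + alpha * direction[0]

# Usage: decode the direction once from one subject's anchors, then reuse it
# on unseen faces, since the local feature geometry is reported to align
# across subjects.
anchor_codes = np.random.randn(8, 512)      # placeholder anchor latents
direction = micromotion_direction(anchor_codes)
w_unseen = np.random.randn(512)             # latent of an unseen face
w_edited = apply_micromotion(w_unseen, direction, alpha=2.0)
```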
Related papers
- MotionCrafter: One-Shot Motion Customization of Diffusion Models [66.44642854791807]
We introduce MotionCrafter, a one-shot instance-guided motion customization method.
MotionCrafter employs a parallel spatial-temporal architecture that injects the reference motion into the temporal component of the base model.
During training, a frozen base model provides appearance normalization, effectively separating appearance from motion.
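A hedged sketch of the separation this summary describes, not MotionCrafter's actual architecture: a parallel block whose spatial (appearance) path is frozen while only the temporal (motion) path receives gradients when tuning on a reference clip. The channel count and the additive fusion are placeholder assumptions.

```python
import torch
import torch.nn as nn

class SpatialTemporalBlock(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # Per-frame spatial convolution (appearance path), kept frozen.
        self.spatial = nn.Conv2d(channels, channels, 3, padding=1)
        for p in self.spatial.parameters():
            p.requires_grad = False
        # 1D convolution along the frame axis (motion path), trainable.
        self.temporal = nn.Conv1d(channels, channels, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, channels, height, width)
        b, t, c, h, w = x.shape
        s = self.spatial(x.reshape(b * t, c, h, w)).reshape(b, t, c, h, w)
        # Fold spatial positions into the batch so the conv runs over time.
        m = x.permute(0, 3, 4, 2, 1).reshape(b * h * w, c, t)
        m = self.temporal(m).reshape(b, h, w, c, t).permute(0, 4, 3, 1, 2)
        return s + m  # frozen appearance path + tuned motion path

# Toy clip: 2 videos, 8 frames, 64 channels, 16x16 features.
block = SpatialTemporalBlock()
y = block(torch.randn(2, 8, 64, 16, 16))    # -> (2, 8, 64, 16, 16)
```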
arXiv Detail & Related papers (2023-12-08T16:31:04Z)
- We never go out of Style: Motion Disentanglement by Subspace Decomposition of Latent Space [38.54517335215281]
We propose a novel method to decompose motion in videos by using a pretrained image GAN model.
We discover disentangled motion subspaces in the latent space of widely used style-based GAN models.
We evaluate the disentanglement properties of motion subspaces on face and car datasets.
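A minimal sketch of the subspace idea in this summary, assuming each video has already been inverted to per-frame latents: remove each video's mean latent (the identity/appearance component) and run PCA on the pooled residuals, so the leading components span a motion subspace shared across subjects. The shapes and the omitted inversion step are hypothetical.

```python
import numpy as np

def motion_subspace(video_latents: list[np.ndarray], k: int = 8) -> np.ndarray:
    """video_latents: list of (frames, 512) latent arrays, one per video."""
    residuals = np.concatenate(
        [w - w.mean(axis=0, keepdims=True) for w in video_latents], axis=0
    )
    _, _, vt = np.linalg.svd(residuals, full_matrices=False)
    return vt[:k]  # (k, 512) basis for the disentangled motion subspace

videos = [np.random.randn(30, 512) for _ in range(4)]  # placeholder latents
basis = motion_subspace(videos)                        # (8, 512)
```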
arXiv Detail & Related papers (2023-06-01T11:18:57Z)
- StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces [103.54337984566877]
We use dilated convolutions to rescale the receptive fields of shallow layers in StyleGAN without altering any model parameters.
This allows fixed-size small features at shallow layers to be extended into larger ones that can accommodate variable resolutions.
We validate our method using unaligned face inputs of various resolutions in a diverse set of face manipulation tasks.
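A hedged illustration of the receptive-field trick described above, not StyleGANEX's actual code: reuse a trained convolution's weights unchanged, but evaluate them with dilation so the same fixed-size kernel covers a larger area. The layer sizes are placeholders.

```python
import torch
import torch.nn.functional as F

conv = torch.nn.Conv2d(64, 64, kernel_size=3, padding=1)  # "pretrained" layer

def dilated_forward(x: torch.Tensor, dilation: int = 2) -> torch.Tensor:
    # Same weights and bias; only the sampling grid of the kernel changes,
    # so no model parameters are altered.
    return F.conv2d(x, conv.weight, conv.bias,
                    padding=dilation, dilation=dilation)

x = torch.randn(1, 64, 32, 32)
y = dilated_forward(x)          # spatial size preserved: (1, 64, 32, 32)
```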
arXiv Detail & Related papers (2023-03-10T18:59:33Z)
- Detection of (Hidden) Emotions from Videos using Muscles Movements and Face Manifold Embedding [0.0]
We provide a new non-invasive, easy-to-scale method for (hidden) emotion detection from videos of human faces.
Our approach combines face manifold detection for accurate location of the face in the video with local face manifold embedding.
In the next step, we employ Digital Image Speckle Correlation (DISC) and an optical flow algorithm to compute the pattern of micro-movements in the face.
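A minimal sketch of the optical-flow half of this pipeline (DISC and the face-manifold embedding are out of scope here): dense Farneback flow between consecutive grayscale face crops, whose magnitude map is one simple stand-in for the micro-movement pattern the summary mentions.

```python
import cv2
import numpy as np

def micro_movement_map(prev_gray: np.ndarray, next_gray: np.ndarray) -> np.ndarray:
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0,
    )
    # Per-pixel displacement magnitude highlights subtle facial motion.
    return np.linalg.norm(flow, axis=2)

# Toy usage on random 8-bit frames standing in for aligned face crops.
prev = np.random.randint(0, 256, (128, 128), dtype=np.uint8)
nxt = np.random.randint(0, 256, (128, 128), dtype=np.uint8)
movement = micro_movement_map(prev, nxt)    # (128, 128) magnitude map
```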
arXiv Detail & Related papers (2022-11-01T02:48:35Z)
- Video2StyleGAN: Encoding Video in Latent Space for Manipulation [63.03250800510085]
We propose a novel network to encode face videos into the latent space of StyleGAN for semantic face video manipulation.
Our approach can significantly outperform existing single-image methods, while achieving real-time (66 fps) speed.
arXiv Detail & Related papers (2022-06-27T06:48:15Z)
- Video2StyleGAN: Disentangling Local and Global Variations in a Video [68.70889857355678]
StyleGAN has emerged as a powerful paradigm for facial editing, providing disentangled controls over age, expression, illumination, etc.
We introduce Video2StyleGAN that takes a target image and driving video(s) to reenact the local and global locations and expressions from the driving video in the identity of the target image.
arXiv Detail & Related papers (2022-05-27T14:18:19Z)
- Lagrangian Motion Magnification with Double Sparse Optical Flow Decomposition [2.1028463367241033]
We propose a novel approach for local Lagrangian motion magnification of facial micro-motions.
Our contribution is threefold: first, we fine-tune RAFT (Recurrent All-Pairs Field Transforms), a deep learning approach for optical flow (OF), on faces.
Second, since facial micro-motions are local in both space and time, we propose to approximate the OF field by components that are sparse in both space and time, leading to a double sparse decomposition.
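A hedged toy version of the double-sparse idea, not the paper's optimization: soft-threshold the optical-flow stack after removing its static temporal component, so only localized, transient displacements survive, then magnify them. The thresholds, gain, and the simple elementwise scheme are placeholder assumptions.

```python
import numpy as np

def soft_threshold(x: np.ndarray, tau: float) -> np.ndarray:
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def magnify_micro_motion(flow: np.ndarray, tau: float = 0.05,
                         gain: float = 10.0) -> np.ndarray:
    """flow: (T, H, W, 2) optical-flow stack; returns the magnified stack."""
    # Temporal sparsity: drop the static component shared by all frames.
    micro = flow - np.median(flow, axis=0, keepdims=True)
    # Spatial sparsity: keep only localized residual displacements.
    micro = soft_threshold(micro, tau)
    return flow + gain * micro  # Lagrangian magnification of the sparse part

flow = np.random.randn(16, 64, 64, 2) * 0.02   # toy low-amplitude flow stack
magnified = magnify_micro_motion(flow)
```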
arXiv Detail & Related papers (2022-04-15T20:24:11Z)
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term enables spatially coherent edits while maintaining facial integrity.
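A schematic sketch of the energy formulation this summary hints at, with hypothetical stand-ins: `G` substitutes for the StyleGAN generator, `face_id` for a pretrained identity-feature network, and the weights are placeholders. PIE's actual objective includes further StyleRig-based semantic-control terms.

```python
import torch
import torch.nn.functional as F

# Stand-ins for the real generator and identity network (flattened images).
G = torch.nn.Linear(512, 3 * 64 * 64)
face_id = torch.nn.Linear(3 * 64 * 64, 128)
for p in list(G.parameters()) + list(face_id.parameters()):
    p.requires_grad_(False)

def embedding_energy(w, target_img, lam_id=0.1):
    image = G(w)                                  # render the current latent
    recon = F.mse_loss(image, target_img)         # photometric fit
    # Identity preservation: keep deep face features close to the target's.
    id_term = 1.0 - F.cosine_similarity(face_id(image),
                                        face_id(target_img), dim=-1).mean()
    return recon + lam_id * id_term

w = torch.randn(1, 512, requires_grad=True)       # latent being optimized
target = torch.randn(1, 3 * 64 * 64)              # stand-in portrait
opt = torch.optim.Adam([w], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    embedding_energy(w, target).backward()
    opt.step()
```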
arXiv Detail & Related papers (2020-09-20T17:53:51Z)