Sketch2Human: Deep Human Generation with Disentangled Geometry and Appearance Control
- URL: http://arxiv.org/abs/2404.15889v1
- Date: Wed, 24 Apr 2024 14:24:57 GMT
- Title: Sketch2Human: Deep Human Generation with Disentangled Geometry and Appearance Control
- Authors: Linzi Qu, Jiaxiang Shang, Hui Ye, Xiaoguang Han, Hongbo Fu
- Abstract summary: This work presents Sketch2Human, the first system for controllable full-body human image generation guided by a semantic sketch.
We present a sketch encoder trained with a large synthetic dataset sampled from StyleGAN-Human's latent space.
Although our method is trained with synthetic data, it can handle hand-drawn sketches as well.
- Score: 27.23770287587972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Geometry- and appearance-controlled full-body human image generation is an interesting but challenging task. Existing solutions are either unconditional or dependent on coarse conditions (e.g., pose, text), thus lacking explicit geometry and appearance control of body and garment. Sketching offers such editing ability and has been adopted in various sketch-based face generation and editing solutions. However, directly adapting sketch-based face generation to full-body generation often fails to produce high-fidelity and diverse results due to the high complexity and diversity in the pose, body shape, and garment shape and texture. Recent geometrically controllable diffusion-based methods mainly rely on prompts to generate appearance and it is hard to balance the realism and the faithfulness of their results to the sketch when the input is coarse. This work presents Sketch2Human, the first system for controllable full-body human image generation guided by a semantic sketch (for geometry control) and a reference image (for appearance control). Our solution is based on the latent space of StyleGAN-Human with inverted geometry and appearance latent codes as input. Specifically, we present a sketch encoder trained with a large synthetic dataset sampled from StyleGAN-Human's latent space and directly supervised by sketches rather than real images. Considering the entangled information of partial geometry and texture in StyleGAN-Human and the absence of disentangled datasets, we design a novel training scheme that creates geometry-preserved and appearance-transferred training data to tune a generator to achieve disentangled geometry and appearance control. Although our method is trained with synthetic data, it can handle hand-drawn sketches as well. Qualitative and quantitative evaluations demonstrate the superior performance of our method to state-of-the-art methods.
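As an illustration of what combining a geometry code with an appearance code can look like in a StyleGAN-style latent space, the sketch below shows plain layer-wise mixing of two per-layer latent codes. This is a generic baseline, not the Sketch2Human pipeline: the abstract notes that geometry and texture are partly entangled in StyleGAN-Human, which is why the authors tune the generator. The layer count, latent width, split index, and all function names are assumptions made for this example.

```python
import random

# Assumed sizes for illustration only; not taken from the paper.
NUM_LAYERS = 18
LATENT_DIM = 512


def random_code(rng, num_layers=NUM_LAYERS, dim=LATENT_DIM):
    """Hypothetical stand-in for an inverted latent code: one vector per layer."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_layers)]


def mix_latents(w_geometry, w_appearance, split_layer):
    """Layer-wise mixing: layers [0, split_layer) come from the geometry code,
    layers [split_layer, end) come from the appearance code."""
    if len(w_geometry) != len(w_appearance):
        raise ValueError("codes must have the same number of layers")
    if not 0 <= split_layer <= len(w_geometry):
        raise ValueError("split_layer out of range")
    coarse = [list(row) for row in w_geometry[:split_layer]]
    fine = [list(row) for row in w_appearance[split_layer:]]
    return coarse + fine


rng = random.Random(0)
w_geo = random_code(rng)   # would come from a sketch encoder in practice
w_app = random_code(rng)   # would come from inverting a reference image
mixed = mix_latents(w_geo, w_app, split_layer=8)  # split index is arbitrary here
```

The mixed code would then be passed to a generator. With an untuned generator, the early layers also carry some texture information, so this split alone does not give clean geometry and appearance separation.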
Related papers
- Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation [25.293899668984018]
Facial sketch synthesis (FSS) aims to generate a vivid sketch portrait from a given facial photo.
In this paper, we propose a novel Human-Inspired Dynamic Adaptation (HIDA) method.
We show that HIDA can generate high-quality sketches in multiple styles, and significantly outperforms previous methods.
arXiv Detail & Related papers (2023-09-01T02:27:05Z) - DiffFaceSketch: High-Fidelity Face Image Synthesis with Sketch-Guided
Latent Diffusion Model [8.1818090854822]
We introduce a Sketch-Guided Latent Diffusion Model (SGLDM), an LDM-based network architecture trained on a paired sketch-face dataset.
SGLDM can synthesize high-quality face images with different expressions, facial accessories, and hairstyles from various sketches with different abstraction levels.
arXiv Detail & Related papers (2023-02-14T08:51:47Z) - DeepPortraitDrawing: Generating Human Body Images from Freehand Sketches [75.4318318890065]
We present DeepPortraitDrawing, a framework for converting roughly drawn sketches to realistic human body images.
To encode complicated body shapes under various poses, we take a local-to-global approach.
Our method produces more realistic images than the state-of-the-art sketch-to-image synthesis techniques.
arXiv Detail & Related papers (2022-05-04T14:02:45Z) - Generalizable Neural Performer: Learning Robust Radiance Fields for
Human Novel View Synthesis [52.720314035084215]
This work aims to use a general deep learning framework to synthesize free-viewpoint images of arbitrary human performers.
We present a simple yet powerful framework, named Generalizable Neural Performer (GNR), that learns a generalizable and robust neural body representation.
Experiments on GeneBody-1.0 and ZJU-Mocap show that our method is more robust than recent state-of-the-art generalizable methods.
arXiv Detail & Related papers (2022-04-25T17:14:22Z) - Detailed Avatar Recovery from Single Image [50.82102098057822]
This paper presents a novel framework to recover a detailed avatar from a single image.
We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation framework.
Our method can restore detailed human body shapes with complete textures beyond skinned models.
arXiv Detail & Related papers (2021-08-06T03:51:26Z) - SimpModeling: Sketching Implicit Field to Guide Mesh Modeling for 3D
Animalmorphic Head Design [40.821865912127635]
We propose SimpModeling, a novel sketch-based system for helping users, especially amateur users, easily model 3D animalmorphic heads.
We use advanced implicit-based shape inference methods, which can handle the domain gap between freehand sketches and the synthetic ones used for training.
We also contribute a dataset of high-quality 3D animal heads, manually created by artists.
arXiv Detail & Related papers (2021-08-05T12:17:36Z) - Neural Actor: Neural Free-view Synthesis of Human Actors with Pose
Control [80.79820002330457]
We propose a new method for high-quality synthesis of humans from arbitrary viewpoints and under arbitrary controllable poses.
Our method achieves better quality than the state of the art on playback as well as novel pose synthesis, and can even generalize well to new poses that starkly differ from the training poses.
arXiv Detail & Related papers (2021-06-03T17:40:48Z) - DeepFacePencil: Creating Face Images from Freehand Sketches [77.00929179469559]
Existing image-to-image translation methods require a large-scale dataset of paired sketches and images for supervision.
We propose DeepFacePencil, an effective tool that is able to generate photo-realistic face images from hand-drawn sketches.
arXiv Detail & Related papers (2020-08-31T03:35:21Z) - Unsupervised Shape and Pose Disentanglement for 3D Meshes [49.431680543840706]
We present a simple yet effective approach to learn disentangled shape and pose representations in an unsupervised setting.
We use a combination of self-consistency and cross-consistency constraints to learn pose and shape space from registered meshes.
We demonstrate the usefulness of learned representations through a number of tasks including pose transfer and shape retrieval.
arXiv Detail & Related papers (2020-07-22T11:00:27Z)