DiffFaceSketch: High-Fidelity Face Image Synthesis with Sketch-Guided
Latent Diffusion Model
- URL: http://arxiv.org/abs/2302.06908v1
- Date: Tue, 14 Feb 2023 08:51:47 GMT
- Title: DiffFaceSketch: High-Fidelity Face Image Synthesis with Sketch-Guided
Latent Diffusion Model
- Authors: Yichen Peng, Chunqi Zhao, Haoran Xie, Tsukasa Fukusato, and Kazunori
Miyata
- Abstract summary: We introduce a Sketch-Guided Latent Diffusion Model (SGLDM), an LDM-based network architecture trained on a paired sketch-face dataset.
SGLDM can synthesize high-quality face images with different expressions, facial accessories, and hairstyles from various sketches with different abstraction levels.
- Score: 8.1818090854822
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Synthesizing face images from monochrome sketches is one of the most
fundamental tasks in the field of image-to-image translation. However, it is
still challenging to (1)~make models learn the high-dimensional face features
such as geometry and color, and (2)~take into account the characteristics of
input sketches. Existing methods often use sketches as indirect inputs (or as
auxiliary inputs) to guide the models, resulting in the loss of sketch features
or the alteration of geometry information. In this paper, we introduce a
Sketch-Guided Latent Diffusion Model (SGLDM), an LDM-based network
architecture trained on a paired sketch-face dataset. We apply a
Multi-Auto-Encoder (AE) to encode input sketches of the different regions of a
face from pixel space to a feature map in latent space, which reduces the
dimensionality of the sketch input while preserving the geometry of local face
details. We build a sketch-face paired dataset based on an existing method
that extracts edge maps from images. We then introduce Stochastic Region
Abstraction (SRA), a data-augmentation scheme that improves the robustness of
SGLDM to sketch inputs of arbitrary abstraction levels. The evaluation study
shows that SGLDM can synthesize high-quality
face images with different expressions, facial accessories, and hairstyles from
various sketches with different abstraction levels.
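The abstract names two concrete mechanisms: region-wise sketch encoding with a Multi-Auto-Encoder, and SRA data augmentation. The PyTorch sketch below is only an illustration of those two ideas under stated assumptions; the region list, module names, fusion-by-summation rule, and drop probability are all invented for exposition and are not the authors' implementation.

```python
import torch
import torch.nn as nn

# Assumed facial regions; the paper partitions the face, but the exact
# partition used here is an assumption.
REGIONS = ["left_eye", "right_eye", "nose", "mouth", "rest"]

class RegionEncoder(nn.Module):
    """Encodes one region's sketch from pixel space to a latent feature map."""
    def __init__(self, latent_ch: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, latent_ch, 3, stride=2, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class MultiAE(nn.Module):
    """One encoder per region; region latents are fused by summation
    (an assumed fusion rule) into a single conditioning feature map."""
    def __init__(self, latent_ch: int = 4):
        super().__init__()
        self.encoders = nn.ModuleDict({r: RegionEncoder(latent_ch) for r in REGIONS})

    def forward(self, region_sketches: dict) -> torch.Tensor:
        # region_sketches: region name -> (B, 1, H, W) masked sketch.
        return torch.stack(
            [self.encoders[r](region_sketches[r]) for r in REGIONS]
        ).sum(dim=0)

def stochastic_region_abstraction(region_sketches: dict, p_drop: float = 0.3) -> dict:
    """SRA-style augmentation (assumed form): randomly blank whole regions
    during training so the model tolerates sketches of any abstraction."""
    out = {}
    for name, sk in region_sketches.items():
        keep = torch.rand(sk.shape[0], 1, 1, 1, device=sk.device) > p_drop
        out[name] = sk * keep.to(sk.dtype)
    return out
```

In an LDM, the fused feature map would typically condition the denoising U-Net, e.g. by concatenation with the noisy image latent, which is consistent with the abstract's claim of preserving local geometry at reduced input dimensionality.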
Related papers
- Sketch-guided Image Inpainting with Partial Discrete Diffusion Process [5.005162730122933]
We introduce a novel partial discrete diffusion process (PDDP) for sketch-guided inpainting.
PDDP corrupts the masked regions of the image and reconstructs these masked regions conditioned on hand-drawn sketches.
The proposed transformer module models the reverse diffusion process from two inputs: the image containing the masked region to be inpainted and the query sketch.
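As a rough illustration of what a partial forward corruption step could look like, the snippet below corrupts only the VQ tokens inside the inpainting mask, using an absorbing [MASK] state; the codebook size, reserved token id, and linear corruption schedule are assumptions, not details from the paper.

```python
import torch

MASK_TOKEN = 1024  # assumed: codebook of 1024 entries, id 1024 reserved

def partial_forward_corrupt(tokens: torch.Tensor, region_mask: torch.Tensor,
                            t: int, T: int) -> torch.Tensor:
    """tokens: (B, N) discrete VQ indices of the image.
    region_mask: (B, N) bool, True inside the region to inpaint.
    t, T: current step and total steps of the forward process."""
    corrupt_prob = t / T  # assumed linear schedule
    coin = torch.rand(tokens.shape, device=tokens.device) < corrupt_prob
    # Only tokens inside the masked region are ever corrupted; the rest of
    # the image stays intact, which is the "partial" part of the process.
    return torch.where(coin & region_mask,
                       torch.full_like(tokens, MASK_TOKEN), tokens)
```

The reverse process would then denoise only those positions, conditioned on the sketch, so the unmasked content is preserved by construction.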
arXiv Detail & Related papers (2024-04-18T07:07:38Z)
- SENS: Part-Aware Sketch-based Implicit Neural Shape Modeling [124.3266213819203]
We present SENS, a novel method for generating and editing 3D models from hand-drawn sketches.
SENS analyzes the sketch and encodes its parts into ViT patch encodings.
SENS supports refinement via part reconstruction, allowing for nuanced adjustments and artifact removal.
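As a hedged sketch of the "encodes its parts into ViT patch encodings" step, a transformer over patch tokens of the sketch image could look like the following; the patch size, depth, and dimensions are illustrative assumptions rather than SENS's actual configuration.

```python
import torch
import torch.nn as nn

class SketchPatchEncoder(nn.Module):
    """Turns a monochrome sketch into per-patch tokens that a downstream
    decoder could map to part-level implicit shape codes."""
    def __init__(self, patch: int = 16, dim: int = 256, img: int = 224):
        super().__init__()
        self.patchify = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, (img // patch) ** 2, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, sketch: torch.Tensor) -> torch.Tensor:
        # sketch: (B, 1, 224, 224) -> (B, N, dim) patch tokens.
        x = self.patchify(sketch).flatten(2).transpose(1, 2) + self.pos
        return self.encoder(x)
```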
arXiv Detail & Related papers (2023-06-09T17:50:53Z)
- DiffSketching: Sketch Control Image Synthesis with Diffusion Models [10.172753521953386]
Deep learning models for sketch-to-image synthesis must cope with distorted input sketches that lack visual detail.
Our model matches sketches to images through cross-domain constraints, and uses a classifier to guide the image synthesis more accurately.
Our model outperforms GAN-based methods in both generation quality and human evaluation, and does not rely on massive sketch-image datasets.
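The classifier-guidance step mentioned above has a standard form (Dhariwal and Nichol, 2021): shift the denoiser's predicted Gaussian mean along the gradient of log p(y | x_t). Whether DiffSketching uses exactly this formulation is an assumption; the sketch below shows the generic mechanism.

```python
import torch

def classifier_guided_mean(mean, variance, x_t, y, classifier, scale=1.0):
    """mean, variance: the reverse-step Gaussian parameters at x_t.
    y: target labels; classifier: a noise-aware classifier (assumed given).
    Returns the guided mean that pulls samples toward label y."""
    x_in = x_t.detach().requires_grad_(True)
    log_probs = torch.log_softmax(classifier(x_in), dim=-1)
    selected = log_probs[torch.arange(len(y)), y].sum()
    grad = torch.autograd.grad(selected, x_in)[0]
    # Larger `scale` trades sample diversity for fidelity to the target.
    return mean + scale * variance * grad
```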
arXiv Detail & Related papers (2023-05-30T07:59:23Z)
- Learning Geometry-aware Representations by Sketching [20.957964436294873]
We propose learning to represent a scene by sketching, inspired by human behavior.
Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strokes that explicitly incorporate the geometric information of the scene.
arXiv Detail & Related papers (2023-04-17T12:23:32Z)
- Sketch-Guided Text-to-Image Diffusion Models [57.12095262189362]
We introduce a universal approach to guide a pretrained text-to-image diffusion model.
Our method does not require training a dedicated model or a specialized encoder for the task.
We focus in particular on the sketch-to-image translation task, revealing a robust and expressive way to generate images.
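One way such training-free guidance is commonly realized (in the spirit of this paper, with all names and the exact loss as assumptions) is to train a small edge predictor on intermediate denoiser features, then nudge each sampling step toward latents whose predicted edges match the target sketch:

```python
import torch
import torch.nn.functional as F

def sketch_guidance_step(z_t, target_edges, edge_predictor,
                         unet_features_fn, step_size=0.1):
    """z_t: current latent; target_edges: the user sketch as an edge map.
    unet_features_fn: returns intermediate U-Net activations for a latent
    (assumed helper); edge_predictor: small net mapping features -> edges."""
    z = z_t.detach().requires_grad_(True)
    feats = unet_features_fn(z)          # frozen denoiser's activations
    pred_edges = edge_predictor(feats)   # per-pixel edge prediction
    loss = F.mse_loss(pred_edges, target_edges)
    grad = torch.autograd.grad(loss, z)[0]
    # The pretrained text-to-image model is never fine-tuned; only the
    # latent trajectory is steered toward the sketch.
    return z_t - step_size * grad
```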
arXiv Detail & Related papers (2022-11-24T18:45:32Z)
- Facial Geometric Detail Recovery via Implicit Representation [147.07961322377685]
We present a robust texture-guided geometric detail recovery approach using only a single in-the-wild facial image.
Our method combines high-quality texture completion with the powerful expressiveness of implicit surfaces.
Our method not only recovers accurate facial details but also decomposes normals, albedos, and shading parts in a self-supervised way.
arXiv Detail & Related papers (2022-03-18T01:42:59Z)
- Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation [75.59415852802958]
Shape-My-Face (SMF) is a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model.
Our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets.
arXiv Detail & Related papers (2020-12-16T20:02:36Z)
- DeepFacePencil: Creating Face Images from Freehand Sketches [77.00929179469559]
Existing image-to-image translation methods require a large-scale dataset of paired sketches and images for supervision.
We propose DeepFacePencil, an effective tool that is able to generate photo-realistic face images from hand-drawn sketches.
arXiv Detail & Related papers (2020-08-31T03:35:21Z)
- Deep Generation of Face Images from Sketches [36.146494762987146]
Deep image-to-image translation techniques allow fast generation of face images from freehand sketches.
Existing solutions tend to overfit to sketches, thus requiring professional sketches or even edge maps as input.
We propose to implicitly model the shape space of plausible face images and synthesize a face image in this space to approximate an input sketch.
Our method essentially uses input sketches as soft constraints and is thus able to produce high-quality face images even from rough and/or incomplete sketches.
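A minimal sketch of that soft-constraint idea, assuming (purely for illustration) a nearest-neighbor projection onto a bank of embeddings of edge maps from real faces:

```python
import torch

def project_to_shape_space(sketch_code: torch.Tensor,
                           face_code_bank: torch.Tensor,
                           k: int = 5, blend: float = 0.7) -> torch.Tensor:
    """sketch_code: (D,) embedding of the user sketch (assumed encoder).
    face_code_bank: (N, D) embeddings of real-face edge maps."""
    dists = torch.cdist(sketch_code[None], face_code_bank)[0]      # (N,)
    nn_codes = face_code_bank[dists.topk(k, largest=False).indices]
    projected = nn_codes.mean(dim=0)
    # Blending keeps some fidelity to the rough input while pulling it
    # onto the manifold of plausible faces -- the "soft constraint".
    return blend * projected + (1 - blend) * sketch_code
```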
arXiv Detail & Related papers (2020-06-01T16:20:23Z)
- SketchDesc: Learning Local Sketch Descriptors for Multi-view Correspondence [68.63311821718416]
We study the problem of multi-view sketch correspondence, where we take as input multiple freehand sketches with different views of the same object.
This problem is challenging since the visual features of corresponding points at different views can be very different.
We take a deep learning approach and learn a novel local sketch descriptor from data.
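One standard recipe for learning such a descriptor from data, shown as an assumption about the general approach rather than SketchDesc's exact architecture, is a small patch CNN trained with a triplet loss so that corresponding points across views embed close together:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchDescriptor(nn.Module):
    """Maps a local sketch patch to a unit-norm descriptor vector."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(patch), dim=-1)

# Training signal: anchor/positive are patches around corresponding points
# in two views; negative is a patch around a non-corresponding point.
triplet_loss = nn.TripletMarginLoss(margin=0.2)
```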
arXiv Detail & Related papers (2020-01-16T11:31:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.