Self-Supervised Sketch-to-Image Synthesis
- URL: http://arxiv.org/abs/2012.09290v2
- Date: Tue, 22 Dec 2020 20:40:27 GMT
- Title: Self-Supervised Sketch-to-Image Synthesis
- Authors: Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal
- Abstract summary: We study the exemplar-based sketch-to-image (s2i) synthesis task in a self-supervised learning manner.
We first propose an unsupervised method to efficiently synthesize line-sketches for general RGB-only datasets.
We then present a self-supervised Auto-Encoder (AE) to decouple the content/style features from sketches and RGB images, and synthesize images that are both content-faithful to the sketches and style-consistent with the RGB images.
- Score: 21.40315235087551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Imagining a colored realistic image from an arbitrarily drawn sketch is one of the human capabilities that we are eager for machines to mimic. Unlike previous methods that either require sketch-image pairs or utilize low-quality detected edges as sketches, we study the exemplar-based sketch-to-image (s2i) synthesis task in a self-supervised learning manner, eliminating the need for paired sketch data. To this end, we first propose an unsupervised method to efficiently synthesize line-sketches for general RGB-only datasets. With the synthetic paired data, we then present a self-supervised Auto-Encoder (AE) to decouple the content/style features from sketches and RGB images, and to synthesize images that are both content-faithful to the sketches and style-consistent with the RGB images. While prior works employ either a cycle-consistency loss or dedicated attentional modules to enforce content/style fidelity, we show the AE's superior performance with pure self-supervision. To further improve the synthesis quality at high resolution, we also leverage an adversarial network to refine the details of the synthetic images. Extensive experiments at 1024x1024 resolution demonstrate new state-of-the-art performance of the proposed model on the CelebA-HQ and Wiki-Art datasets. Moreover, with the proposed sketch generator, the model shows promising performance on style mixing and style transfer, which require synthesized images to be both style-consistent and semantically meaningful. Our code is available at https://github.com/odegeasslbc/Self-Supervised-Sketch-to-Image-Synthesis-PyTorch, and an online demo of our model is available at https://create.playform.io/my-projects?mode=sketch.
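As a rough illustration of the self-supervised training loop described above, a minimal PyTorch sketch follows. The module layouts (ContentEncoder, StyleEncoder, Decoder), the AdaIN-like style modulation, and the plain L1 reconstruction loss are hypothetical stand-ins, not the authors' actual architecture; it only assumes that a synthetic sketch has already been generated from each RGB image.

```python
# Minimal sketch of a self-supervised content/style auto-encoder.
# Assumes each RGB image already has a synthetic sketch made from it,
# so (sketch, image) pairs come for free. Shapes and the L1 objective
# are illustrative guesses, not the authors' exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContentEncoder(nn.Module):
    """Maps a 1-channel sketch to a spatial content feature map."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.ReLU(inplace=True),
        )
    def forward(self, sketch):
        return self.net(sketch)

class StyleEncoder(nn.Module):
    """Maps an RGB exemplar to a global style vector."""
    def __init__(self, ch=64, style_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(ch * 2, style_dim)
    def forward(self, rgb):
        return self.fc(self.net(rgb).flatten(1))

class Decoder(nn.Module):
    """Renders RGB from content features modulated by the style vector."""
    def __init__(self, ch=64, style_dim=128):
        super().__init__()
        self.mod = nn.Linear(style_dim, ch * 2)  # AdaIN-like channel scaling
        self.net = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(ch * 2, ch, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2), nn.Conv2d(ch, 3, 3, 1, 1), nn.Tanh(),
        )
    def forward(self, content, style):
        scale = self.mod(style).unsqueeze(-1).unsqueeze(-1)
        return self.net(content * scale)

def train_step(enc_c, enc_s, dec, rgb, sketch, opt):
    # Self-supervision: `sketch` was synthesized from `rgb` itself, so
    # reconstructing `rgb` from (sketch content, rgb style) needs no
    # human-paired data.
    recon = dec(enc_c(sketch), enc_s(rgb))
    loss = F.l1_loss(recon, rgb)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

At inference time the style input can be any exemplar image, which is what enables the style-mixing and style-transfer behavior noted in the abstract; the adversarial refinement network would operate on the reconstruction as a separate stage.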
Related papers
- Learning Vision from Models Rivals Learning Vision from Data [54.43596959598465]
We introduce SynCLR, a novel approach for learning visual representations exclusively from synthetic images and synthetic captions.
We synthesize a large dataset of image captions using LLMs, then use an off-the-shelf text-to-image model to generate multiple images corresponding to each synthetic caption.
We perform visual representation learning on these synthetic images via contrastive learning, treating images sharing the same caption as positive pairs.
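The "images sharing the same caption are positives" idea reduces to a multi-positive contrastive loss; a minimal sketch under that reading follows, where `feats` and `caption_ids` are assumed inputs rather than SynCLR's actual interface.

```python
# Minimal sketch of a multi-positive contrastive loss where images that
# share a synthetic caption are treated as positives. `feats` (image
# embeddings) and `caption_ids` are assumed inputs, not SynCLR's API.
import torch
import torch.nn.functional as F

def multi_positive_contrastive(feats, caption_ids, temperature=0.1):
    feats = F.normalize(feats, dim=1)
    sim = feats @ feats.t() / temperature            # pairwise similarities
    n = feats.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=feats.device)
    sim = sim.masked_fill(self_mask, -1e9)           # drop self-similarity
    pos = caption_ids.unsqueeze(0) == caption_ids.unsqueeze(1)
    pos = (pos & ~self_mask).float()                 # positives: same caption
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average log-likelihood over each sample's positives
    loss = -(log_prob * pos).sum(dim=1) / pos.sum(dim=1).clamp(min=1)
    return loss.mean()
```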
arXiv Detail & Related papers (2023-12-28T18:59:55Z)
- DiffSketching: Sketch Control Image Synthesis with Diffusion Models [10.172753521953386]
Deep learning models for sketch-to-image synthesis must overcome distorted input sketches that lack visual details.
Our model matches sketches through cross-domain constraints and uses a classifier to guide the image synthesis more accurately.
Our model beats GAN-based methods in terms of generation quality and human evaluation, and does not rely on massive sketch-image datasets.
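Classifier guidance of a diffusion sampler is a standard construction (shift the predicted posterior mean along the classifier's gradient); the sketch below illustrates it, with the `classifier(x, t)` signature and the `guided_mean` helper being assumptions rather than DiffSketching's actual code.

```python
# Minimal sketch of classifier guidance in a diffusion sampler: nudge
# each denoising step's predicted mean along the gradient of a
# classifier's log-probability for the target class.
import torch

def classifier_grad(classifier, x_t, t, target_class):
    with torch.enable_grad():
        x_in = x_t.detach().requires_grad_(True)
        log_probs = torch.log_softmax(classifier(x_in, t), dim=-1)
        selected = log_probs[torch.arange(x_in.size(0)), target_class].sum()
        return torch.autograd.grad(selected, x_in)[0]

def guided_mean(classifier, x_t, t, target_class, mean, variance, scale=1.0):
    # Standard classifier-guidance update: mean + s * Sigma * grad.
    grad = classifier_grad(classifier, x_t, t, target_class)
    return mean + scale * variance * grad
```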
arXiv Detail & Related papers (2023-05-30T07:59:23Z)
- Text-Guided Scene Sketch-to-Photo Synthesis [5.431298869139175]
We propose a method for scene-level sketch-to-photo synthesis with text guidance.
To train our model, we use self-supervised learning from a set of photographs.
Experiments show that the proposed method translates original sketch images, i.e., sketches not extracted from color images, into photos with compelling visual quality.
arXiv Detail & Related papers (2023-02-14T08:13:36Z)
- A Shared Representation for Photorealistic Driving Simulators [83.5985178314263]
We propose to improve the quality of generated images by rethinking the discriminator architecture.
The focus is on the class of problems where images are generated given semantic inputs, such as scene segmentation maps or human body poses.
We aim to learn a shared latent representation that encodes enough information to jointly perform semantic segmentation, content reconstruction, and coarse-to-fine-grained adversarial reasoning.
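A minimal reading of this design is a single backbone with three heads; the sketch below assumes that structure (layer sizes and head shapes are illustrative only, not the paper's architecture).

```python
# Minimal sketch of a discriminator with a shared latent representation:
# one backbone feeding semantic segmentation, content reconstruction,
# and patch-wise real/fake heads.
import torch.nn as nn

class SharedRepDiscriminator(nn.Module):
    def __init__(self, num_classes, ch=64):
        super().__init__()
        self.backbone = nn.Sequential(            # shared latent encoder
            nn.Conv2d(3, ch, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
        )
        self.seg_head = nn.Conv2d(ch * 2, num_classes, 1)  # semantic segmentation
        self.rec_head = nn.Sequential(                     # coarse content reconstruction
            nn.Upsample(scale_factor=4), nn.Conv2d(ch * 2, 3, 3, 1, 1), nn.Tanh(),
        )
        self.adv_head = nn.Conv2d(ch * 2, 1, 1)            # patch-wise real/fake logits

    def forward(self, img):
        h = self.backbone(img)
        return self.adv_head(h), self.seg_head(h), self.rec_head(h)
```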
arXiv Detail & Related papers (2021-12-09T18:59:21Z)
- SSH: A Self-Supervised Framework for Image Harmonization [97.16345684998788]
We propose a novel Self-Supervised Harmonization framework (SSH) that can be trained using just "free" natural images, with no manual editing required.
Our results show that the proposed SSH outperforms previous state-of-the-art methods in terms of reference metrics, visual quality, and a subjective user study.
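One plausible way to get such "free" supervision, shown below as an assumption rather than SSH's documented pipeline, is to perturb a masked region of a natural image and train the network to undo the perturbation.

```python
# Hypothetical construction of a harmonization training pair from a
# single unedited natural image: perturb a region's appearance and use
# the original image as the harmonization target.
import torch

def make_synthetic_composite(img, mask, color_jitter):
    # img: (3, H, W); mask: (1, H, W) binary region to perturb;
    # color_jitter: any appearance-shifting callable on tensor images.
    fg = color_jitter(img)                    # appearance-shifted foreground
    composite = mask * fg + (1 - mask) * img  # paste it back into the image
    return composite, img                     # (network input, target)
```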
arXiv Detail & Related papers (2021-08-15T19:51:33Z)
- Sketch-Guided Scenery Image Outpainting [83.6612152173028]
We propose an encoder-decoder based network to conduct sketch-guided outpainting.
First, we apply a holistic alignment module to make the synthesized part similar to the real one from a global view.
Second, we reversely produce sketches from the synthesized part and encourage them to be consistent with the ground-truth ones.
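The reverse sketch-consistency idea can be written as a small loss term; in the sketch below, `sketch_extractor` is a hypothetical frozen image-to-sketch module, not the paper's exact component.

```python
# Minimal sketch of reverse sketch consistency: re-derive a sketch from
# the synthesized region and penalize its distance to the ground truth.
import torch.nn.functional as F

def sketch_consistency_loss(sketch_extractor, synthesized_region, gt_sketch):
    pred_sketch = sketch_extractor(synthesized_region)  # image -> sketch
    return F.l1_loss(pred_sketch, gt_sketch)
```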
arXiv Detail & Related papers (2020-06-17T11:34:36Z)
- Quality Guided Sketch-to-Photo Image Synthesis [12.617078020344618]
We propose a generative adversarial network that synthesizes multiple images from a single sketch, each with unique attributes like hair color, sex, etc.
Our approach aims to improve the visual appeal of the synthesized images while incorporating multiple-attribute assignment into the generator without compromising the identity of the synthesized image.
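In code, that amounts to conditioning one generator call per attribute vector; the sketch below is a hypothetical interface (`G`, `attribute_vectors`), not the paper's API.

```python
# Hypothetical attribute-conditioned sampling: one sketch plus K
# different attribute vectors (hair color, sex, ...) yields K outputs.
import torch

def synthesize_variants(G, sketch, attribute_vectors):
    # sketch: (1, 1, H, W); attribute_vectors: (K, attr_dim)
    sketches = sketch.expand(attribute_vectors.size(0), -1, -1, -1)
    with torch.no_grad():
        return G(sketches, attribute_vectors)  # (K, 3, H, W)
```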
arXiv Detail & Related papers (2020-04-20T16:00:01Z)
- SketchyCOCO: Image Generation from Freehand Scene Sketches [71.85577739612579]
We introduce the first method for automatic image generation from scene-level freehand sketches.
The key contribution is an attribute vector bridged Generative Adversarial Network called EdgeGAN.
We have built a large-scale composite dataset called SketchyCOCO to support and evaluate the solution.
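A minimal sketch of the attribute-vector bridge, with all module names as hypothetical placeholders: an edge encoder and an image encoder map into one shared latent space, so a vector inferred from a freehand sketch can drive the image generator.

```python
# Minimal sketch of an attribute-vector bridge between the sketch and
# image domains; modules are placeholders, not EdgeGAN's exact design.
import torch.nn as nn

class AttributeBridge(nn.Module):
    def __init__(self, edge_encoder, image_encoder, generator):
        super().__init__()
        self.edge_encoder = edge_encoder    # sketch -> attribute vector
        self.image_encoder = image_encoder  # image  -> attribute vector (training only)
        self.generator = generator          # attribute vector -> image

    def forward(self, sketch):
        z = self.edge_encoder(sketch)       # shared latent "attribute vector"
        return self.generator(z)
```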
arXiv Detail & Related papers (2020-03-05T14:54:10Z)
- Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches [133.01690754567252]
Sketch-based image editing aims to synthesize and modify photos based on the structural information provided by the human-drawn sketches.
Deep Plastic Surgery is a novel, robust and controllable image editing framework that allows users to interactively edit images using hand-drawn sketch inputs.
arXiv Detail & Related papers (2020-01-09T08:57:50Z)