Adaptively-Realistic Image Generation from Stroke and Sketch with Diffusion Model
- URL: http://arxiv.org/abs/2208.12675v1
- Date: Fri, 26 Aug 2022 13:59:26 GMT
- Title: Adaptively-Realistic Image Generation from Stroke and Sketch with Diffusion Model
- Authors: Shin-I Cheng, Yu-Jie Chen, Wei-Chen Chiu, Hsin-Ying Lee, Hung-Yu Tseng
- Abstract summary: We propose a unified framework supporting a three-dimensional control over the image synthesis from sketches and strokes based on diffusion models.
Our framework achieves state-of-the-art performance while providing flexibility in generating customized images with control over shape, color, and realism.
Our method unleashes applications such as editing on real images, generation with partial sketches and strokes, and multi-domain multi-modal synthesis.
- Score: 31.652827838300915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating images from hand-drawings is a crucial and fundamental task in
content creation. The translation is difficult as there exist infinite
possibilities and different users usually expect different outcomes.
Therefore, we propose a unified framework supporting a three-dimensional
control over the image synthesis from sketches and strokes based on diffusion
models. Users can not only decide the level of faithfulness to the input
strokes and sketches, but also the degree of realism, as the user inputs are
usually not consistent with the real images. Qualitative and quantitative
experiments demonstrate that our framework achieves state-of-the-art
performance while providing flexibility in generating customized images with
control over shape, color, and realism. Moreover, our method unleashes
applications such as editing on real images, generation with partial sketches
and strokes, and multi-domain multi-modal synthesis.
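The abstract does not spell out the sampling mechanism, but the faithfulness-versus-realism trade-off it describes is commonly realized SDEdit-style: the stroke/sketch-guided image is diffused part-way into the noise process and then denoised back, with the starting timestep acting as the realism knob. Below is a minimal sketch under that assumption; `model` and `alphas_cumprod` stand in for any pretrained noise-prediction network and its noise schedule, and are illustrative, not the paper's code.

```python
import torch

def noise_then_denoise(x_guide, model, alphas_cumprod, realism=0.5):
    """Hypothetical SDEdit-style control, not the paper's exact sampler.
    realism in [0, 1]: higher means starting from a noisier state, so the
    model has more freedom to deviate from the guide toward realism."""
    T = len(alphas_cumprod)
    t0 = max(int(realism * (T - 1)), 1)            # starting timestep
    noise = torch.randn_like(x_guide)
    a0 = alphas_cumprod[t0]
    x_t = a0.sqrt() * x_guide + (1 - a0).sqrt() * noise  # forward diffusion

    # Deterministic reverse steps (DDIM with eta = 0) from t0 down to 0.
    for t in range(t0, 0, -1):
        a_t, a_prev = alphas_cumprod[t], alphas_cumprod[t - 1]
        eps = model(x_t, torch.tensor([t]))        # predicted noise
        x0_pred = (x_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        x_t = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * eps
    return x_t
```

With `realism` near 0 the output stays close to the input strokes and sketches; near 1 it approaches an unconditional sample.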
Related papers
- VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model [76.02314305164595]
This work presents a novel image outpainting framework that is capable of customizing the results according to the requirement of users.
We take advantage of a Multimodal Large Language Model (MLLM) that automatically extracts and organizes the corresponding textual descriptions of the masked and unmasked parts of a given image.
In addition, a special Cross-Attention module, namely Center-Total-Surrounding (CTS), is elaborately designed to further enhance the interaction between specific spatial regions of the image and the corresponding parts of the text prompts.
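The abstract does not describe the CTS internals; the following shows only the generic cross-attention building block such modules are built from, with image tokens attending to text-token embeddings. The dimensions and residual wiring are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TextImageCrossAttention(nn.Module):
    """Generic cross-attention: image tokens (queries) attend to text
    tokens (keys/values). A CTS-style design would apply attention like
    this per region (center / total / surrounding); this is a sketch."""
    def __init__(self, dim=320, text_dim=768, heads=8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, kdim=text_dim,
                                          vdim=text_dim, batch_first=True)

    def forward(self, img_tokens, text_tokens):
        out, _ = self.attn(self.norm(img_tokens), text_tokens, text_tokens)
        return img_tokens + out                    # residual connection
```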
arXiv Detail & Related papers (2024-06-03T07:14:19Z)
- VisioBlend: Sketch and Stroke-Guided Denoising Diffusion Probabilistic Model for Realistic Image Generation [0.0]
We propose a unified framework supporting three-dimensional control over image synthesis from sketches and strokes based on diffusion models.
It enables users to decide the level of faithfulness to the input strokes and sketches.
It solves the problem of data availability by synthesizing new data points from hand-drawn sketches and strokes.
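The abstract does not give the data pipeline; one common recipe for obtaining (sketch, stroke, photo) training triplets without human drawing is to simulate the inputs from real photos, e.g. edge maps as sketch proxies and heavily blurred color fields as stroke proxies. A rough sketch of that recipe, not necessarily VisioBlend's:

```python
import cv2

def make_training_triplet(image_path):
    """Simulate user inputs from a real photo: Canny edges stand in for a
    hand-drawn sketch, a strong median blur for coarse color strokes.
    Thresholds and kernel size are arbitrary illustrative choices."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    sketch = cv2.Canny(gray, 100, 200)     # binary edge map (sketch proxy)
    stroke = cv2.medianBlur(img, 31)       # flat color regions (stroke proxy)
    return sketch, stroke, img
```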
arXiv Detail & Related papers (2024-05-15T11:27:27Z)
- DiffSketching: Sketch Control Image Synthesis with Diffusion Models [10.172753521953386]
Deep learning models for sketch-to-image synthesis must cope with distorted input sketches that lack visual details.
Our model matches sketches through cross-domain constraints, and uses a classifier to guide the image synthesis more accurately.
Our model can beat GAN-based methods in terms of generation quality and human evaluation, and does not rely on massive sketch-image datasets.
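Classifier guidance here presumably follows the standard formulation (Dhariwal & Nichol, 2021): shift the predicted noise along the gradient of the classifier's log-probability on the noisy sample. A minimal sketch of that mechanism, not DiffSketching's actual code:

```python
import torch

def classifier_guided_eps(eps, x_t, t, classifier, y,
                          sqrt_one_minus_abar, scale=1.0):
    """Standard classifier guidance step:
    eps' = eps - sqrt(1 - a_bar_t) * scale * grad_x log p(y | x_t).
    `classifier` is any noise-aware classifier taking (x_t, t); all
    names here are illustrative."""
    x = x_t.detach().requires_grad_(True)
    log_probs = torch.log_softmax(classifier(x, t), dim=-1)
    selected = log_probs[range(len(x)), y]         # log p(y | x_t)
    grad = torch.autograd.grad(selected.sum(), x)[0]
    return eps - sqrt_one_minus_abar * scale * grad
```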
arXiv Detail & Related papers (2023-05-30T07:59:23Z)
- Reference-based Image Composition with Sketch via Structure-aware Diffusion Model [38.1193912666578]
We introduce a multi-input-conditioned image composition model that incorporates a sketch as a novel modality, alongside a reference image.
Thanks to the edge-level controllability using sketches, our method enables a user to edit or complete an image sub-part.
Our framework fine-tunes a pre-trained diffusion model to complete missing regions using the reference image while maintaining sketch guidance.
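A common way to realize this kind of fine-tuning is conditioning by channel concatenation: the denoiser sees the noisy image together with the unmasked region, the mask, and the sketch, with its input convolution widened accordingly. A minimal sketch of one training step under that assumption (not the paper's released code):

```python
import torch
import torch.nn.functional as F

def training_step(unet, x0, sketch, mask, alphas_cumprod):
    """One hypothetical fine-tuning step: predict the noise on the full
    image, conditioned on the known region plus the guiding sketch.
    mask is 1 on the missing region; the channel layout is illustrative."""
    b = x0.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,))
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise
    cond = torch.cat([x_t, x0 * (1 - mask), mask, sketch], dim=1)
    return F.mse_loss(unet(cond, t), noise)
```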
arXiv Detail & Related papers (2023-03-31T06:12:58Z)
- Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
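The core operation behind flow-based try-on, which DAFlow's multi-flow estimation builds on, is differentiable warping: sampling garment features at locations given by a predicted per-pixel flow field. A minimal sketch of that warping step (the deformable-attention aggregation itself is omitted):

```python
import torch
import torch.nn.functional as F

def warp_with_flow(garment_feat, flow):
    """Warp source (garment) features by per-pixel offsets. `flow` has
    shape (N, 2, H, W) with offsets in normalized [-1, 1] coordinates;
    this shape convention is an illustrative assumption."""
    n, _, h, w = garment_feat.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    grid = base + flow.permute(0, 2, 3, 1)         # (N, H, W, 2)
    return F.grid_sample(garment_feat, grid, align_corners=True)
```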
arXiv Detail & Related papers (2022-07-19T10:01:31Z)
- Effect of Instance Normalization on Fine-Grained Control for Sketch-Based Face Image Generation [17.31312721810532]
We investigate the effect of instance normalization on generating photorealistic face images from hand-drawn sketches.
Based on the visual analysis, we modify the instance normalization layers in the baseline image translation model.
We construct a new set of hand-drawn sketches with 11 categories of specially designed changes and conduct extensive experimental analysis.
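For context, instance normalization standardizes each sample and channel independently over its spatial dimensions, which is why swapping it in or out of an image-translation generator changes how local sketch edits propagate. A minimal illustration (not the paper's modified model):

```python
import torch
import torch.nn as nn

x = torch.randn(4, 64, 128, 128)       # (batch, channels, H, W)
inorm = nn.InstanceNorm2d(64, affine=True)
y = inorm(x)

# Each (sample, channel) slice now has spatial mean ~0 and std ~1,
# before the learned affine parameters (initialized to identity) act.
print(y.mean(dim=(2, 3)).abs().max(), y.std(dim=(2, 3)).mean())
```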
arXiv Detail & Related papers (2022-07-17T04:05:17Z)
- Generating Person Images with Appearance-aware Pose Stylizer [66.44220388377596]
We present a novel end-to-end framework to generate realistic person images based on given person poses and appearances.
The core of our framework is a novel generator called Appearance-aware Pose Stylizer (APS) which generates human images by coupling the target pose with the conditioned person appearance progressively.
arXiv Detail & Related papers (2020-07-17T15:58:05Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
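The decomposition side rests on the standard intrinsic-image model, image = albedo * shading (element-wise), so the decomposition branch can be self-supervised by re-multiplying its outputs. A minimal sketch of that objective; the module names are assumptions, not the paper's architecture:

```python
import torch.nn.functional as F

def decomposition_loss(encoder, dec_albedo, dec_shading, real_img):
    """Decompose a real image into albedo and shading and require their
    product to reproduce the input (image = albedo * shading)."""
    z = encoder(real_img)
    albedo, shading = dec_albedo(z), dec_shading(z)
    return F.l1_loss(albedo * shading, real_img)
```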
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
- Sketch-Guided Scenery Image Outpainting [83.6612152173028]
We propose an encoder-decoder based network to conduct sketch-guided outpainting.
First, we apply a holistic alignment module to make the synthesized part similar to the real one from a global view.
Second, we reversely produce sketches from the synthesized part and encourage them to be consistent with the ground-truth ones.
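The reverse-consistency idea lends itself to a simple loss: re-extract a sketch from the synthesized region and penalize its deviation from the guiding ground-truth sketch. A minimal sketch, where `edge_extractor` stands in for any differentiable edge/sketch network (an assumption here):

```python
import torch.nn.functional as F

def sketch_consistency_loss(edge_extractor, synthesized, gt_sketch, mask):
    """Penalize mismatch between sketches re-extracted from the
    outpainted region and the ground-truth guiding sketches.
    mask is 1 on the synthesized (outpainted) region."""
    pred_sketch = edge_extractor(synthesized)
    return F.l1_loss(pred_sketch * mask, gt_sketch * mask)
```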
arXiv Detail & Related papers (2020-06-17T11:34:36Z)
- Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches [133.01690754567252]
Sketch-based image editing aims to synthesize and modify photos based on the structural information provided by the human-drawn sketches.
Deep Plastic Surgery is a novel, robust and controllable image editing framework that allows users to interactively edit images using hand-drawn sketch inputs.
arXiv Detail & Related papers (2020-01-09T08:57:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.