Related papers: Block and Detail: Scaffolding Sketch-to-Image Generation

Block and Detail: Scaffolding Sketch-to-Image Generation

URL: http://arxiv.org/abs/2402.18116v2
Date: Fri, 25 Oct 2024 17:35:05 GMT
Title: Block and Detail: Scaffolding Sketch-to-Image Generation
Authors: Vishnu Sarukkai, Lu Yuan, Mia Tang, Maneesh Agrawala, Kayvon Fatahalian,
Abstract summary: We introduce a novel sketch-to-image tool that aligns with the iterative refinement process of artists. Our tool lets users sketch blocking strokes to coarsely represent the placement and form of objects and detail strokes to refine their shape and silhouettes. We develop a two-pass algorithm for generating high-fidelity images from such sketches at any point in the iterative process.
Score: 65.56590359051634
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce a novel sketch-to-image tool that aligns with the iterative refinement process of artists. Our tool lets users sketch blocking strokes to coarsely represent the placement and form of objects and detail strokes to refine their shape and silhouettes. We develop a two-pass algorithm for generating high-fidelity images from such sketches at any point in the iterative process. In the first pass we use a ControlNet to generate an image that strictly follows all the strokes (blocking and detail) and in the second pass we add variation by renoising regions surrounding blocking strokes. We also present a dataset generation scheme that, when used to train a ControlNet architecture, allows regions that do not contain strokes to be interpreted as not-yet-specified regions rather than empty space. We show that this partial-sketch-aware ControlNet can generate coherent elements from partial sketches that only contain a small number of strokes. The high-fidelity images produced by our approach serve as scaffolds that can help the user adjust the shape and proportions of objects or add additional elements to the composition. We demonstrate the effectiveness of our approach with a variety of examples and evaluative comparisons. Quantitatively, evaluative user feedback indicates that novice viewers prefer the quality of images from our algorithm over a baseline Scribble ControlNet for 84% of the pairs and found our images had less distortion in 81% of the pairs.

Related papers

Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches [11.361057754047344]
We introduce an innovative sketch-guided interactive segmentation framework. We demonstrate that sketch input can significantly improve performance in existing iterative segmentation methods. We also present KOSCamo+, the first freehand sketch dataset for camouflaged object detection.
arXiv Detail & Related papers (2025-01-31T17:26:40Z)
SketchYourSeg: Mask-Free Subjective Image Segmentation via Freehand Sketches [116.1810651297801]
SketchYourSeg establishes freehand sketches as a powerful query modality for subjective image segmentation. Our evaluations demonstrate superior performance over existing approaches across diverse benchmarks.
arXiv Detail & Related papers (2025-01-27T13:07:51Z)
Sketch-guided Image Inpainting with Partial Discrete Diffusion Process [5.005162730122933]
We introduce a novel partial discrete diffusion process (PDDP) for sketch-guided inpainting. PDDP corrupts the masked regions of the image and reconstructs these masked regions conditioned on hand-drawn sketches. The proposed novel transformer module accepts two inputs -- the image containing the masked region to be inpainted and the query sketch to model the reverse diffusion process.
arXiv Detail & Related papers (2024-04-18T07:07:38Z)
CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing [21.12815542848095]
Personalization techniques for large text-to-image (T2I) models allow users to incorporate new concepts from reference images. Existing methods primarily rely on textual descriptions, leading to limited control over customized images. We identify sketches as an intuitive and versatile representation that can facilitate such control.
arXiv Detail & Related papers (2024-02-27T15:52:59Z)
SceneComposer: Any-Level Semantic Image Synthesis [80.55876413285587]
We propose a new framework for conditional image synthesis from semantic layouts of any precision levels. The framework naturally reduces to text-to-image (T2I) at the lowest level with no shape information, and it becomes segmentation-to-image (S2I) at the highest level. We introduce several novel techniques to address the challenges coming with this new setup.
arXiv Detail & Related papers (2022-11-21T18:59:05Z)
Learning to Annotate Part Segmentation with Gradient Matching [58.100715754135685]
This paper focuses on tackling semi-supervised part segmentation tasks by generating high-quality images with a pre-trained GAN. In particular, we formulate the annotator learning as a learning-to-learn problem. We show that our method can learn annotators from a broad range of labelled images including real images, generated images, and even analytically rendered images.
arXiv Detail & Related papers (2022-11-06T01:29:22Z)
Image Aesthetics Assessment Using Graph Attention Network [17.277954886018353]
We present a two-stage framework based on graph neural networks for image aesthetics assessment. First, we propose a feature-graph representation in which the input image is modelled as a graph, maintaining its original aspect ratio and resolution. Second, we propose a graph neural network architecture that takes this feature-graph and captures the semantic relationship between the different regions of the input image using visual attention.
arXiv Detail & Related papers (2022-06-26T12:52:46Z)
Adaptive Image Inpainting [43.02281823557039]
Inpainting methods have shown significant improvements by using deep neural networks. The problem is rooted in the encoder layers' ineffectiveness in building a complete and faithful embedding of the missing regions. We propose a distillation based approach for inpainting, where we provide direct feature level supervision for the encoder layers.
arXiv Detail & Related papers (2022-01-01T12:16:01Z)
SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches [95.45728042499836]
We propose a new paradigm of sketch-based image manipulation: mask-free local image manipulation. Our model automatically predicts the target modification region and encodes it into a structure style vector. A generator then synthesizes the new image content based on the style vector and sketch.
arXiv Detail & Related papers (2021-11-30T02:42:31Z)
Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps [85.67745220834718]
We present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes. Our Edge-LBAM method contains dual procedures,including structure-aware mask-updating guided by predict edges. Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness.
arXiv Detail & Related papers (2021-04-25T07:25:16Z)
Semantic Layout Manipulation with High-Resolution Sparse Attention [106.59650698907953]
We tackle the problem of semantic image layout manipulation, which aims to manipulate an input image by editing its semantic label map. A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic. We propose a high-resolution sparse attention module that effectively transfers visual details to new layouts at a resolution up to 512x512.
arXiv Detail & Related papers (2020-12-14T06:50:43Z)
Deep Generation of Face Images from Sketches [36.146494762987146]
Deep image-to-image translation techniques allow fast generation of face images from freehand sketches. Existing solutions tend to overfit to sketches, thus requiring professional sketches or even edge maps as input. We propose to implicitly model the shape space of plausible face images and synthesize a face image in this space to approximate an input sketch. Our method essentially uses input sketches as soft constraints and is thus able to produce high-quality face images even from rough and/or incomplete sketches.
arXiv Detail & Related papers (2020-06-01T16:20:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.