DoodleFormer: Creative Sketch Drawing with Transformers
- URL: http://arxiv.org/abs/2112.03258v1
- Date: Mon, 6 Dec 2021 18:59:59 GMT
- Title: DoodleFormer: Creative Sketch Drawing with Transformers
- Authors: Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer,
Fahad Shahbaz Khan, Jorma Laaksonen, Michael Felsberg
- Abstract summary: Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn.
Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative sketch generation problem into the creation of a coarse sketch composition followed by the incorporation of fine details.
To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder.
- Score: 68.18953603715514
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Creative sketching or doodling is an expressive activity, where imaginative
and previously unseen depictions of everyday visual objects are drawn. Creative
sketch image generation is a challenging vision problem, where the task is to
generate diverse, yet realistic creative sketches possessing the unseen
composition of the visual-world objects. Here, we propose a novel
coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative
sketch generation problem into the creation of coarse sketch composition
followed by the incorporation of fine-details in the sketch. We introduce
graph-aware transformer encoders that effectively capture global dynamic as
well as local static structural relations among different body parts. To ensure
diversity of the generated creative sketches, we introduce a probabilistic
coarse sketch decoder that explicitly models the variations of each sketch body
part to be drawn. Experiments are performed on two creative sketch datasets:
Creative Birds and Creative Creatures. Our qualitative, quantitative and
human-based evaluations show that DoodleFormer outperforms the state-of-the-art
on both datasets, yielding realistic and diverse creative sketches. On Creative
Creatures, DoodleFormer achieves an absolute gain of 25 in terms of Fréchet
inception distance (FID) over the state-of-the-art. We also demonstrate the
effectiveness of DoodleFormer for related applications of text to creative
sketch generation and sketch completion.
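The abstract describes a two-stage pipeline: a probabilistic coarse decoder first samples a layout of body parts, and a second stage then fills in fine details conditioned on that layout. The following is a minimal, purely illustrative sketch of that decomposition in plain NumPy; the part names, Gaussian layout priors, and random stroke points are assumptions standing in for the paper's learned transformer components, not the authors' actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-part layout priors: mean and std of a box (cx, cy, w, h).
# In the paper these would come from a learned probabilistic decoder.
PART_PRIORS = {
    "head": (np.array([0.5, 0.2, 0.2, 0.2]), np.array([0.05] * 4)),
    "body": (np.array([0.5, 0.5, 0.3, 0.3]), np.array([0.05] * 4)),
    "legs": (np.array([0.5, 0.8, 0.2, 0.2]), np.array([0.05] * 4)),
}

def coarse_stage(priors):
    """Stage 1: sample one bounding box per body part.
    The stochastic sampling is what yields diverse compositions
    across repeated calls, mirroring the probabilistic decoder."""
    return {part: rng.normal(mu, sd) for part, (mu, sd) in priors.items()}

def fine_stage(layout, points_per_part=8):
    """Stage 2: fill each coarse box with stroke points.
    Random points inside the box stand in for decoded fine details."""
    strokes = {}
    for part, (cx, cy, w, h) in layout.items():
        pts = rng.uniform(-0.5, 0.5, size=(points_per_part, 2))
        strokes[part] = pts * np.array([w, h]) + np.array([cx, cy])
    return strokes

layout = coarse_stage(PART_PRIORS)   # coarse composition
sketch = fine_stage(layout)          # fine details within each part box
```

Sampling the layout before the details is what separates *where* parts go from *how* they are drawn; two calls with the same priors give different compositions, which is the diversity property the abstract highlights.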
Related papers
- SketchTriplet: Self-Supervised Scenarized Sketch-Text-Image Triplet Generation [6.39528707908268]
There continues to be a lack of large-scale paired datasets for scene sketches.
We propose a self-supervised method for scene sketch generation that does not rely on any existing scene sketch.
We contribute a large-scale dataset centered around scene sketches, comprising highly semantically consistent "text-sketch-image" triplets.
arXiv Detail & Related papers (2024-05-29T06:43:49Z)
- CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build CreativeSynth, a unified framework based on a diffusion model that coordinates multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
- SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation [111.2195741547517]
We present a method to generate controlled sketches using a text-conditioned diffusion model trained on pixel representations of images.
Our objective is to empower non-professional users to create sketches and, through a series of optimisation processes, transform a narrative into a storyboard.
arXiv Detail & Related papers (2023-08-27T19:44:44Z)
- Picture that Sketch: Photorealistic Image Generation from Abstract Sketches [109.69076457732632]
Given an abstract, deformed, ordinary sketch from untrained amateurs like you and me, this paper turns it into a photorealistic image.
We do not dictate an edgemap-like sketch to start with, but aim to work with abstract free-hand human sketches.
In doing so, we essentially democratise the sketch-to-photo pipeline, "picturing" a sketch regardless of how good you sketch.
arXiv Detail & Related papers (2023-03-20T14:49:03Z)
- Towards Practicality of Sketch-Based Visual Understanding [15.30818342202786]
Sketches have been used to conceptualise and depict visual objects from pre-historic times.
This thesis aims to progress sketch-based visual understanding towards more practicality.
arXiv Detail & Related papers (2022-10-27T03:12:57Z)
- DeepPortraitDrawing: Generating Human Body Images from Freehand Sketches [75.4318318890065]
We present DeepPortraitDrawing, a framework for converting roughly drawn sketches to realistic human body images.
To encode complicated body shapes under various poses, we take a local-to-global approach.
Our method produces more realistic images than the state-of-the-art sketch-to-image synthesis techniques.
arXiv Detail & Related papers (2022-05-04T14:02:45Z)
- Creative Sketch Generation [48.16835161875747]
We introduce two datasets of creative sketches -- Creative Birds and Creative Creatures -- containing 10k sketches each along with part annotations.
We propose DoodlerGAN -- a part-based Generative Adversarial Network (GAN) -- to generate unseen compositions of novel part appearances.
Quantitative evaluations as well as human studies demonstrate that sketches generated by our approach are more creative and of higher quality than existing approaches.
arXiv Detail & Related papers (2020-11-19T18:57:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.