Scene Designer: a Unified Model for Scene Search and Synthesis from Sketch
- URL: http://arxiv.org/abs/2108.07353v1
- Date: Mon, 16 Aug 2021 21:40:16 GMT
- Title: Scene Designer: a Unified Model for Scene Search and Synthesis from Sketch
- Authors: Leo Sampaio Ferraz Ribeiro and Tu Bui and John Collomosse and Moacir Ponti
- Abstract summary: Scene Designer is a novel method for searching and generating images using free-hand sketches of scene compositions.
Our core contribution is a single unified model to learn both a cross-modal search embedding for matching sketched compositions to images, and an object embedding for layout synthesis.
- Score: 7.719705312172286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene Designer is a novel method for searching and generating images using
free-hand sketches of scene compositions; i.e. drawings that describe both the
appearance and relative positions of objects. Our core contribution is a single
unified model to learn both a cross-modal search embedding for matching
sketched compositions to images, and an object embedding for layout synthesis.
We show that a graph neural network (GNN) followed by a Transformer, trained under our
novel contrastive learning setting, is required to learn correlations between
object type, appearance and arrangement. These correlations drive a mask
generation module that synthesises coherent scene layouts, whilst also
delivering state-of-the-art sketch-based visual search of scenes.
Related papers
- Sketch-Guided Scene Image Generation [11.009579131371018]
We propose a sketch-guided scene image generation framework, decomposing the task of scene image generation from sketch inputs into object-level cross-domain generation and scene-level image construction.
We employ pre-trained diffusion models to convert each single object drawing into an image of the object, inferring additional details while maintaining the sparse sketch structure.
In scene-level image construction, we generate the latent representation of the scene image using the separated background prompts.
arXiv Detail & Related papers (2024-07-09T00:16:45Z)
- SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image [60.52991173059486]
We introduce SAMPLING, a Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image.
Our method demonstrates considerable performance gains in large-scale unbounded outdoor scenes using a single image on the KITTI dataset.
arXiv Detail & Related papers (2023-09-12T15:33:09Z)
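SAMPLING builds on multiplane images (MPIs): the scene is represented as a stack of fronto-parallel RGBA planes, and a view is rendered by alpha-compositing the planes back to front. Below is a minimal NumPy sketch of plain MPI compositing; the plane count and shapes are illustrative, and SAMPLING's scene-adaptive hierarchy and per-view warping are omitted.

```python
import numpy as np

def composite_mpi(planes):
    """planes: (D, H, W, 4) RGBA stack, index 0 = farthest plane."""
    out = np.zeros(planes.shape[1:3] + (3,))
    for rgba in planes:                        # back-to-front "over" operator
        rgb, alpha = rgba[..., :3], rgba[..., 3:4]
        out = alpha * rgb + (1.0 - alpha) * out
    return out

image = composite_mpi(np.random.rand(32, 64, 64, 4))   # (64, 64, 3) view
```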
- DiffSketching: Sketch Control Image Synthesis with Diffusion Models [10.172753521953386]
Deep learning models for sketch-to-image synthesis must cope with distorted input sketches that lack visual detail.
Our model matches sketches to images through cross-domain constraints and uses a classifier to guide the image synthesis more accurately.
Our model beats GAN-based methods in terms of generation quality and human evaluation, and does not rely on massive sketch-image datasets.
arXiv Detail & Related papers (2023-05-30T07:59:23Z)
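The DiffSketching summary alludes to classifier guidance. For orientation, here is a minimal PyTorch sketch of the generic technique (Dhariwal & Nichol, 2021): the gradient of a classifier's log-probability for a target label shifts the diffusion model's predicted noise so sampling drifts toward that label. The interfaces, guidance scale, and stand-in networks are illustrative assumptions, not DiffSketching's actual implementation.

```python
import torch
import torch.nn as nn

def classifier_guided_eps(eps_model, classifier, x_t, t, y, sigma_t, scale=2.0):
    """One guided step: eps_hat = eps - scale * sigma_t * grad log p(y | x_t)."""
    x_in = x_t.detach().requires_grad_(True)
    log_p = classifier(x_in).log_softmax(dim=-1)[torch.arange(len(y)), y].sum()
    grad = torch.autograd.grad(log_p, x_in)[0]   # d log p(y | x_t) / d x_t
    return eps_model(x_t, t) - scale * sigma_t * grad

# Stand-ins: a real noise predictor and classifier would condition on t.
eps_model = lambda x, t: torch.zeros_like(x)
classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
x_t = torch.randn(4, 3, 8, 8)
eps_hat = classifier_guided_eps(eps_model, classifier, x_t, t=0.5,
                                y=torch.tensor([1, 3, 3, 7]), sigma_t=0.8)
```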
- Text-Guided Scene Sketch-to-Photo Synthesis [5.431298869139175]
We propose a method for scene-level sketch-to-photo synthesis with text guidance.
To train our model, we use self-supervised learning from a set of photographs.
Experiments show that the proposed method translates original sketch images, i.e. ones not extracted from color images, into photos with compelling visual quality.
arXiv Detail & Related papers (2023-02-14T08:13:36Z)
- Unsupervised Scene Sketch to Photo Synthesis [40.044690369936184]
We present a method for synthesizing realistic photos from scene sketches.
Our framework learns from readily available large-scale photo datasets in an unsupervised manner.
We also demonstrate that our framework facilitates a controllable manipulation of photo synthesis by editing strokes of corresponding sketches.
arXiv Detail & Related papers (2022-09-06T22:25:06Z)
- FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context [112.07988211268612]
We advance sketch research to scenes with the first dataset of freehand scene sketches, FS-COCO.
Our dataset comprises 10,000 freehand scene vector sketches with per-point space-time information, drawn by 100 non-expert individuals.
We study for the first time the problem of fine-grained image retrieval from freehand scene sketches and sketch captions.
arXiv Detail & Related papers (2022-03-04T03:00:51Z)
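FS-COCO stores sketches as vector strokes with per-point space-time information. As a concrete illustration, here is a minimal container for such a sketch plus a timing query; the field names and layout are assumptions for illustration, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class Stroke:
    points: list    # [(x, y, t), ...] in drawing order

@dataclass
class VectorSketch:
    strokes: list   # list[Stroke], ordered by start time
    caption: str    # free-text description (FS-COCO pairs sketches with captions)

    def duration(self):
        """Elapsed drawing time from the first to the last recorded point."""
        times = [t for s in self.strokes for (_, _, t) in s.points]
        return max(times) - min(times)

sketch = VectorSketch(
    strokes=[Stroke([(0.10, 0.20, 0.00), (0.40, 0.25, 0.08)]),
             Stroke([(0.50, 0.60, 0.40), (0.55, 0.90, 0.52)])],
    caption="a giraffe standing next to a tree",
)
print(sketch.duration())   # 0.52
```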
- Compositional Sketch Search [91.84489055347585]
We present an algorithm for searching image collections using free-hand sketches.
We exploit drawings as a concise and intuitive representation for specifying entire scene compositions.
arXiv Detail & Related papers (2021-06-15T09:38:09Z)
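Both Compositional Sketch Search and Scene Designer frame search as nearest-neighbour lookup in a shared embedding space. A minimal sketch of that retrieval step follows; the gallery, dimensions, and cosine similarity are illustrative, with embeddings assumed to come from an encoder such as the one sketched after the abstract above.

```python
import torch
import torch.nn.functional as F

def search(query_emb, gallery_embs, k=5):
    """Indices of the k gallery images most similar to the sketch query."""
    q = F.normalize(query_emb, dim=-1)
    g = F.normalize(gallery_embs, dim=-1)
    return (g @ q).topk(k).indices          # cosine similarity ranking

gallery = torch.randn(1000, 256)            # precomputed image embeddings
query = torch.randn(256)                    # embedding of the user's sketch
top5 = search(query, gallery)
```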
- Neural Scene Graphs for Dynamic Scenes [57.65413768984925]
We present the first neural rendering method that decomposes dynamic scenes into scene graphs.
We learn implicitly encoded scenes, combined with a jointly learned latent representation, to describe objects with a single implicit function.
arXiv Detail & Related papers (2020-11-20T12:37:10Z)
- SketchEmbedNet: Learning Novel Concepts by Imitating Drawings [125.45799722437478]
We explore properties of image representations learned by training a model to produce sketches of images.
We show that this generative, class-agnostic model produces informative embeddings of images from novel examples, classes, and even novel datasets in a few-shot setting.
arXiv Detail & Related papers (2020-08-27T16:43:28Z)
- SketchyCOCO: Image Generation from Freehand Scene Sketches [71.85577739612579]
We introduce the first method for automatic image generation from scene-level freehand sketches.
The key contribution is an attribute vector bridged Generative Adversarial Network called EdgeGAN.
We have built a large-scale composite dataset called SketchyCOCO to support and evaluate the solution.
arXiv Detail & Related papers (2020-03-05T14:54:10Z)