Do Generalised Classifiers really work on Human Drawn Sketches?
- URL: http://arxiv.org/abs/2407.03893v1
- Date: Thu, 4 Jul 2024 12:37:08 GMT
- Title: Do Generalised Classifiers really work on Human Drawn Sketches?
- Authors: Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Ayan Kumar Bhunia, Yi-Zhe Song
- Abstract summary: This paper marries large foundation models with human sketch understanding.
We demonstrate what this brings -- a paradigm shift in terms of generalised sketch representation learning.
Our framework surpasses popular sketch representation learning algorithms in both zero-shot and few-shot setups.
- Score: 122.11670266648771
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper, for the first time, marries large foundation models with human sketch understanding. We demonstrate what this brings -- a paradigm shift in terms of generalised sketch representation learning (e.g., classification). This generalisation happens on two fronts: (i) generalisation across unknown categories (i.e., open-set), and (ii) generalisation traversing abstraction levels (i.e., good and bad sketches), both being timely challenges that remain unsolved in the sketch literature. Our design is intuitive and centred around transferring the already stellar generalisation ability of CLIP to benefit generalised learning for sketches. We first "condition" the vanilla CLIP model by learning sketch-specific prompts using a novel auxiliary head of raster to vector sketch conversion. This importantly makes CLIP "sketch-aware". We then make CLIP acute to the inherently different sketch abstraction levels. This is achieved by learning a codebook of abstraction-specific prompt biases, a weighted combination of which facilitates the representation of sketches across abstraction levels -- low abstract edge-maps, medium abstract sketches in TU-Berlin, and highly abstract doodles in QuickDraw. Our framework surpasses popular sketch representation learning algorithms in both zero-shot and few-shot setups and in novel settings across different abstraction boundaries.
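The abstract's second mechanism, a codebook of abstraction-specific prompt biases mixed by per-sketch weights, can be sketched in a few lines. This is a minimal illustrative toy, not the authors' implementation: the function names, shapes, and the source of the mixing logits (e.g. a small prediction head) are all assumptions, and the result would feed a frozen CLIP text encoder in the real system.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def abstraction_conditioned_prompt(base_prompt, codebook, logits):
    """Condition a prompt embedding on sketch abstraction level.

    base_prompt: (D,) learned sketch-specific prompt embedding
    codebook:    (K, D) abstraction-specific prompt biases
    logits:      (K,) per-sketch abstraction scores (hypothetically
                 predicted by a small head from the input sketch)
    """
    weights = softmax(logits)      # convex combination over the codebook
    bias = weights @ codebook      # (D,) weighted prompt bias
    return base_prompt + bias      # conditioned prompt for the text encoder

# Toy usage: D = 8 embedding dims, K = 3 abstraction levels
# (e.g. edge-map, TU-Berlin sketch, QuickDraw doodle).
rng = np.random.default_rng(0)
base = rng.standard_normal(8)
codebook = rng.standard_normal((3, 8))
prompt = abstraction_conditioned_prompt(base, codebook, np.array([2.0, 0.5, -1.0]))
print(prompt.shape)
```

Because the softmax weights sum to one, the conditioned prompt always stays within a bounded neighbourhood of the base prompt, interpolating between the learned per-abstraction biases rather than extrapolating beyond them.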
Related papers
- Picture that Sketch: Photorealistic Image Generation from Abstract
Sketches [109.69076457732632]
Given an abstract, deformed, ordinary sketch from untrained amateurs like you and me, this paper turns it into a photorealistic image.
We do not dictate an edgemap-like sketch to start with, but aim to work with abstract free-hand human sketches.
In doing so, we essentially democratise the sketch-to-photo pipeline, "picturing" a sketch regardless of how good you sketch.
arXiv Detail & Related papers (2023-03-20T14:49:03Z) - CLIPascene: Scene Sketching with Different Types and Levels of
Abstraction [48.30702300230904]
We present a method for converting a given scene image into a sketch, controlling abstraction along two axes.
The first is fidelity, varying the representation from a precise portrayal of the input to a looser depiction.
The second is visual simplicity, moving from a detailed depiction to a sparse sketch.
arXiv Detail & Related papers (2022-11-30T18:54:32Z) - I Know What You Draw: Learning Grasp Detection Conditioned on a Few
Freehand Sketches [74.63313641583602]
We propose a method to generate a potential grasp configuration relevant to the sketch-depicted objects.
Our model is trained and tested end-to-end, making it easy to implement in real-world applications.
arXiv Detail & Related papers (2022-05-09T04:23:36Z) - CLIPasso: Semantically-Aware Object Sketching [34.53644912236454]
We present an object sketching method that can achieve different levels of abstraction, guided by geometric and semantic simplifications.
We define a sketch as a set of Bézier curves and use a differentiable rasterizer to optimize the parameters of the curves directly with respect to a CLIP-based perceptual loss.
arXiv Detail & Related papers (2022-02-11T18:35:25Z) - Multi-granularity Association Learning Framework for on-the-fly
Fine-Grained Sketch-based Image Retrieval [7.797006835701767]
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo given a query sketch.
In this study, we aim to retrieve the target photo with the fewest strokes possible (i.e., from an incomplete sketch).
We propose a multi-granularity association learning framework that further optimizes the embedding space of all incomplete sketches.
arXiv Detail & Related papers (2022-01-13T14:38:50Z) - Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image
Retrieval [147.24102408745247]
We study a further trait of sketches that has been overlooked to date: they are hierarchical in their levels of detail.
In this paper, we design a novel network that is capable of cultivating sketch-specific hierarchies and exploiting them to match sketch with photo at corresponding hierarchical levels.
arXiv Detail & Related papers (2020-07-29T20:50:25Z) - Sketch-BERT: Learning Sketch Bidirectional Encoder Representation from
Transformers by Self-supervised Learning of Sketch Gestalt [125.17887147597567]
We present a model for learning Sketch Bidirectional Encoder Representation from Transformers (Sketch-BERT).
We generalize BERT to the sketch domain with novel components and pre-training algorithms.
We show that the learned representation of Sketch-BERT improves performance on the downstream tasks of sketch recognition, sketch retrieval, and sketch gestalt.
arXiv Detail & Related papers (2020-05-19T01:35:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.