Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch
- URL: http://arxiv.org/abs/2505.23763v1
- Date: Thu, 29 May 2025 17:59:51 GMT
- Title: Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch
- Authors: Aneeshan Sain, Subhajit Maity, Pinaki Nath Chowdhury, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song
- Abstract summary: There is no research on efficient inference specifically designed for sketch data. We first demonstrate that existing state-of-the-art efficient light-weight models designed for photos do not work on sketches. We then propose two sketch-specific components which work in a plug-n-play manner on any photo efficient network to adapt them to work on sketch data.
- Score: 80.90808879991182
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: As sketch research has collectively matured over time, its adaptation for mass commercialisation emerges on the immediate horizon. Despite an already mature research endeavour for photos, there is no research on efficient inference specifically designed for sketch data. In this paper, we first demonstrate that existing state-of-the-art efficient light-weight models designed for photos do not work on sketches. We then propose two sketch-specific components which work in a plug-n-play manner on any photo efficient network to adapt them to work on sketch data. We specifically chose fine-grained sketch-based image retrieval (FG-SBIR) as a demonstrator, as it is the most recognised sketch problem with immediate commercial value. Technically speaking, we first propose a cross-modal knowledge distillation network to transfer existing photo efficient networks to be compatible with sketch, which brings down the number of FLOPs and model parameters by 97.96% and 84.89% respectively. We then exploit the abstract trait of sketch to introduce an RL-based canvas selector that dynamically adjusts to the abstraction level, which further cuts down the number of FLOPs by two thirds. The end result is an overall reduction of 99.37% of FLOPs (from 40.18G to 0.254G) when compared with a full network, while retaining the accuracy (33.03% vs 32.77%) -- finally making an efficient network for the sparse sketch data that exhibits even fewer FLOPs than the best photo counterpart.
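The FLOP figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not part of the paper: the variable names are hypothetical, and the intermediate post-distillation figure is derived here from the quoted percentages rather than stated in the abstract.

```python
# Figures quoted in the abstract (GFLOPs and fractional reduction).
full_gflops = 40.18      # full network
final_gflops = 0.254     # after distillation + canvas selector
kd_flops_cut = 0.9796    # FLOP reduction attributed to knowledge distillation


def reduction(before: float, after: float) -> float:
    """Fractional reduction when going from `before` to `after`."""
    return 1.0 - after / before


# Overall reduction implied by the two endpoint figures.
overall = reduction(full_gflops, final_gflops)

# Assumption: the distillation cut is measured against the full network,
# so the GFLOPs remaining after that stage can be derived from it.
after_kd = full_gflops * (1.0 - kd_flops_cut)

# Additional cut the canvas selector must supply to reach the final figure.
selector_cut = reduction(after_kd, final_gflops)

print(f"overall reduction: {overall:.2%}")
print(f"implied GFLOPs after distillation: {after_kd:.3f}")
print(f"implied extra cut from canvas selector: {selector_cut:.1%}")
```

Under that assumption the endpoints reproduce the quoted 99.37% overall reduction, and the implied selector cut comes out slightly above two thirds, which is consistent with the abstract's rounded wording.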
Related papers
- One-shot Face Sketch Synthesis in the Wild via Generative Diffusion Prior and Instruction Tuning [52.0161291920299]
Face sketch synthesis is a technique aimed at converting face photos into sketches. Existing face sketch synthesis research mainly relies on training with numerous photo-sketch sample pairs from existing datasets. We propose a one-shot face sketch synthesis method based on diffusion models.
arXiv Detail & Related papers (2025-06-18T09:41:30Z)
- Active Learning for Fine-Grained Sketch-Based Image Retrieval [1.994307489466967]
The ability to retrieve a photo by mere free-hand sketching highlights the immense potential of fine-grained sketch-based image retrieval (FG-SBIR).
We propose a novel active learning sampling technique that drastically minimises the need for drawing photo sketches.
arXiv Detail & Related papers (2023-09-15T20:07:14Z)
- A Recipe for Efficient SBIR Models: Combining Relative Triplet Loss with Batch Normalization and Knowledge Distillation [3.364554138758565]
Sketch-Based Image Retrieval (SBIR) is a crucial task in multimedia retrieval, where the goal is to retrieve a set of images that match a given sketch query.
We introduce a Relative Triplet Loss (RTL), an adapted triplet loss that overcomes limitations through loss weighting based on anchor similarity.
We propose a straightforward approach to train small models efficiently with a marginal loss of accuracy through knowledge distillation.
arXiv Detail & Related papers (2023-05-30T12:41:04Z)
- CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not [109.69076457732632]
We leverage CLIP for zero-shot sketch-based image retrieval (ZS-SBIR).
We put forward novel designs on how best to achieve this synergy.
We observe significant performance gains in the region of 26.9% over previous state-of-the-art.
arXiv Detail & Related papers (2023-03-23T17:02:00Z)
- Picture that Sketch: Photorealistic Image Generation from Abstract Sketches [109.69076457732632]
Given an abstract, deformed, ordinary sketch from untrained amateurs like you and me, this paper turns it into a photorealistic image.
We do not dictate an edgemap-like sketch to start with, but aim to work with abstract free-hand human sketches.
In doing so, we essentially democratise the sketch-to-photo pipeline, "picturing" a sketch regardless of how well you sketch.
arXiv Detail & Related papers (2023-03-20T14:49:03Z)
- Multi-granularity Association Learning Framework for on-the-fly Fine-Grained Sketch-based Image Retrieval [7.797006835701767]
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo for a given query sketch.
In this study, we aim to retrieve the target photo with the fewest strokes possible (an incomplete sketch).
We propose a multi-granularity association learning framework that further optimizes the embedding space of all incomplete sketches.
arXiv Detail & Related papers (2022-01-13T14:38:50Z)
- Deep Facial Synthesis: A New Challenge [75.99659340231078]
We first introduce a high-quality dataset for facial sketch synthesis (FSS), named FS2K, which consists of 2,104 image-sketch pairs.
Second, we present the largest-scale FSS study by investigating 139 classical methods.
Third, we present a simple baseline for FSS, named FSGAN.
arXiv Detail & Related papers (2021-12-31T13:19:21Z)
- Road Segmentation for Remote Sensing Images using Adversarial Spatial Pyramid Networks [28.32775611169636]
We introduce a new model to apply structured domain adaption for synthetic image generation and road segmentation.
A novel scale-wise architecture is introduced to learn from the multi-level feature maps and improve the semantics of the features.
Our model achieves a state-of-the-art 78.86 IOU on the Massachusetts dataset with 14.89M parameters and 86.78B FLOPs, with 4x fewer FLOPs but higher accuracy (+3.47% IOU).
arXiv Detail & Related papers (2020-08-10T11:00:19Z)
- Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval [203.2520862597357]
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch.
We reformulate the conventional FG-SBIR framework to tackle these challenges.
We propose an on-the-fly design that starts retrieving as soon as the user starts drawing.
arXiv Detail & Related papers (2020-02-24T15:36:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.