Compositional Sketch Search
- URL: http://arxiv.org/abs/2106.08009v1
- Date: Tue, 15 Jun 2021 09:38:09 GMT
- Title: Compositional Sketch Search
- Authors: Alexander Black, Tu Bui, Long Mai, Hailin Jin, John Collomosse
- Abstract summary: We present an algorithm for searching image collections using free-hand sketches.
We exploit drawings as a concise and intuitive representation for specifying entire scene compositions.
- Score: 91.84489055347585
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an algorithm for searching image collections using free-hand
sketches that describe the appearance and relative positions of multiple
objects. Sketch based image retrieval (SBIR) methods predominantly match
queries containing a single, dominant object invariant to its position within
an image. Our work exploits drawings as a concise and intuitive representation
for specifying entire scene compositions. We train a convolutional neural
network (CNN) to encode masked visual features from sketched objects, pooling
these into a spatial descriptor encoding the spatial relationships and
appearances of objects in the composition. Training the CNN backbone as a
Siamese network under triplet loss yields a metric search embedding for
measuring compositional similarity which may be efficiently leveraged for
visual search by applying product quantization.
Related papers
- PairingNet: A Learning-based Pair-searching and -matching Network for
Image Fragments [6.694162736590122]
We propose a learning-based image fragment pair-searching and -matching approach to solve the challenging restoration problem.
Our proposed network achieves excellent pair-searching accuracy, reduces matching errors, and significantly reduces computational time.
arXiv Detail & Related papers (2023-12-14T07:43:53Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image
Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z) - Using Text to Teach Image Retrieval [47.72498265721957]
We build on the concept of image manifold to represent the feature space of images, learned via neural networks, as a graph.
We augment the manifold samples with geometrically aligned text, thereby using a plethora of sentences to teach us about images.
The experimental results show that the joint embedding manifold is a robust representation, allowing it to be a better basis to perform image retrieval.
arXiv Detail & Related papers (2020-11-19T16:09:14Z) - LCD -- Line Clustering and Description for Place Recognition [29.053923938306323]
We introduce a novel learning-based approach to place recognition, using RGB-D cameras and line clusters as visual and geometric features.
We present a neural network architecture based on the attention mechanism for frame-wise line clustering.
A similar neural network is used for the description of these clusters with a compact embedding of 128 floating point numbers.
arXiv Detail & Related papers (2020-10-21T09:52:47Z) - Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-based
Image Retrieval [55.29233996427243]
Low-shot sketch-based image retrieval is an emerging task in computer vision.
In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks.
For solving these tasks, we propose a semantically aligned cycle-consistent generative adversarial network (SEM-PCYC)
Our results demonstrate a significant boost in any-shot performance over the state-of-the-art on the extended version of the Sketchy, TU-Berlin and QuickDraw datasets.
arXiv Detail & Related papers (2020-06-20T22:43:53Z) - Expressing Objects just like Words: Recurrent Visual Embedding for
Image-Text Matching [102.62343739435289]
Existing image-text matching approaches infer the similarity of an image-text pair by capturing and aggregating the affinities between the text and each independent object of the image.
We propose a Dual Path Recurrent Neural Network (DP-RNN) which processes images and sentences symmetrically by recurrent neural networks (RNN)
Our model achieves the state-of-the-art performance on Flickr30K dataset and competitive performance on MS-COCO dataset.
arXiv Detail & Related papers (2020-02-20T00:51:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.