A Zero-Shot Sketch-based Inter-Modal Object Retrieval Scheme for Remote
Sensing Images
- URL: http://arxiv.org/abs/2008.05225v1
- Date: Wed, 12 Aug 2020 10:51:24 GMT
- Authors: Ushasi Chaudhuri, Biplab Banerjee, Avik Bhattacharya, Mihai Datcu
- Abstract summary: We propose a novel inter-modal triplet-based zero-shot retrieval scheme utilizing a sketch-based representation of RS data.
The proposed scheme performs efficiently even when the sketch representations are marginally prototypical of the image.
- Score: 26.48516754642218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing retrieval methods in remote sensing (RS) are often
based on a uni-modal data retrieval framework. In this work, we propose a novel
inter-modal triplet-based zero-shot retrieval scheme utilizing a sketch-based
representation of RS data. The proposed scheme performs efficiently even when
the sketch representations are only marginally prototypical of the image. We
conducted experiments on a new bi-modal image-sketch dataset called Earth on
Canvas (EoC), conceived during this study. We perform a thorough benchmarking
of this dataset and demonstrate that the proposed network outperforms other
state-of-the-art zero-shot sketch-based retrieval methods in remote sensing.
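The triplet-based objective named in the abstract can be illustrated with a minimal sketch. All embeddings and names below are hypothetical toy values, not the authors' implementation: the anchor is a sketch embedding, the positive an image embedding of the same class, and the negative an image embedding of a different class.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Margin-based triplet loss: pull the sketch embedding (anchor)
    toward the matching image embedding (positive) and push it away
    from a non-matching one (negative)."""
    d_pos = math.dist(anchor, positive)   # distance to same-class image
    d_neg = math.dist(anchor, negative)   # distance to other-class image
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings (illustrative only).
sketch = [1.0, 0.0]
img_same_class = [0.9, 0.1]
img_other_class = [-1.0, 0.2]

# Anchor already closer to the positive by more than the margin,
# so the triplet is satisfied and contributes zero loss.
loss = triplet_loss(sketch, img_same_class, img_other_class)
```

Training minimizes this loss over sampled triplets, driving sketch and image embeddings of the same class together in a shared space, which is what enables retrieval across the two modalities.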
Related papers
- Modality-Aware Representation Learning for Zero-shot Sketch-based Image
Retrieval [10.568851068989973]
Zero-shot learning offers an efficient solution for a machine learning model to handle unseen categories.
We propose a novel framework that indirectly aligns sketches and photos by contrasting them through texts.
With an explicit modality encoding learned from data, our approach disentangles modality-agnostic semantics from modality-specific information.
arXiv Detail & Related papers (2024-01-10T00:39:03Z)
- Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based Image Retrieval [69.46139774646308]
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR).
It aims to use sketches from unseen categories as queries to match images of the same category.
We propose a novel Symmetrical Bidirectional Knowledge Alignment for zero-shot sketch-based image retrieval (SBKA).
arXiv Detail & Related papers (2023-12-16T04:50:34Z) - Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR)
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z) - Sketch-an-Anchor: Sub-epoch Fast Model Adaptation for Zero-shot
Sketch-based Image Retrieval [1.52292571922932]
Sketch-an-Anchor is a novel method to train state-of-the-art Zero-shot Image Retrieval (ZSSBIR) models in under an epoch.
Our fast-converging model keeps the single-domain performance while learning to extract similar representations from sketches.
arXiv Detail & Related papers (2023-03-29T15:00:02Z) - BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z) - Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval [66.37346493506737]
Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a novel cross-modal retrieval task.
We propose a novel Domain-Smoothing Network (DSN) for ZS-SBIR.
Our approach notably outperforms the state-of-the-art methods in both Sketchy and TU-Berlin datasets.
arXiv Detail & Related papers (2021-06-22T14:58:08Z) - Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD
Images [69.5662419067878]
Grounding referring expressions in RGBD image has been an emerging field.
We present a novel task of 3D visual grounding in single-view RGBD image where the referred objects are often only partially scanned due to occlusion.
Our approach first fuses the language and the visual features at the bottom level to generate a heatmap that localizes the relevant regions in the RGBD image.
Then our approach conducts an adaptive feature learning based on the heatmap and performs the object-level matching with another visio-linguistic fusion to finally ground the referred object.
arXiv Detail & Related papers (2021-03-14T11:18:50Z) - Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-based
Image Retrieval [55.29233996427243]
Low-shot sketch-based image retrieval is an emerging task in computer vision.
In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks.
For solving these tasks, we propose a semantically aligned cycle-consistent generative adversarial network (SEM-PCYC)
Our results demonstrate a significant boost in any-shot performance over the state-of-the-art on the extended version of the Sketchy, TU-Berlin and QuickDraw datasets.
arXiv Detail & Related papers (2020-06-20T22:43:53Z) - Image Retrieval using Multi-scale CNN Features Pooling [26.811290793232313]
We present an end-to-end trainable network architecture that exploits a novel multi-scale local pooling based on NetVLAD and a triplet mining procedure based on samples difficulty to obtain an effective image representation.
arXiv Detail & Related papers (2020-04-21T00:57:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.