CBIR using features derived by Deep Learning
- URL: http://arxiv.org/abs/2002.07877v1
- Date: Thu, 13 Feb 2020 21:26:32 GMT
- Title: CBIR using features derived by Deep Learning
- Authors: Subhadip Maji and Smarajit Bose
- Abstract summary: In a Content Based Image Retrieval (CBIR) System, the task is to retrieve similar images from a large database given a query image.
We propose to use features derived from a pre-trained deep convolutional network trained for a large-scale image classification problem.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a Content Based Image Retrieval (CBIR) System, the task is to retrieve
similar images from a large database given a query image. The usual procedure
is to extract some useful features from the query image, and retrieve images
which have a similar set of features. For this purpose, a suitable similarity
measure is chosen, and images with high similarity scores are retrieved.
Naturally, the choice of these features plays a very important role in the
success of this system, and high-level features are required to reduce the
semantic gap.
In this paper, we propose to use features derived from a pre-trained deep
convolutional network trained for a large-scale image classification problem.
This approach appears to produce vastly superior
results for a variety of databases, and it outperforms many contemporary CBIR
systems. We analyse the retrieval time of the method, and also propose a
pre-clustering of the database based on the above-mentioned features, which
yields comparable results in a much shorter time in most cases.
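As a concrete illustration of the pipeline the abstract describes, here is a minimal sketch of such a system. It assumes an ImageNet-pretrained ResNet-50 from torchvision as the feature extractor, cosine similarity as the similarity measure, and k-means (scikit-learn) for the pre-clustering step; the paper does not commit to these exact choices, and all file paths, parameters, and helper names below are illustrative.

```python
import glob

import numpy as np
import torch
from PIL import Image
from sklearn.cluster import KMeans
from torchvision import models, transforms

# Pre-trained ImageNet classifier used purely as a feature extractor:
# replace the final fully-connected layer with an identity so the network
# outputs the 2048-d pooled activations instead of class scores.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def extract_features(paths):
    """Return an (N, 2048) array of L2-normalised deep features."""
    feats = []
    with torch.no_grad():
        for p in paths:
            x = preprocess(Image.open(p).convert("RGB")).unsqueeze(0)
            f = backbone(x).squeeze(0).numpy()
            feats.append(f / (np.linalg.norm(f) + 1e-12))
    return np.stack(feats)

# Offline indexing: extract features for the whole database and pre-cluster
# them so that a query only has to be compared against one cluster.
db_paths = sorted(glob.glob("database/*.jpg"))   # hypothetical image folder
db_feats = extract_features(db_paths)
kmeans = KMeans(n_clusters=16, random_state=0).fit(db_feats)  # cluster count is a tuning choice

def retrieve(query_path, k=10, use_clusters=True):
    """Return the k database images most similar to the query image."""
    q = extract_features([query_path])[0]
    if use_clusters:
        # Restrict the search to the cluster whose centroid is nearest the query.
        cluster = kmeans.predict(q[None, :])[0]
        candidates = np.where(kmeans.labels_ == cluster)[0]
    else:
        candidates = np.arange(len(db_paths))
    # Cosine similarity reduces to a dot product because features are unit-norm.
    sims = db_feats[candidates] @ q
    top = candidates[np.argsort(-sims)[:k]]
    return [db_paths[i] for i in top]
```

Calling `retrieve("query.jpg", use_clusters=False)` scans the whole database, while the clustered variant compares the query only against the nearest cluster, mirroring the trade-off the abstract mentions: comparable results in a much shorter time.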
Related papers
- Advancing Image Retrieval with Few-Shot Learning and Relevance Feedback [5.770351255180495]
Image Retrieval with Relevance Feedback (IRRF) involves iterative human interaction during the retrieval process.
We propose a new scheme based on a hyper-network, that is tailored to the task and facilitates swift adjustment to user feedback.
We show that our method can attain state-of-the-art results in few-shot one-class classification and reach comparable results in the binary classification task of few-shot open-set recognition.
arXiv Detail & Related papers (2023-12-18T10:20:28Z)
- Object-Centric Open-Vocabulary Image-Retrieval with Aggregated Features [12.14013374452918]
We present a simple yet effective approach to object-centric open-vocabulary image retrieval.
Our approach aggregates dense embeddings extracted from CLIP into a compact representation.
We show the effectiveness of our scheme on the task by achieving significantly better results than global feature approaches on three datasets.
arXiv Detail & Related papers (2023-09-26T15:13:09Z)
- Integrating Visual and Semantic Similarity Using Hierarchies for Image Retrieval [0.46040036610482665]
We propose a method for CBIR that captures both visual and semantic similarity using a visual hierarchy.
The hierarchy is constructed by merging classes with overlapping features in the latent space of a deep neural network trained for classification.
Our method achieves superior performance compared to the existing methods on image retrieval.
arXiv Detail & Related papers (2023-08-16T15:23:14Z)
- Progressive Learning for Image Retrieval with Hybrid-Modality Queries [48.79599320198615]
Image retrieval with hybrid-modality queries is also known as composing text and image for image retrieval (CTI-IR).
We decompose the CTI-IR task into a three-stage learning problem to progressively learn the complex knowledge for image retrieval with hybrid-modality queries.
Our proposed model significantly outperforms state-of-the-art methods in mean Recall@K by 24.9% and 9.5% on the Fashion-IQ and Shoes benchmark datasets, respectively.
arXiv Detail & Related papers (2022-04-24T08:10:06Z)
- Cross-Modality Sub-Image Retrieval using Contrastive Multimodal Image Representations [3.3754780158324564]
Cross-modality image retrieval is challenging, since images of similar (or even the same) content captured by different modalities might share few common structures.
We propose a new application-independent content-based image retrieval system for reverse (sub-)image search across modalities.
arXiv Detail & Related papers (2022-01-10T19:04:28Z)
- GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval [2.421459418045937]
We show that large-scale pretraining significantly improves retrieval performance and present experiments on how to further improve it with appropriate fine-tuning.
With these promising results, we hope to increase interest in the research topic of general-purpose CBIR.
arXiv Detail & Related papers (2021-11-25T15:19:21Z)
- Contextual Similarity Aggregation with Self-attention for Visual Re-ranking [96.55393026011811]
We propose a visual re-ranking method by contextual similarity aggregation with self-attention.
We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.
arXiv Detail & Related papers (2021-10-26T06:20:31Z)
- Cross-Modal Retrieval Augmentation for Multi-Modal Classification [61.5253261560224]
We explore the use of unstructured external knowledge sources of images and their corresponding captions for improving visual question answering.
First, we train a novel alignment model for embedding images and captions in the same space, which achieves substantial improvement on image-caption retrieval.
Second, we show that retrieval-augmented multi-modal transformers using the trained alignment model improve results on VQA over strong baselines.
arXiv Detail & Related papers (2021-04-16T13:27:45Z)
- Saliency-driven Class Impressions for Feature Visualization of Deep Neural Networks [55.11806035788036]
It is advantageous to visualize the features considered to be essential for classification.
Existing visualization methods develop high confidence images consisting of both background and foreground features.
In this work, we propose a saliency-driven approach to visualize discriminative features that are considered most important for a given task.
arXiv Detail & Related papers (2020-07-31T06:11:06Z)
- Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for the image restoration task.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.