Self-supervised Multi-view Disentanglement for Expansion of Visual
Collections
- URL: http://arxiv.org/abs/2302.02249v1
- Date: Sat, 4 Feb 2023 22:09:17 GMT
- Title: Self-supervised Multi-view Disentanglement for Expansion of Visual
Collections
- Authors: Nihal Jain, Praneetha Vaddamanu, Paridhi Maheshwari, Vishwa Vinay,
Kuldeep Kulkarni
- Abstract summary: We consider the setting where a query for similar images is derived from a collection of images.
For visual search, the similarity measurements may be made along multiple axes, or views, such as style and color.
Our objective is to design a retrieval algorithm that effectively combines similarities computed over representations from multiple views.
- Score: 6.944742823561
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image search engines enable the retrieval of images relevant to a query
image. In this work, we consider the setting where a query for similar images
is derived from a collection of images. For visual search, the similarity
measurements may be made along multiple axes, or views, such as style and
color. We assume access to a set of feature extractors, each of which computes
representations for a specific view. Our objective is to design a retrieval
algorithm that effectively combines similarities computed over representations
from multiple views. To this end, we propose a self-supervised learning method
for extracting disentangled view-specific representations for images such that
the inter-view overlap is minimized. We show how this allows us to compute the
intent of a collection as a distribution over views. We show how effective
retrieval can be performed by prioritizing candidate expansion images that
match the intent of a query collection. Finally, we present a new querying
mechanism for image search enabled by composing multiple collections and
perform retrieval under this setting using the techniques presented in this
paper.
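As a rough illustration of the retrieval idea in the abstract, the sketch below weights per-view similarities by a collection's intent and ranks candidate expansion images by the combined score. It is not the authors' method. The abstract does not say how intent is computed, so estimating it from per-view cohesion (mean pairwise cosine similarity within the collection, passed through a softmax) is an assumption. The function names, the `temperature` parameter, and the toy two-dimensional vectors are also hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def collection_intent(collection, temperature=0.1):
    """Return a distribution over views for a collection.

    `collection` maps a view name to a list of per-image embeddings.
    ASSUMPTION: a view matters more when the collection's images agree in it.
    """
    cohesion = {}
    for view, vectors in collection.items():
        pairs = list(combinations(vectors, 2))
        cohesion[view] = (
            sum(cosine(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
        )
    # Softmax over views (shifted by the maximum for numerical stability).
    peak = max(cohesion.values())
    exps = {v: math.exp((c - peak) / temperature) for v, c in cohesion.items()}
    total = sum(exps.values())
    return {v: e / total for v, e in exps.items()}


def score_candidate(candidate, collection, intent):
    """Intent-weighted sum of the candidate's mean similarity to the collection."""
    score = 0.0
    for view, weight in intent.items():
        sims = [cosine(candidate[view], q) for q in collection[view]]
        score += weight * sum(sims) / len(sims)
    return score


def expand(collection, candidates, k=1):
    """Rank candidates by intent-weighted similarity and return the top k."""
    intent = collection_intent(collection)
    ranked = sorted(
        candidates,
        key=lambda name: score_candidate(candidates[name], collection, intent),
        reverse=True,
    )
    return ranked[:k], intent


# Toy data: the collection agrees on color but not on style.
collection = {
    "style": [[1.0, 0.0], [0.0, 1.0]],
    "color": [[1.0, 0.0], [0.9, 0.1]],
}
candidates = {
    "same_color": {"style": [0.7, -0.7], "color": [1.0, 0.0]},
    "same_style": {"style": [1.0, 1.0], "color": [0.0, 1.0]},
}
top, intent = expand(collection, candidates, k=1)
print(top, intent)
```

In this toy setup the collection is cohesive in color only, so the intent distribution concentrates on the color view and the color-matching candidate ranks first. A lower `temperature` makes the intent more peaked, and a higher one moves it toward uniform weighting across views.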
Related papers
- Generative Retrieval as Multi-Vector Dense Retrieval [71.75503049199897]
Generative retrieval generates identifiers of relevant documents in an end-to-end manner.
Prior work has demonstrated that generative retrieval with atomic identifiers is equivalent to single-vector dense retrieval.
We show that generative retrieval and multi-vector dense retrieval share the same framework for measuring the relevance of a document to a query.
arXiv Detail & Related papers (2024-03-31T13:29:43Z)
- Object-Centric Open-Vocabulary Image-Retrieval with Aggregated Features [12.14013374452918]
We present a simple yet effective approach to object-centric open-vocabulary image retrieval.
Our approach aggregates dense embeddings extracted from CLIP into a compact representation.
We show the effectiveness of our scheme for the task by achieving significantly better results than global feature approaches on three datasets.
arXiv Detail & Related papers (2023-09-26T15:13:09Z)
- Integrating Visual and Semantic Similarity Using Hierarchies for Image Retrieval [0.46040036610482665]
We propose a method for CBIR that captures both visual and semantic similarity using a visual hierarchy.
The hierarchy is constructed by merging classes with overlapping features in the latent space of a deep neural network trained for classification.
Our method achieves superior performance compared to the existing methods on image retrieval.
arXiv Detail & Related papers (2023-08-16T15:23:14Z)
- Pattern Spotting and Image Retrieval in Historical Documents using Deep Hashing [60.67014034968582]
This paper presents a deep learning approach for image retrieval and pattern spotting in digital collections of historical documents.
Deep learning models are used for feature extraction, considering two distinct variants, which provide either real-valued or binary code representations.
The proposed approach also reduces the search time by up to 200x and the storage cost by up to 6,000x compared to related works.
arXiv Detail & Related papers (2022-08-04T01:39:37Z)
- Probabilistic Compositional Embeddings for Multimodal Image Retrieval [48.450232527041436]
We investigate a more challenging scenario for composing multiple multimodal queries in image retrieval.
Given an arbitrary number of query images and/or texts, our goal is to retrieve target images containing the semantic concepts specified in multiple multimodal queries.
We propose a novel multimodal probabilistic composer (MPC) to learn an informative embedding that can flexibly encode the semantics of various queries.
arXiv Detail & Related papers (2022-04-12T14:45:37Z)
- ARTEMIS: Attention-based Retrieval with Text-Explicit Matching and Implicit Similarity [16.550790981646276]
Current approaches combine the features of each of the two elements of the query into a single representation.
Our work aims at shedding new light on the task by looking at it through the prism of two familiar and related frameworks: text-to-image and image-to-image retrieval.
arXiv Detail & Related papers (2022-03-15T17:29:20Z)
- Contextual Similarity Aggregation with Self-attention for Visual Re-ranking [96.55393026011811]
We propose a visual re-ranking method by contextual similarity aggregation with self-attention.
We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.
arXiv Detail & Related papers (2021-10-26T06:20:31Z)
- Compositional Sketch Search [91.84489055347585]
We present an algorithm for searching image collections using free-hand sketches.
We exploit drawings as a concise and intuitive representation for specifying entire scene compositions.
arXiv Detail & Related papers (2021-06-15T09:38:09Z)
- Compact Deep Aggregation for Set Retrieval [87.52470995031997]
We focus on retrieving images containing multiple faces from a large scale dataset of images.
Here the set consists of the face descriptors in each image, and given a query for multiple identities, the goal is then to retrieve, in order, images which contain all the identities.
We show that this compact descriptor has minimal loss of discriminability up to two faces per image, and degrades slowly after that.
arXiv Detail & Related papers (2020-03-26T08:43:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.