AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval
- URL: http://arxiv.org/abs/2408.03282v1
- Date: Tue, 6 Aug 2024 16:29:51 GMT
- Title: AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval
- Authors: Pavel Suma, Giorgos Kordopatis-Zilos, Ahmet Iscen, Giorgos Tolias
- Abstract summary: This work investigates the problem of instance-level image retrieval re-ranking with the constraint of memory efficiency.
The proposed model uses a transformer-based architecture designed to estimate image-to-image similarity.
Results on standard benchmarks demonstrate the superiority of our approach over both hand-crafted and learned models.
- Score: 14.009257997448634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work investigates the problem of instance-level image retrieval re-ranking with the constraint of memory efficiency, ultimately aiming to limit memory usage to 1KB per image. Departing from the prevalent focus on performance enhancements, this work prioritizes the crucial trade-off between performance and memory requirements. The proposed model uses a transformer-based architecture designed to estimate image-to-image similarity by capturing interactions within and across images based on their local descriptors. A distinctive property of the model is the capability for asymmetric similarity estimation. Database images are represented with a smaller number of descriptors compared to query images, enabling performance improvements without increasing memory consumption. To ensure adaptability across different applications, a universal model is introduced that adjusts to a varying number of local descriptors during the testing phase. Results on standard benchmarks demonstrate the superiority of our approach over both hand-crafted and learned models. In particular, compared with current state-of-the-art methods that overlook their memory footprint, our approach not only attains superior performance but does so with a significantly reduced memory footprint. The code and pretrained models are publicly available at: https://github.com/pavelsuma/ames
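The asymmetric setting described above can be sketched with plain scaled dot-product cross-attention: query descriptors attend to a smaller set of database descriptors, and the attended matches are scored by cosine similarity. This is a minimal illustration of the general idea, not the AMES architecture itself; all shapes, descriptor counts, and the final pooling are illustrative assumptions.

```python
import numpy as np

def cross_attention_similarity(query_desc, db_desc):
    """Score a query/database image pair from their local descriptors.

    query_desc: (M, d) local descriptors of the query image.
    db_desc:    (K, d) local descriptors of the database image; K < M is the
                asymmetric setting, where database images are stored with
                fewer descriptors to save memory.
    """
    d = query_desc.shape[1]
    # Scaled dot-product attention from query descriptors to database descriptors.
    logits = query_desc @ db_desc.T / np.sqrt(d)           # (M, K)
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)          # softmax over the K database descriptors
    attended = weights @ db_desc                           # (M, d) soft best-match per query descriptor
    # Cosine similarity between each query descriptor and its attended match,
    # averaged into a single image-to-image score.
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    a = attended / np.linalg.norm(attended, axis=1, keepdims=True)
    return float((q * a).sum(axis=1).mean())

rng = np.random.default_rng(0)
query_descriptors = rng.normal(size=(600, 64))  # query kept at a larger descriptor count
db_descriptors = rng.normal(size=(100, 64))     # database image stored with fewer descriptors
score = cross_attention_similarity(query_descriptors, db_descriptors)
self_score = cross_attention_similarity(db_descriptors, db_descriptors)
```

Comparing an image against itself scores higher than comparing two random images, which is the minimal sanity check one would expect from any similarity estimator of this shape.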
Related papers
- Addressing Issues with Working Memory in Video Object Segmentation [37.755852787082254]
Video object segmentation (VOS) models compare incoming unannotated images to a history of image-mask relations.
Current state-of-the-art models perform very well on clean video data.
Their reliance on a working memory of previous frames leaves room for error.
A simple algorithmic change is proposed that can be applied to any existing working memory-based VOS model.
arXiv Detail & Related papers (2024-10-29T18:34:41Z)
- Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
They are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
- GeneCIS: A Benchmark for General Conditional Image Similarity [21.96493413291777]
We argue that there are many notions of 'similarity' and that models, like humans, should be able to adapt to these dynamically.
We propose the GeneCIS benchmark, which measures models' ability to adapt to a range of similarity conditions.
arXiv Detail & Related papers (2023-06-13T17:59:58Z)
- Improving Image Recognition by Retrieving from Web-Scale Image-Text Data [68.63453336523318]
We introduce an attention-based memory module, which learns the importance of each retrieved example from the memory.
Compared to existing approaches, our method removes the influence of the irrelevant retrieved examples, and retains those that are beneficial to the input query.
We show that it achieves state-of-the-art accuracies in ImageNet-LT, Places-LT and Webvision datasets.
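An attention-based memory module of the kind described above can be sketched as a softmax weighting of retrieved examples by their relevance to the input query, so that irrelevant examples contribute near-zero weight. The shapes and toy values below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def attend_to_memory(query, mem_keys, mem_values):
    """Aggregate retrieved examples, weighting each by relevance to the query.

    query:      (d,) embedding of the input image.
    mem_keys:   (N, d) embeddings of the N retrieved examples.
    mem_values: (N, c) per-example information to aggregate (e.g. class scores).
    """
    logits = mem_keys @ query / np.sqrt(query.shape[0])
    w = np.exp(logits - logits.max())
    w /= w.sum()          # softmax: irrelevant examples get near-zero weight
    return w @ mem_values  # (c,) attention-weighted aggregate

# Toy memory: three retrieved examples with 4-d keys and 2-d values.
mem_keys = np.eye(3, 4) * 10.0
mem_values = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
query = np.array([10.0, 0.0, 0.0, 0.0])  # closely matches the first example
agg = attend_to_memory(query, mem_keys, mem_values)
```

Because the query aligns with the first key only, the aggregate is dominated by the first example's value, showing how the weighting suppresses the influence of the other retrieved examples.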
arXiv Detail & Related papers (2023-04-11T12:12:05Z)
- Asymmetric Image Retrieval with Cross Model Compatible Ensembles [4.86935886318034]
Asymmetrical retrieval is a well-suited solution for resource-constrained applications such as face recognition and image retrieval.
We present an approach that does not rely on knowledge distillation, rather it utilizes embedding transformation models.
We improve the overall accuracy beyond that of any single model while maintaining a low computational budget for querying.
arXiv Detail & Related papers (2023-03-30T16:53:07Z)
- A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model with limited memory size to meet this requirement.
We show that when counting the model size into the total budget and comparing methods with aligned memory size, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
arXiv Detail & Related papers (2022-05-26T08:24:01Z)
- Recall@k Surrogate Loss with Large Batches and Similarity Mixup [62.67458021725227]
Direct optimization, by gradient descent, of an evaluation metric is not possible when it is non-differentiable.
In this work, a differentiable surrogate loss for the recall is proposed.
The proposed method achieves state-of-the-art results in several image retrieval benchmarks.
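One common way to build a differentiable recall surrogate is to relax the hard rank indicators with sigmoids of score differences; the sketch below follows that general recipe with an illustrative temperature and offset, and is an assumption about the form of such a loss, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def smooth_recall_at_k(pos_score, neg_scores, k, tau=0.1):
    """Differentiable surrogate for whether the positive lands in the top k.

    The hard rank  1 + sum_j [s_j > s_pos]  is relaxed by replacing each
    indicator with a sigmoid of the temperature-scaled score difference,
    and the hard cutoff [rank <= k] with another sigmoid.
    """
    soft_rank = 1.0 + sigmoid((neg_scores - pos_score) / tau).sum()
    # The 0.5 offset centres the soft cutoff between ranks k and k+1.
    return sigmoid((k + 0.5 - soft_rank) / tau)

# Positive scored above all negatives -> surrogate near 1.
high = smooth_recall_at_k(0.9, np.array([0.1, 0.2, 0.3]), k=1)
# Two negatives outscore the positive -> surrogate near 0 for k=1.
low = smooth_recall_at_k(0.4, np.array([0.9, 0.8, 0.1]), k=1)
```

Unlike the hard recall metric, every term here has a nonzero gradient with respect to the scores, so the surrogate can be optimized directly by gradient descent.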
arXiv Detail & Related papers (2021-08-25T11:09:11Z)
- Multiscale Deep Equilibrium Models [162.15362280927476]
We propose a new class of implicit networks, the multiscale deep equilibrium model (MDEQ)
An MDEQ directly solves for and backpropagates through the equilibrium points of multiple feature resolutions simultaneously.
We illustrate the effectiveness of this approach on two large-scale vision tasks: ImageNet classification and semantic segmentation on high-resolution images from the Cityscapes dataset.
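The equilibrium idea behind implicit networks can be sketched at its simplest: define the output as the fixed point of one transformation rather than the result of stacked layers. The single-resolution toy below (a `tanh` layer with a small-norm weight matrix, so the map is a contraction) is an assumption for illustration; MDEQ solves such equilibria jointly over multiple feature resolutions.

```python
import numpy as np

def equilibrium_forward(W, x, tol=1e-8, max_iter=500):
    """Solve z* = tanh(W @ z* + x) by fixed-point iteration.

    Implicit models define their output as this equilibrium rather than as
    the activations after a fixed number of layers; gradients can then be
    obtained through the equilibrium point via the implicit function theorem.
    """
    z = np.zeros_like(x)
    for _ in range(max_iter):
        z_next = np.tanh(W @ z + x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

rng = np.random.default_rng(0)
W = 0.05 * rng.normal(size=(16, 16))  # small spectral norm -> contraction, iteration converges
x = rng.normal(size=16)
z_star = equilibrium_forward(W, x)
```

The returned point satisfies the fixed-point equation to high precision, which is what makes "the equilibrium" a well-defined network output independent of how many iterations were run.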
arXiv Detail & Related papers (2020-06-15T18:07:44Z)
- Memory-Efficient Incremental Learning Through Feature Adaptation [71.1449769528535]
We introduce an approach for incremental learning that preserves feature descriptors of training images from previously learned classes.
Keeping the much lower-dimensional feature embeddings of images reduces the memory footprint significantly.
Experimental results show that our method achieves state-of-the-art classification accuracy in incremental learning benchmarks.
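The memory argument behind storing low-dimensional embeddings instead of larger representations is simple arithmetic; the dimensionalities below are illustrative assumptions, not figures from either paper, but the 256-d case shows how a 1 KB-per-image budget like the one targeted above comes about.

```python
def footprint_bytes(n_items, dim, bytes_per_value=4):
    """Memory for n_items vectors of `dim` float32 values each."""
    return n_items * dim * bytes_per_value

per_image_full = footprint_bytes(1, 2048)   # a 2048-d float32 descriptor per image
per_image_embed = footprint_bytes(1, 256)   # a 256-d embedding per image: 1 KB
savings_factor = per_image_full // per_image_embed
```

Keeping the smaller embedding shrinks the per-image footprint by the ratio of the dimensionalities, which is why feature-level storage scales to far larger galleries under the same memory budget.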
arXiv Detail & Related papers (2020-04-01T21:16:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.