Related papers: Large-to-small Image Resolution Asymmetry in Deep Metric Learning

URL: http://arxiv.org/abs/2210.05463v1
Date: Tue, 11 Oct 2022 14:05:30 GMT
Title: Large-to-small Image Resolution Asymmetry in Deep Metric Learning
Authors: Pavel Suma, Giorgos Tolias
Abstract summary: We explore an asymmetric setup by light-weight processing of the query at a small image resolution to enable fast representation extraction. The goal is to obtain a network for database examples that is trained to operate on large resolution images and benefits from fine-grained image details. We conclude that resolution asymmetry is a better way to optimize the performance/efficiency trade-off than architecture asymmetry.
Score: 13.81293627340993
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep metric learning for vision is trained by optimizing a representation network to map (non-)matching image pairs to (non-)similar representations. During testing, which typically corresponds to image retrieval, both database and query examples are processed by the same network to obtain the representation used for similarity estimation and ranking. In this work, we explore an asymmetric setup by light-weight processing of the query at a small image resolution to enable fast representation extraction. The goal is to obtain a network for database examples that is trained to operate on large resolution images and benefits from fine-grained image details, and a second network for query examples that operates on small resolution images but preserves a representation space aligned with that of the database network. We achieve this with a distillation approach that transfers knowledge from a fixed teacher network to a student via a loss that operates per image and solely relies on coupled augmentations without the use of any labels. In contrast to prior work that explores such asymmetry from the point of view of different network architectures, this work uses the same architecture but modifies the image resolution. We conclude that resolution asymmetry is a better way to optimize the performance/efficiency trade-off than architecture asymmetry. Evaluation is performed on three standard deep metric learning benchmarks, namely CUB200, Cars196, and SOP. Code: https://github.com/pavelsuma/raml
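The label-free distillation described in the abstract can be illustrated with a small plain-Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper or the raml repository: the "network" is a toy linear map over quadrant means, the loss is a simple cosine distance, and the names `embed`, `downsample`, and `distill_loss` are hypothetical. It only shows the overall shape of the setup: one augmentation shared by both branches, a fixed teacher on the large image, and a student on the downsampled image.

```python
import math


def hflip(image):
    """Horizontal flip; stands in for a coupled augmentation shared by both branches."""
    return [row[::-1] for row in image]


def downsample(image, factor):
    """Average-pool a square image (list of lists) by an integer factor."""
    n = len(image) // factor
    return [
        [
            sum(image[i * factor + di][j * factor + dj]
                for di in range(factor) for dj in range(factor)) / factor ** 2
            for j in range(n)
        ]
        for i in range(n)
    ]


def l2_normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def features(image):
    """Toy, resolution-independent features: mean intensity of each quadrant."""
    h, w = len(image), len(image[0])
    quads = []
    for rows in (range(0, h // 2), range(h // 2, h)):
        for cols in (range(0, w // 2), range(w // 2, w)):
            vals = [image[i][j] for i in rows for j in cols]
            quads.append(sum(vals) / len(vals))
    return quads


def embed(image, weights):
    """Toy stand-in for a representation network: linear map, then L2 normalization."""
    feats = features(image)
    return l2_normalize([sum(w * f for w, f in zip(row, feats)) for row in weights])


def distill_loss(image, teacher_w, student_w, factor=2, flip=False):
    """Per-image loss without labels (assumed form: 1 - cosine similarity)."""
    view = hflip(image) if flip else image              # same augmentation for both branches
    target = embed(view, teacher_w)                     # fixed teacher, large resolution
    pred = embed(downsample(view, factor), student_w)   # student, small resolution
    return 1.0 - sum(p * t for p, t in zip(pred, target))
```

In this toy setting, a student that shares the teacher's weights reaches a loss of approximately zero, because average pooling preserves the quadrant means. At test time, database images would be embedded with the teacher weights at full resolution and queries with the student weights after `downsample`, then ranked by dot product.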

Related papers

Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as Parameter-Inverted Image Pyramid Networks (PIIP). Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid. PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z)
CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition. Our method uses the attention mechanism to correlate multiple images within a batch. Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
arXiv Detail & Related papers (2024-02-29T15:05:11Z)
Asymmetric Hash Code Learning for Remote Sensing Image Retrieval [22.91678927865952]
We propose a novel deep hashing method, named asymmetric hash code learning (AHCL), for remote sensing image retrieval. The AHCL generates the hash codes of query and database images in an asymmetric way. The experimental results on three public datasets demonstrate that the proposed method outperforms symmetric methods in terms of retrieval accuracy and efficiency.
arXiv Detail & Related papers (2022-01-15T07:00:38Z)
ACORN: Adaptive Coordinate Networks for Neural Scene Representation [40.04760307540698]
Current neural representations fail to accurately represent images at resolutions greater than a megapixel or 3D scenes with more than a few hundred thousand polygons. We introduce a new hybrid implicit-explicit network architecture and training strategy that adaptively allocates resources during training and inference. We demonstrate the first experiments that fit gigapixel images to nearly 40 dB peak signal-to-noise ratio.
arXiv Detail & Related papers (2021-05-06T16:21:38Z)
Principled network extraction from images [0.0]
We present a principled model to extract network topologies from images that is scalable and efficient. We test our model on real images of the retinal vascular system, slime mold and river networks.
arXiv Detail & Related papers (2020-12-23T15:56:09Z)
Seed the Views: Hierarchical Semantic Alignment for Contrastive Representation Learning [116.91819311885166]
We propose a hierarchical semantic alignment strategy that expands the views generated by a single image to Cross-samples and Multi-level representation. Our method, termed CsMl, can integrate multi-level visual representations across samples in a robust way.
arXiv Detail & Related papers (2020-12-04T17:26:24Z)
A deep primal-dual proximal network for image restoration [8.797434238081372]
We design a deep network, named DeepPDNet, built from primal-dual iterations associated with the minimization of a standard penalized likelihood with an analysis prior. Two learning strategies, "Full learning" and "Partial learning", are proposed; the first is the most efficient numerically. Extensive results show that the proposed DeepPDNet achieves excellent performance on MNIST and the more complex BSD68, BSD100, and SET14 datasets for image restoration and single-image super-resolution tasks.
arXiv Detail & Related papers (2020-07-02T08:29:52Z)
RANSAC-Flow: generic two-stage image alignment [53.11926395028508]
We show that a simple unsupervised approach performs surprisingly well across a range of tasks. Despite its simplicity, our method shows competitive results on a range of tasks and datasets.
arXiv Detail & Related papers (2020-04-03T12:37:58Z)
Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision. We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)
Asymmetric Distribution Measure for Few-shot Learning [82.91276814477126]
Metric-based few-shot image classification aims to measure the relations between query images and support classes. We propose a novel Asymmetric Distribution Measure (ADM) network for few-shot learning. We achieve 3.02% and 1.56% gains over the state-of-the-art method on the 5-way 1-shot task.
arXiv Detail & Related papers (2020-02-01T06:41:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
