Learning and aggregating deep local descriptors for instance-level
recognition
- URL: http://arxiv.org/abs/2007.13172v1
- Date: Sun, 26 Jul 2020 16:30:56 GMT
- Title: Learning and aggregating deep local descriptors for instance-level
recognition
- Authors: Giorgos Tolias, Tomas Jenicek, Ondřej Chum
- Abstract summary: Training only requires examples of positive and negative image pairs.
At inference, the local descriptors are provided by the activations of internal components of the network.
We achieve state-of-the-art performance, in some cases even with a backbone network as small as ResNet18.
- Score: 11.692327697598175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an efficient method to learn deep local descriptors for
instance-level recognition. The training only requires examples of positive and
negative image pairs and is performed as metric learning of sum-pooled global
image descriptors. At inference, the local descriptors are provided by the
activations of internal components of the network. We demonstrate why such an
approach learns local descriptors that work well for image similarity
estimation with classical efficient match kernel methods. The experimental
validation studies the trade-off between performance and memory requirements of
the state-of-the-art image search approach based on match kernels. Compared to
existing local descriptors, the proposed ones perform better in two
instance-level recognition tasks and keep memory requirements lower. We
experimentally show that global descriptors are not effective enough at large
scale and that local descriptors are essential. We achieve state-of-the-art
performance, in some cases even with a backbone network as small as ResNet18.
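The link between training on sum-pooled global descriptors and matching with local descriptors can be illustrated by a simple identity: the inner product of two sum-pooled descriptors equals the sum of inner products over all pairs of local descriptors, which is the plainest form of a match kernel. The Python/NumPy sketch below checks this numerically on random data. The function names, descriptor shapes, margin value, and the choice of a contrastive loss are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np


def sum_pool(local_descs):
    """Sum-pool an (n, d) array of local descriptors into one d-dim global descriptor."""
    return local_descs.sum(axis=0)


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    return v / (np.linalg.norm(v) + eps)


def contrastive_loss(g_a, g_b, is_positive, margin=0.7):
    """Contrastive loss on a pair of global descriptors.

    Assumption: the abstract only says "metric learning" on positive and
    negative pairs; the contrastive form and the margin are hypothetical.
    """
    dist = np.linalg.norm(g_a - g_b)
    if is_positive:
        return 0.5 * dist ** 2                   # pull matching pairs together
    return 0.5 * max(0.0, margin - dist) ** 2    # push non-matching pairs apart


def pairwise_match_kernel(x, y):
    """Sum of inner products over all pairs of local descriptors from two images."""
    return float((x @ y.T).sum())


# Random stand-ins for local descriptors (e.g. activations at spatial locations).
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))   # image A: 5 local descriptors of dimension 8
y = rng.standard_normal((7, 8))   # image B: 7 local descriptors of dimension 8

# Similarity of the sum-pooled global descriptors ...
global_sim = float(sum_pool(x) @ sum_pool(y))
# ... equals the all-pairs similarity of the local descriptors.
local_sim = pairwise_match_kernel(x, y)

# Loss for one (hypothetical) positive pair of normalized global descriptors.
loss = contrastive_loss(l2_normalize(sum_pool(x)), l2_normalize(sum_pool(y)), True)
```

Because the two similarities coincide, a loss applied to the pooled descriptors also shapes the pairwise similarities of the underlying local descriptors. Practical match kernels add selectivity and quantization on top of this all-pairs form.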
Related papers
- Leveraging Semantic Cues from Foundation Vision Models for Enhanced Local Feature Correspondence [12.602194710071116]
This paper presents a new method that uses semantic cues from foundation vision model features to enhance local feature matching.
We present adapted versions of six existing descriptors, with an average increase in performance of 29% in camera localization.
arXiv Detail & Related papers (2024-10-12T13:45:26Z)
- FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization [57.59857784298536]
Direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space.
We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework.
We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements.
arXiv Detail & Related papers (2024-08-21T23:42:16Z)
- A Simple Task-aware Contrastive Local Descriptor Selection Strategy for Few-shot Learning between inter class and intra class [6.204356280380338]
Few-shot image classification aims to classify novel classes with few labeled samples.
Recent research indicates that deep local descriptors have better representational capabilities.
This paper proposes a novel task-aware contrastive local descriptor selection network (TCDSNet).
arXiv Detail & Related papers (2024-08-12T07:04:52Z)
- Residual Learning for Image Point Descriptors [56.917951170421894]
We propose a very simple and effective approach to learning local image descriptors by using a hand-crafted detector and descriptor.
We optimize the final descriptor by leveraging the knowledge already present in the handcrafted descriptor.
Our approach has potential applications in ensemble learning and learning with non-differentiable functions.
arXiv Detail & Related papers (2023-12-24T12:51:30Z)
- TALDS-Net: Task-Aware Adaptive Local Descriptors Selection for Few-shot Image Classification [6.204356280380338]
Few-shot image classification aims to classify images from unseen novel classes with few samples.
Recent works demonstrate that deep local descriptors exhibit enhanced representational capabilities compared to image-level features.
We propose a novel Task-Aware Adaptive Local Descriptors Selection Network (TALDS-Net).
arXiv Detail & Related papers (2023-12-09T03:33:14Z)
- Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature Descriptors [101.62384271200169]
A distinctive representation of image patches in the form of features is a key component of many computer vision and robotics tasks.
We investigate multi-layer perceptrons (MLPs) to extract low-dimensional but high-quality descriptors.
We consider different applications, including visual localization, patch verification, image matching and retrieval.
arXiv Detail & Related papers (2022-09-27T17:59:04Z)
- ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization [71.91942002659795]
We investigate and adapt network quantization techniques to accelerate inference and enable its use on compute-limited platforms.
ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size.
These improvements come at a minor performance degradation as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization.
arXiv Detail & Related papers (2022-03-07T18:59:03Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- ReMarNet: Conjoint Relation and Margin Learning for Small-Sample Image Classification [49.87503122462432]
We introduce a novel neural network termed Relation-and-Margin learning Network (ReMarNet).
Our method assembles two networks with different backbones so as to learn features that perform well under both of the aforementioned classification mechanisms.
Experiments on four image datasets demonstrate that our approach is effective in learning discriminative features from a small set of labeled samples.
arXiv Detail & Related papers (2020-06-27T13:50:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.