Residual Learning for Image Point Descriptors
- URL: http://arxiv.org/abs/2312.15471v1
- Date: Sun, 24 Dec 2023 12:51:30 GMT
- Title: Residual Learning for Image Point Descriptors
- Authors: Rashik Shrestha, Ajad Chhatkuli, Menelaos Kanakis, Luc Van Gool
- Abstract summary: We propose a very simple and effective approach to learning local image descriptors by using a hand-crafted detector and descriptor.
We optimize the final descriptor by leveraging the knowledge already present in the handcrafted descriptor.
Our approach has potential applications in ensemble learning and learning with non-differentiable functions.
- Score: 56.917951170421894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Local image feature descriptors have had a tremendous impact on the
development and application of computer vision methods. It is therefore
unsurprising that significant efforts are being made for learning-based image
point descriptors. However, the advantage of learned methods over handcrafted
methods in real applications is subtle and more nuanced than expected.
Moreover, handcrafted descriptors such as SIFT and SURF still perform better
point localization in Structure-from-Motion (SfM) compared to many learned
counterparts. In this paper, we propose a very simple and effective approach to
learning local image descriptors by using a hand-crafted detector and
descriptor. Specifically, we choose to learn only the descriptors, supported by
handcrafted descriptors while discarding the point localization head. We
optimize the final descriptor by leveraging the knowledge already present in
the handcrafted descriptor. Such an optimization approach allows us to avoid
re-learning knowledge already present in non-differentiable functions such as
the hand-crafted descriptors, and to learn only the residual knowledge in the
main network branch. This yields roughly 50x faster convergence compared to the
standard SuperPoint baseline architecture, while at inference the combined
descriptor outperforms both the learned and the hand-crafted descriptors, at
only a minor increase in computation over the baseline learned descriptor. Our
approach has potential applications in ensemble learning and in learning with
non-differentiable functions. We perform experiments in matching, camera
localization and Structure-from-Motion to showcase the advantages of our
approach.
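The residual combination the abstract describes can be sketched roughly as below. This is an illustrative stand-in, not the paper's architecture: the gradient-histogram "hand-crafted" descriptor, the linear residual branch, and all shapes are assumptions chosen only to show the idea of adding a learned residual to a fixed, non-differentiable descriptor and normalizing the sum.

```python
import numpy as np

rng = np.random.default_rng(0)

def handcrafted_descriptor(patch):
    # Stand-in for a fixed, non-differentiable hand-crafted descriptor
    # (e.g. SIFT-like): a gradient-orientation histogram over the patch.
    gy, gx = np.gradient(patch)
    ang = np.arctan2(gy, gx)
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(ang, bins=8, range=(-np.pi, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-8)

def residual_branch(patch, W):
    # Stand-in for the learned main-network branch: only the residual
    # (what the hand-crafted descriptor misses) needs to be learned.
    return W @ patch.ravel()

def combined_descriptor(patch, W):
    # Final descriptor = hand-crafted part + learned residual, L2-normalized.
    d = handcrafted_descriptor(patch) + residual_branch(patch, W)
    return d / (np.linalg.norm(d) + 1e-8)

patch = rng.standard_normal((8, 8))
W = 0.01 * rng.standard_normal((8, 64))   # small residual at initialization
desc = combined_descriptor(patch, W)
print(desc.shape)
```

Because the residual branch starts near zero, the combined descriptor initially behaves like the hand-crafted one, which is consistent with the fast convergence the abstract reports.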
Related papers
- BEBLID: Boosted efficient binary local image descriptor [2.8538628855541397]
We introduce BEBLID, an efficient learned binary image descriptor.
It improves our previous real-valued descriptor, BELID, making it both more efficient for matching and more accurate.
In experiments BEBLID achieves an accuracy close to SIFT and better computational efficiency than ORB, the fastest algorithm in the literature.
arXiv Detail & Related papers (2024-02-07T00:14:32Z)
- FeatureBooster: Boosting Feature Descriptors with a Lightweight Neural Network [16.10404845106396]
We introduce a lightweight network to improve descriptors of keypoints within the same image.
The network takes the original descriptors and the geometric properties of keypoints as the input.
We use the proposed network to boost both hand-crafted (ORB, SIFT) and the state-of-the-art learning-based descriptors.
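A minimal sketch of this style of descriptor boosting, assuming a toy two-layer MLP and made-up dimensions; the real FeatureBooster network, its inputs, and its training differ, and the geometry fields here (x, y, scale, angle) are only an assumed encoding:

```python
import numpy as np

rng = np.random.default_rng(1)

def boost_descriptor(desc, keypoint_geom, W1, b1, W2, b2):
    # Concatenate the original descriptor with keypoint geometry,
    # then refine it with a tiny two-layer MLP (residual update).
    x = np.concatenate([desc, keypoint_geom])
    h = np.maximum(0.0, W1 @ x + b1)      # ReLU hidden layer
    out = desc + W2 @ h + b2              # boosted descriptor
    return out / (np.linalg.norm(out) + 1e-8)

D, G, H = 32, 4, 16                        # descriptor, geometry, hidden sizes
W1 = 0.1 * rng.standard_normal((H, D + G)); b1 = np.zeros(H)
W2 = 0.1 * rng.standard_normal((D, H));     b2 = np.zeros(D)

desc = rng.standard_normal(D); desc /= np.linalg.norm(desc)
geom = np.array([120.0, 64.0, 1.5, 0.3])   # assumed: x, y, scale, angle
boosted = boost_descriptor(desc, geom, W1, b1, W2, b2)
print(boosted.shape)
```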
arXiv Detail & Related papers (2022-11-28T05:06:03Z)
- Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature Descriptors [101.62384271200169]
A distinctive representation of image patches in form of features is a key component of many computer vision and robotics tasks.
We investigate multi-layer perceptrons (MLPs) to extract low-dimensional but high-quality descriptors.
We consider different applications, including visual localization, patch verification, image matching and retrieval.
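A hedged sketch of MLP-based dimensionality reduction for descriptors; the 128-to-32 projection, layer sizes, and random weights are illustrative assumptions, not the paper's trained model:

```python
import numpy as np

rng = np.random.default_rng(2)

def reduce_descriptor(desc, W1, b1, W2, b2):
    # Map a high-dimensional descriptor (e.g. 128-D, SIFT-sized) to a
    # compact low-dimensional one with a small MLP, then L2-normalize.
    h = np.maximum(0.0, W1 @ desc + b1)    # ReLU hidden layer
    z = W2 @ h + b2
    return z / (np.linalg.norm(z) + 1e-8)

W1 = rng.standard_normal((64, 128)) / np.sqrt(128); b1 = np.zeros(64)
W2 = rng.standard_normal((32, 64)) / np.sqrt(64);   b2 = np.zeros(32)

sift_like = np.abs(rng.standard_normal(128))        # non-negative, SIFT-like
compact = reduce_descriptor(sift_like, W1, b1, W2, b2)
print(compact.shape)
```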
arXiv Detail & Related papers (2022-09-27T17:59:04Z)
- No Token Left Behind: Explainability-Aided Image Classification and Generation [79.4957965474334]
We present a novel explainability-based approach, which adds a loss term to ensure that CLIP focuses on all relevant semantic parts of the input.
Our method yields an improvement in the recognition rate, without additional training or fine-tuning.
arXiv Detail & Related papers (2022-04-11T07:16:39Z)
- UPDesc: Unsupervised Point Descriptor Learning for Robust Registration [54.95201961399334]
UPDesc is an unsupervised method to learn point descriptors for robust point cloud registration.
We show that our learned descriptors yield superior performance over existing unsupervised methods.
arXiv Detail & Related papers (2021-08-05T17:11:08Z)
- Learning and aggregating deep local descriptors for instance-level recognition [11.692327697598175]
Training only requires examples of positive and negative image pairs.
At inference, the local descriptors are provided by the activations of internal components of the network.
We achieve state-of-the-art performance, in some cases even with a backbone network as small as ResNet18.
arXiv Detail & Related papers (2020-07-26T16:30:56Z)
- Learning Feature Descriptors using Camera Pose Supervision [101.56783569070221]
We propose a novel weakly-supervised framework that can learn feature descriptors solely from relative camera poses between images.
Because we no longer need pixel-level ground-truth correspondences, our framework opens up the possibility of training on much larger and more diverse datasets for better and unbiased descriptors.
arXiv Detail & Related papers (2020-04-28T06:35:27Z)
- Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.