Learning-Based Dimensionality Reduction for Computing Compact and
Effective Local Feature Descriptors
- URL: http://arxiv.org/abs/2209.13586v1
- Date: Tue, 27 Sep 2022 17:59:04 GMT
- Title: Learning-Based Dimensionality Reduction for Computing Compact and
Effective Local Feature Descriptors
- Authors: Hao Dong, Xieyuanli Chen, Mihai Dusmanu, Viktor Larsson, Marc
Pollefeys and Cyrill Stachniss
- Abstract summary: A distinctive representation of image patches in the form of features is a key component of many computer vision and robotics tasks.
We investigate multi-layer perceptrons (MLPs) to extract low-dimensional but high-quality descriptors.
We consider different applications, including visual localization, patch verification, image matching and retrieval.
- Score: 101.62384271200169
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A distinctive representation of image patches in the form of features is a key
component of many computer vision and robotics tasks, such as image matching,
image retrieval, and visual localization. State-of-the-art descriptors, from
hand-crafted descriptors such as SIFT to learned ones such as HardNet, are
usually high-dimensional: 128 dimensions or even more. The higher the
dimensionality, the larger the memory consumption and computational time for
approaches using such descriptors. In this paper, we investigate multi-layer
perceptrons (MLPs) to extract low-dimensional but high-quality descriptors. We
thoroughly analyze our method in unsupervised, self-supervised, and supervised
settings, and evaluate the dimensionality reduction results on four
representative descriptors. We consider different applications, including
visual localization, patch verification, image matching and retrieval. The
experiments show that our lightweight MLPs achieve better dimensionality
reduction than PCA. The lower-dimensional descriptors generated by our approach
outperform the original higher-dimensional descriptors in downstream tasks,
especially for the hand-crafted ones. The code will be available at
https://github.com/PRBonn/descriptor-dr.
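The abstract compares learned MLP projections against PCA for compressing descriptors. A minimal numpy sketch of the PCA baseline on toy 128-D descriptors (illustrative only, not the authors' code; the paper's contribution is to replace this linear projection with a small learned MLP):

```python
import numpy as np

# Toy stand-in for 128-D descriptors (e.g. SIFT or HardNet output);
# real descriptors would come from a feature extractor.
rng = np.random.default_rng(0)
descriptors = rng.standard_normal((1000, 128))

def pca_reduce(X, k):
    """Project row-vector descriptors onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

low_dim = pca_reduce(descriptors, 64)
print(low_dim.shape)  # (1000, 64)
```

PCA is the natural baseline here because it is the optimal *linear* map under reconstruction error; the paper's finding is that a lightweight nonlinear MLP can do better for downstream matching tasks.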
Related papers
- Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as Parameter-Inverted Image Pyramid Networks (PIIP).
Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid.
PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z) - Residual Learning for Image Point Descriptors [56.917951170421894]
We propose a very simple and effective approach to learning local image descriptors by using a hand-crafted detector and descriptor.
We optimize the final descriptor by leveraging the knowledge already present in the handcrafted descriptor.
Our approach has potential applications in ensemble learning and learning with non-differentiable functions.
arXiv Detail & Related papers (2023-12-24T12:51:30Z) - ALIKED: A Lighter Keypoint and Descriptor Extraction Network via
Deformable Transformation [27.04762347838776]
We propose the Sparse Deformable Descriptor Head (SDDH), which learns the deformable positions of supporting features for each keypoint and constructs deformable descriptors.
We show that the proposed network is both efficient and powerful in various visual measurement tasks, including image matching, 3D reconstruction, and visual relocalization.
arXiv Detail & Related papers (2023-04-07T12:05:39Z) - ZippyPoint: Fast Interest Point Detection, Description, and Matching
through Mixed Precision Discretization [71.91942002659795]
We investigate and adapt network quantization techniques to accelerate inference and enable its use on compute limited platforms.
ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size.
These improvements come at a minor performance degradation as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization.
arXiv Detail & Related papers (2022-03-07T18:59:03Z) - Local Quadruple Pattern: A Novel Descriptor for Facial Image Recognition
and Retrieval [20.77994516381]
A novel hand-crafted local quadruple pattern (LQPAT) is proposed for facial image recognition and retrieval.
The proposed descriptor encodes relations amongst the neighbours in quadruple space.
The retrieval and recognition accuracies of the proposed descriptor have been compared with state-of-the-art hand-crafted descriptors on benchmark databases.
arXiv Detail & Related papers (2022-01-03T08:04:38Z) - The Intrinsic Dimension of Images and Its Impact on Learning [60.811039723427676]
It is widely believed that natural image data exhibits low-dimensional structure despite the high dimensionality of conventional pixel representations.
In this work, we apply dimension estimation tools to popular datasets and investigate the role of low-dimensional structure in deep learning.
arXiv Detail & Related papers (2021-04-18T16:29:23Z) - Hyperdimensional computing as a framework for systematic aggregation of
image descriptors [4.56877715768796]
We use hyperdimensional computing (HDC) as an approach to combine information from a set of vectors in a single vector of the same dimensionality.
We present an HDC implementation that is suitable for processing the output of existing and future (deep-learning based) image descriptors.
arXiv Detail & Related papers (2021-01-19T16:49:58Z) - Learning Feature Descriptors using Camera Pose Supervision [101.56783569070221]
We propose a novel weakly-supervised framework that can learn feature descriptors solely from relative camera poses between images.
Because we no longer need pixel-level ground-truth correspondences, our framework opens up the possibility of training on much larger and more diverse datasets for better and unbiased descriptors.
arXiv Detail & Related papers (2020-04-28T06:35:27Z)
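The hyperdimensional-computing entry above combines a set of vectors into a single vector of the same dimensionality. A minimal sketch of one classic HDC aggregation operation, bundling bipolar hypervectors by element-wise summation and sign thresholding (illustrative only; the vectors and dimensions are made up, and the cited paper's implementation may differ):

```python
import numpy as np

rng = np.random.default_rng(1)
# Ten 512-D bipolar (+1/-1) hypervectors standing in for image descriptors.
vectors = rng.choice([-1.0, 1.0], size=(10, 512))

def bundle(vs):
    """Aggregate a set of bipolar vectors into one vector of the same
    dimensionality: element-wise sum, then sign thresholding (ties -> +1)."""
    s = vs.sum(axis=0)
    return np.where(s >= 0, 1.0, -1.0)

agg = bundle(vectors)
print(agg.shape)  # (512,)
```

A useful property of bundling is that the aggregate remains similar (positive cosine, in expectation) to each of its members, which is what makes the combined vector queryable.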
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.