ALIKED: A Lighter Keypoint and Descriptor Extraction Network via
Deformable Transformation
- URL: http://arxiv.org/abs/2304.03608v2
- Date: Sun, 16 Apr 2023 01:22:54 GMT
- Title: ALIKED: A Lighter Keypoint and Descriptor Extraction Network via
Deformable Transformation
- Authors: Xiaoming Zhao, Xingming Wu, Weihai Chen, Peter C. Y. Chen, Qingsong
Xu, and Zhengguo Li
- Abstract summary: We propose the Sparse Deformable Descriptor Head (SDDH), which learns the deformable positions of supporting features for each keypoint and constructs deformable descriptors.
We show that the proposed network is both efficient and powerful in various visual measurement tasks, including image matching, 3D reconstruction, and visual relocalization.
- Score: 27.04762347838776
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Image keypoints and descriptors play a crucial role in many visual
measurement tasks. In recent years, deep neural networks have been widely used
to improve the performance of keypoint and descriptor extraction. However,
conventional convolution operations do not provide the geometric invariance
required for descriptors. To address this issue, we propose the Sparse
Deformable Descriptor Head (SDDH), which learns the deformable positions of
supporting features for each keypoint and constructs deformable descriptors.
Furthermore, SDDH extracts descriptors at sparse keypoints instead of a dense
descriptor map, which enables efficient extraction of descriptors with strong
expressiveness. In addition, we relax the neural reprojection error (NRE) loss
from dense to sparse to train the extracted sparse descriptors. Experimental
results show that the proposed network is both efficient and powerful in
various visual measurement tasks, including image matching, 3D reconstruction,
and visual relocalization.
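The abstract's core idea, sampling a sparse set of deformable supporting features around each keypoint and aggregating them into a descriptor, can be sketched in NumPy. This is an illustrative mock-up, not the ALIKED implementation: in the actual network the offsets and aggregation weights are predicted by learned layers, whereas here they are plain inputs.

```python
import numpy as np

def bilinear_sample(fmap, pts):
    """Sample a C-channel feature map (C, H, W) at float (x, y) points (N, 2)."""
    C, H, W = fmap.shape
    x = np.clip(pts[:, 0], 0, W - 1.001)
    y = np.clip(pts[:, 1], 0, H - 1.001)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = x0 + 1, y0 + 1
    wx, wy = x - x0, y - y0
    top = fmap[:, y0, x0] * (1 - wx) + fmap[:, y0, x1] * wx   # (C, N)
    bot = fmap[:, y1, x0] * (1 - wx) + fmap[:, y1, x1] * wx
    return (top * (1 - wy) + bot * wy).T                      # (N, C)

def sparse_deformable_descriptors(fmap, keypoints, offsets, weights):
    """
    fmap:      (C, H, W) dense feature map
    keypoints: (N, 2) keypoint locations (x, y)
    offsets:   (N, M, 2) deformable offsets of M supporting features per keypoint
               (in ALIKED these would be predicted by the network; here: inputs)
    weights:   (M,) aggregation weights over supporting features
    returns:   (N, C) L2-normalized descriptors, extracted only at the sparse
               keypoints rather than as a dense descriptor map
    """
    N, M, _ = offsets.shape
    support = keypoints[:, None, :] + offsets               # (N, M, 2)
    feats = bilinear_sample(fmap, support.reshape(-1, 2)).reshape(N, M, -1)
    desc = (feats * weights[None, :, None]).sum(axis=1)     # weighted aggregation
    return desc / (np.linalg.norm(desc, axis=1, keepdims=True) + 1e-8)
```

The efficiency argument in the abstract falls out of the shapes: descriptors are computed for N keypoints times M supporting samples, rather than for all H×W positions of a dense map.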
Related papers
- PEVA-Net: Prompt-Enhanced View Aggregation Network for Zero/Few-Shot Multi-View 3D Shape Recognition [8.15444057380305]
We focus on exploiting the large vision-language model, i.e., CLIP, to address zero/few-shot 3D shape recognition.
We propose Prompt-Enhanced View Aggregation Network (PEVA-Net) to simultaneously address zero/few-shot 3D shape recognition.
arXiv Detail & Related papers (2024-04-30T00:16:59Z)
- KDD-LOAM: Jointly Learned Keypoint Detector and Descriptors Assisted LiDAR Odometry and Mapping [9.609585217048664]
We propose a tightly coupled keypoint detector and descriptor based on a multi-task fully convolutional network with a probabilistic detection loss.
Experiments on both indoor and outdoor datasets show that our TCKDD achieves state-of-the-art performance in point cloud registration.
We also design a keypoint detector and descriptors-assisted LiDAR odometry and mapping framework (KDD-LOAM), whose real-time odometry relies on keypoint descriptor matching-based RANSAC.
arXiv Detail & Related papers (2023-09-27T04:10:52Z)
- Mini-PointNetPlus: a local feature descriptor in deep learning model for 3d environment perception [7.304195370862869]
We propose a novel local feature descriptor, mini-PointNetPlus, as an alternative for plug-and-play to PointNet.
Our basic idea is to separately project the data points onto each of the features considered, with each projection yielding a permutation-invariant representation.
Due to fully utilizing the features by the proposed descriptor, we demonstrate in experiment a considerable performance improvement for 3D perception.
arXiv Detail & Related papers (2023-07-25T07:30:28Z)
- Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature Descriptors [101.62384271200169]
A distinctive representation of image patches in form of features is a key component of many computer vision and robotics tasks.
We investigate multi-layer perceptrons (MLPs) to extract low-dimensional but high-quality descriptors.
We consider different applications, including visual localization, patch verification, image matching and retrieval.
arXiv Detail & Related papers (2022-09-27T17:59:04Z)
- Neural Implicit Dictionary via Mixture-of-Expert Training [111.08941206369508]
We present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID).
Our NID assembles a group of coordinate-based implicit networks which are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes roughly two orders of magnitude faster with up to 98% less input data.
arXiv Detail & Related papers (2022-07-08T05:07:19Z)
- ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization [71.91942002659795]
We investigate and adapt network quantization techniques to accelerate inference and enable its use on compute limited platforms.
ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size.
These improvements come at a minor performance degradation as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization.
arXiv Detail & Related papers (2022-03-07T18:59:03Z)
- Robust Place Recognition using an Imaging Lidar [45.37172889338924]
We propose a methodology for robust, real-time place recognition using an imaging lidar.
Our method is truly invariant and can handle reverse revisiting and upside-down revisiting.
arXiv Detail & Related papers (2021-03-03T01:08:31Z)
- Fine-Grained Dynamic Head for Object Detection [68.70628757217939]
We propose a fine-grained dynamic head to conditionally select a pixel-level combination of FPN features from different scales for each instance.
Experiments demonstrate the effectiveness and efficiency of the proposed method on several state-of-the-art detection benchmarks.
arXiv Detail & Related papers (2020-12-07T08:16:32Z)
- Learning Feature Descriptors using Camera Pose Supervision [101.56783569070221]
We propose a novel weakly-supervised framework that can learn feature descriptors solely from relative camera poses between images.
Because we no longer need pixel-level ground-truth correspondences, our framework opens up the possibility of training on much larger and more diverse datasets for better and unbiased descriptors.
arXiv Detail & Related papers (2020-04-28T06:35:27Z)
- MGCN: Descriptor Learning using Multiscale GCNs [50.14172863706108]
We present a new non-learned feature that uses graph wavelets to decompose the Dirichlet energy on a surface.
We also propose a new graph convolutional network (MGCN) to transform a non-learned feature to a more discriminative descriptor.
arXiv Detail & Related papers (2020-01-28T17:25:14Z)
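Several entries above (notably ZippyPoint) rely on binary descriptors matched by Hamming distance. As a generic illustration of why binarization speeds up matching — not ZippyPoint's actual quantization scheme, whose details are not given here — a sign-binarization plus XOR-popcount mutual nearest-neighbor matcher might look like this:

```python
import numpy as np

def binarize(desc):
    """Sign-binarize float descriptors (N, D) and pack each row into D/8 bytes."""
    return np.packbits(desc > 0, axis=1)

def hamming_matrix(a, b):
    """Pairwise Hamming distances between packed binary descriptors."""
    xor = a[:, None, :] ^ b[None, :, :]            # (Na, Nb, D/8) differing bytes
    return np.unpackbits(xor, axis=2).sum(axis=2)  # popcount -> bit differences

def mutual_match(a, b):
    """Mutual nearest-neighbor matching on Hamming distance."""
    d = hamming_matrix(a, b)
    ab = d.argmin(axis=1)   # best match in b for each descriptor in a
    ba = d.argmin(axis=0)   # best match in a for each descriptor in b
    return [(i, j) for i, j in enumerate(ab) if ba[j] == i]
```

The speed and size gains claimed by such methods come from the representation: a 256-bit packed descriptor occupies 32 bytes instead of 1 KB of floats, and its distance is a handful of XOR/popcount operations rather than a floating-point dot product.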
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.