ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor
Extraction
- URL: http://arxiv.org/abs/2112.02906v1
- Date: Mon, 6 Dec 2021 10:10:30 GMT
- Title: ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor
Extraction
- Authors: Xiaoming Zhao, Xingming Wu, Jinyu Miao, Weihai Chen, Peter C. Y. Chen,
and Zhengguo Li
- Abstract summary: We present a differentiable keypoint detection module, which outputs accurate sub-pixel keypoints.
The reprojection loss is then proposed to directly optimize these sub-pixel keypoints, and the dispersity peak loss is presented for accurate keypoint regularization.
A lightweight network is designed for keypoint detection and descriptor extraction, which can run at 95 frames per second for 640x480 images on a commercial GPU.
- Score: 21.994171434960734
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Existing methods detect keypoints in a non-differentiable way; therefore,
they cannot directly optimize the position of keypoints through
back-propagation. To address this issue, we present a differentiable keypoint
detection module, which outputs accurate sub-pixel keypoints. The reprojection
loss is then proposed to directly optimize these sub-pixel keypoints, and the
dispersity peak loss is presented for accurate keypoint regularization. We
also extract the descriptors in a sub-pixel way, and they are trained with the
stable neural reprojection error loss. Moreover, a lightweight network is
designed for keypoint detection and descriptor extraction, which can run at 95
frames per second for 640x480 images on a commercial GPU. On homography
estimation, camera pose estimation, and visual (re-)localization tasks, the
proposed method achieves performance equivalent to the state-of-the-art
approaches while greatly reducing the inference time.
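The abstract does not spell out how the detection module is made differentiable. One common way to obtain sub-pixel, differentiable keypoint positions is a soft-argmax over a small window around each score-map peak; the sketch below illustrates that general idea only and is not taken from the paper. The function names, the window radius, and the `temperature` parameter are all hypothetical.

```python
import math


def soft_argmax_2d(patch, temperature=0.1):
    """Return the softmax-weighted (dx, dy) offset from the patch center.

    patch: square list of lists of scores with odd side length.
    temperature: hypothetical smoothing parameter (an assumption, not from
    the paper); smaller values pull the result toward the hard argmax.
    """
    radius = len(patch) // 2
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(max(row) for row in patch)
    weights = [[math.exp((s - peak) / temperature) for s in row] for row in patch]
    total = sum(sum(row) for row in weights)
    # Expected column and row offsets under the softmax distribution.
    dx = sum(w * (j - radius) for row in weights for j, w in enumerate(row)) / total
    dy = sum(w * (i - radius) for i, row in enumerate(weights) for w in row) / total
    return dx, dy


def refine_keypoint(score_map, x, y, radius=1, temperature=0.1):
    """Refine an integer peak (x, y) to sub-pixel coordinates.

    Assumes the peak lies at least `radius` pixels away from the map border.
    """
    patch = [row[x - radius:x + radius + 1]
             for row in score_map[y - radius:y + radius + 1]]
    dx, dy = soft_argmax_2d(patch, temperature)
    return x + dx, y + dy


# Usage: the right-hand neighbour of the peak is stronger than the left-hand
# one, so the refined x coordinate shifts to the right of the integer peak.
score_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.5, 0.2, 0.0],
    [0.0, 0.5, 1.0, 0.8, 0.0],
    [0.0, 0.2, 0.5, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
print(refine_keypoint(score_map, 2, 2))
```

Because the output is a weighted average of pixel offsets, it varies smoothly with the scores, which is what would let a loss on keypoint positions propagate gradients back to the score map in an autodiff framework.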
Related papers
- Learning to Make Keypoints Sub-Pixel Accurate [80.55676599677824]
This work addresses the challenge of sub-pixel accuracy in detecting 2D local features.
We propose a novel network that enhances any detector with sub-pixel precision by learning an offset vector for detected features.
arXiv Detail & Related papers (2024-07-16T12:39:56Z)
- Single Image Depth Prediction Made Better: A Multivariate Gaussian Take [163.14849753700682]
We introduce an approach that performs continuous modeling of per-pixel depth.
The accuracy of our method (named MG) is among the top on the KITTI depth-prediction benchmark leaderboard.
arXiv Detail & Related papers (2023-03-31T16:01:03Z)
- KGNv2: Separating Scale and Pose Prediction for Keypoint-based 6-DoF
Grasp Synthesis on RGB-D input [16.897624250286487]
Keypoint-based grasp detectors that take image input have demonstrated promising results.
We devise a new grasp generation network that reduces the dependency on precise keypoint estimation.
arXiv Detail & Related papers (2023-03-09T23:11:52Z)
- BALF: Simple and Efficient Blur Aware Local Feature Detector [14.044093492945334]
Local feature detection is a key ingredient of many image processing and computer vision applications.
We propose a simple yet efficient and effective keypoint detection method that can accurately localize the salient keypoints in a blurred image.
Our method takes advantage of a novel multi-layer perceptron (MLP) based architecture that significantly improves the detection repeatability for a blurred image.
arXiv Detail & Related papers (2022-11-27T05:29:57Z)
- Self-Supervised Equivariant Learning for Oriented Keypoint Detection [35.94215211409985]
We introduce a self-supervised learning framework using rotation-equivariant CNNs to learn to detect robust oriented keypoints.
We propose a dense orientation alignment loss, computed on an image pair generated by synthetic transformations, for training a histogram-based orientation map.
Our method outperforms the previous methods on an image matching benchmark and a camera pose estimation benchmark.
arXiv Detail & Related papers (2022-04-19T02:26:07Z)
- Rethinking Keypoint Representations: Modeling Keypoints and Poses as
Objects for Multi-Person Human Pose Estimation [79.78017059539526]
We propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based detection framework.
In experiments, we observe that KAPAO is significantly faster and more accurate than previous methods, which suffer greatly from heatmap post-processing.
Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft COCO Keypoints validation set without test-time augmentation.
arXiv Detail & Related papers (2021-11-16T15:36:44Z)
- Pixel-Perfect Structure-from-Motion with Featuremetric Refinement [96.73365545609191]
We refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views.
This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors.
Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
arXiv Detail & Related papers (2021-08-18T17:58:55Z)
- You Better Look Twice: a new perspective for designing accurate
detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computation by separating objects from background using a very light first stage.
The resulting image proposals are then processed in the second stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
- OSKDet: Towards Orientation-sensitive Keypoint Localization for Rotated
Object Detection [0.0]
We propose OSKDet, an orientation-sensitive, keypoint-based rotated object detector.
We adopt a set of keypoints to characterize the target and predict the keypoint heatmap on ROI to form a rotated target.
We achieve an AP of 77.81% on DOTA, 89.91% on HRSC2016, and 97.18% on UCAS-AOD, respectively.
arXiv Detail & Related papers (2021-04-18T03:40:52Z)
- 6DoF Object Pose Estimation via Differentiable Proxy Voting Loss [113.72905482334767]
We develop a differentiable proxy voting loss (DPVL) which mimics the hypothesis selection in the voting procedure.
Experiments on widely used datasets, i.e., LINEMOD and Occlusion LINEMOD, show that our DPVL improves pose estimation performance significantly.
arXiv Detail & Related papers (2020-02-10T16:33:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.