Learning to Make Keypoints Sub-Pixel Accurate
- URL: http://arxiv.org/abs/2407.11668v1
- Date: Tue, 16 Jul 2024 12:39:56 GMT
- Title: Learning to Make Keypoints Sub-Pixel Accurate
- Authors: Shinjeong Kim, Marc Pollefeys, Daniel Barath
- Abstract summary: This work addresses the challenge of sub-pixel accuracy in detecting 2D local features.
We propose a novel network that enhances any detector with sub-pixel precision by learning an offset vector for detected features.
- Score: 80.55676599677824
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This work addresses the challenge of sub-pixel accuracy in detecting 2D local features, a cornerstone problem in computer vision. Despite the advancements brought by neural network-based methods like SuperPoint and ALIKED, these modern approaches lag behind classical ones such as SIFT in keypoint localization accuracy due to their lack of sub-pixel precision. We propose a novel network that enhances any detector with sub-pixel precision by learning an offset vector for detected features, thereby eliminating the need for designing specialized sub-pixel accurate detectors. This optimization directly minimizes test-time evaluation metrics like relative pose error. Through extensive testing with both nearest neighbors matching and the recent LightGlue matcher across various real-world datasets, our method consistently outperforms existing methods in accuracy. Moreover, it adds only around 7 ms to the time of a particular detector. The code is available at https://github.com/KimSinjeong/keypt2subpx .
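To make the offset-vector idea concrete, here is a minimal, hypothetical PyTorch sketch of a refinement head that predicts a bounded sub-pixel offset for each detected keypoint from the detector score patch around it. The module name, patch size, and layer widths are illustrative assumptions rather than the released Keypt2Subpx architecture; per the abstract, such a head would be trained so that test-time metrics like relative pose error improve.

```python
# Hypothetical sub-pixel refinement head: predicts a bounded (dx, dy) offset
# per keypoint from the local detector score patch. Names and sizes are
# illustrative, not the authors' exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OffsetHead(nn.Module):
    def __init__(self, patch: int = 5):
        super().__init__()
        self.patch = patch
        self.mlp = nn.Sequential(
            nn.Linear(patch * patch, 64), nn.ReLU(),
            nn.Linear(64, 2), nn.Tanh(),            # raw offset in (-1, 1)
        )

    def forward(self, score_map: torch.Tensor, kpts: torch.Tensor) -> torch.Tensor:
        """score_map: (H, W) detector scores; kpts: (N, 2) integer (x, y) keypoints."""
        r = self.patch // 2
        padded = F.pad(score_map[None, None], (r, r, r, r), mode="replicate")[0, 0]
        patches = torch.stack([
            padded[y : y + self.patch, x : x + self.patch].reshape(-1)
            for x, y in kpts.long().tolist()
        ])
        offsets = 0.5 * self.mlp(patches)            # bound the shift to half a pixel
        return kpts.float() + offsets                # sub-pixel keypoint coordinates
```

Because the predicted offsets are differentiable in the network parameters, such a head can be appended to any frozen detector and supervised with a downstream geometric objective, matching the abstract's claim that the optimization targets test-time metrics directly.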
Related papers
- Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing [0.7305342793164903]
We propose a model simplification method for two-stage object detectors.
Our method reduces computation costs by up to 61.2% with an accuracy loss within 2.1% on the DOTAv1.5 dataset.
arXiv Detail & Related papers (2024-04-11T00:45:10Z)
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects, built on an encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
- Improving the matching of deformable objects by learning to detect keypoints [6.4587163310833855]
We propose a novel learned keypoint detection method to increase the number of correct matches for the task of non-rigid image correspondence.
We train an end-to-end convolutional neural network (CNN) to find keypoint locations that are more appropriate to the considered descriptor.
Experiments demonstrate that numerous descriptors achieve higher Mean Matching Accuracy when used in conjunction with our detection method.
We also apply our method to the complex real-world task of object retrieval, where our detector performs on par with the best keypoint detectors currently available for this task.
arXiv Detail & Related papers (2023-09-01T13:02:19Z)
- DAC: Detector-Agnostic Spatial Covariances for Deep Local Features [11.494662473750505]
Current deep visual local feature detectors do not model the spatial uncertainty of detected features.
We propose two post-hoc covariance estimates that can be plugged into any pretrained deep feature detector.
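As one plausible illustration of such a post-hoc estimate, the sketch below derives a 2x2 spatial covariance from softmax-weighted detector scores around a keypoint. The function name, window radius, and temperature are invented for the example and are not necessarily DAC's exact formulation.

```python
# Sketch: 2x2 spatial covariance of a keypoint from the local detector scores.
# A plausible post-hoc estimate, not necessarily the DAC formulation.
import numpy as np

def keypoint_covariance(score_map: np.ndarray, x: int, y: int,
                        r: int = 2, temp: float = 1.0) -> np.ndarray:
    # Assumes the keypoint lies at least r pixels from the image border.
    patch = score_map[y - r : y + r + 1, x - r : x + r + 1]
    w = np.exp(patch / temp)
    w /= w.sum()                                   # softmax weights over the patch
    dy, dx = np.mgrid[-r : r + 1, -r : r + 1]      # pixel offsets from the keypoint
    mx, my = (w * dx).sum(), (w * dy).sum()        # weighted mean offset
    cxx = (w * (dx - mx) ** 2).sum()
    cyy = (w * (dy - my) ** 2).sum()
    cxy = (w * (dx - mx) * (dy - my)).sum()
    return np.array([[cxx, cxy], [cxy, cyy]])      # covariance in pixel^2
```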
arXiv Detail & Related papers (2023-05-20T17:43:09Z)
- Learning to Detect Good Keypoints to Match Non-Rigid Objects in RGB Images [7.428474910083337]
We present a novel learned keypoint detection method designed to maximize the number of correct matches for the task of non-rigid image correspondence.
Our training framework uses true correspondences, obtained by matching annotated image pairs with a predefined descriptor extractor, as ground truth to train a convolutional neural network (CNN).
Experiments show that our method outperforms the state-of-the-art keypoint detector on real images of non-rigid objects by 20 percentage points in Mean Matching Accuracy.
arXiv Detail & Related papers (2022-12-13T11:59:09Z)
- ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor Extraction [21.994171434960734]
We present a differentiable keypoint detection module, which outputs accurate sub-pixel keypoints.
A reprojection loss is then proposed to directly optimize these sub-pixel keypoints, and a dispersity peak loss is presented for accurate keypoint regularization.
A lightweight network is designed for keypoint detection and descriptor extraction, which can run at 95 frames per second for 640x480 images on a commercial GPU.
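A local soft-argmax is a common way to obtain differentiable sub-pixel keypoints of this kind; the hedged sketch below shows the idea. Window size, temperature, and function name are illustrative and not taken from the ALIKE code.

```python
# Sketch: differentiable sub-pixel refinement via a local soft-argmax.
# Illustrative of the idea only; not the ALIKE implementation.
import torch
import torch.nn.functional as F

def soft_argmax_refine(score_map, kpts, win: int = 5, temp: float = 0.1):
    """score_map: (H, W); kpts: (N, 2) integer (x, y). Returns (N, 2) sub-pixel points."""
    r = win // 2
    padded = F.pad(score_map[None, None], (r, r, r, r), mode="replicate")[0, 0]
    offs = torch.stack(torch.meshgrid(torch.arange(-r, r + 1),
                                      torch.arange(-r, r + 1),
                                      indexing="ij"), dim=-1).float()  # (win, win, 2) = (dy, dx)
    refined = []
    for x, y in kpts.long().tolist():
        w = F.softmax(padded[y : y + win, x : x + win].reshape(-1) / temp, dim=0).reshape(win, win)
        dy = (w * offs[..., 0]).sum()              # expected row offset
        dx = (w * offs[..., 1]).sum()              # expected column offset
        refined.append(torch.stack([x + dx, y + dy]))
    return torch.stack(refined)
```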
arXiv Detail & Related papers (2021-12-06T10:10:30Z)
- Pixel-Perfect Structure-from-Motion with Featuremetric Refinement [96.73365545609191]
We refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views.
This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors.
Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
arXiv Detail & Related papers (2021-08-18T17:58:55Z)
- Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
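For the point-only part of the problem, OpenCV's RANSAC-based PnP solver already gives a minimal working example; handling line correspondences as in PnPL needs a dedicated solver that this sketch does not cover, and all data below are toy placeholders.

```python
# Minimal point-only PnP with RANSAC on toy data; PnPL additionally uses lines.
import cv2
import numpy as np

object_pts = np.random.rand(20, 3).astype(np.float32)         # 3D model points (placeholder)
image_pts = (np.random.rand(20, 2) * 640).astype(np.float32)  # 2D detections (placeholder)
K = np.array([[600.0, 0, 320], [0, 600, 240], [0, 0, 1]])     # pinhole intrinsics (assumed)

ok, rvec, tvec, inliers = cv2.solvePnPRansac(object_pts, image_pts, K, None,
                                             reprojectionError=3.0)
if ok:
    R, _ = cv2.Rodrigues(rvec)   # rotation matrix from the axis-angle vector
    print("R =\n", R, "\nt =", tvec.ravel(),
          "\n#inliers =", 0 if inliers is None else len(inliers))
```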
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
- Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z)
- Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively less-studied classification-based methodology for rotation detection.
We propose new techniques to push its frontier in two aspects.
Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
arXiv Detail & Related papers (2020-11-19T05:42:02Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.