Self-Supervised Equivariant Learning for Oriented Keypoint Detection
- URL: http://arxiv.org/abs/2204.08613v1
- Date: Tue, 19 Apr 2022 02:26:07 GMT
- Title: Self-Supervised Equivariant Learning for Oriented Keypoint Detection
- Authors: Jongmin Lee, Byungjin Kim, Minsu Cho
- Abstract summary: We introduce a self-supervised learning framework using rotation-equivariant CNNs to learn to detect robust oriented keypoints.
We propose a dense orientation alignment loss by an image pair generated by synthetic transformations for training a histogram-based orientation map.
Our method outperforms the previous methods on an image matching benchmark and a camera pose estimation benchmark.
- Score: 35.94215211409985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting robust keypoints from an image is an integral part of many computer
vision problems, and the characteristic orientation and scale of keypoints play
an important role for keypoint description and matching. Existing
learning-based methods for keypoint detection rely on standard
translation-equivariant CNNs but often fail to detect reliable keypoints
against geometric variations. To learn to detect robust oriented keypoints, we
introduce a self-supervised learning framework using rotation-equivariant CNNs.
We propose a dense orientation alignment loss by an image pair generated by
synthetic transformations for training a histogram-based orientation map. Our
method outperforms the previous methods on an image matching benchmark and a
camera pose estimation benchmark.
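As a rough illustration of the dense orientation alignment idea described in the abstract, the sketch below assumes that each pixel carries an orientation histogram and that a synthetic rotation of the image should cyclically shift that histogram by a matching number of bins. This is not the authors' implementation. The function names, the cross-entropy form of the loss, the sign convention for the shift, and the assumption that the two histogram lists are already spatially aligned are all hypothetical choices made for the example.

```python
import math


def cyclic_shift(hist, k):
    """Rotate a histogram by k bins (positive k moves mass to higher bins)."""
    n = len(hist)
    k %= n
    return hist[-k:] + hist[:-k] if k else list(hist)


def orientation_alignment_loss(src_hists, tgt_hists, angle_deg, eps=1e-8):
    """Mean cross-entropy between shifted source and target histograms.

    src_hists / tgt_hists: lists of per-pixel orientation histograms
    (each a list summing to 1) taken at corresponding locations of the
    original image and its synthetically rotated copy.
    angle_deg: the known synthetic rotation; it is rounded to the
    nearest whole number of bins (an assumption of this sketch).
    """
    n_bins = len(src_hists[0])
    bin_width = 360.0 / n_bins
    shift = round(angle_deg / bin_width)
    total = 0.0
    for src, tgt in zip(src_hists, tgt_hists):
        # The source histogram, shifted by the known rotation, acts as the label.
        expected = cyclic_shift(src, shift)
        total += -sum(p * math.log(q + eps) for p, q in zip(expected, tgt))
    return total / len(src_hists)


# Four bins of 90 degrees each; the image is rotated by 90 degrees.
aligned = orientation_alignment_loss([[1.0, 0.0, 0.0, 0.0]], [[0.0, 1.0, 0.0, 0.0]], 90)
misaligned = orientation_alignment_loss([[1.0, 0.0, 0.0, 0.0]], [[1.0, 0.0, 0.0, 0.0]], 90)
print(aligned < misaligned)
```

In this toy case the loss is close to zero when the target histogram has moved by one bin, and large when it has not moved at all, which is the behavior a self-supervised orientation signal would need.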
Related papers
- Design and Identification of Keypoint Patches in Unstructured Environments [7.940068522906917]
Keypoint identification in an image allows direct mapping from raw images to 2D coordinates.
We propose four simple yet distinct designs that consider various scale, rotation and camera projection.
We customize the Superpoint network to ensure robust detection under various types of image degradation.
arXiv Detail & Related papers (2024-10-01T09:05:50Z)
- GMM-IKRS: Gaussian Mixture Models for Interpretable Keypoint Refinement and Scoring [9.322937309882022]
Keypoints come with a score permitting to rank them according to their quality.
While learned keypoints often exhibit better properties than handcrafted ones, their scores are not easily interpretable.
We propose a framework that can refine, and at the same time characterize with an interpretable score, the keypoints extracted by any method.
arXiv Detail & Related papers (2024-08-30T09:39:59Z)
- Cycle-Correspondence Loss: Learning Dense View-Invariant Visual Features from Unlabeled and Unordered RGB Images [8.789674502390378]
We introduce Cycle-Correspondence Loss (CCL) for view-invariant dense descriptor learning.
The key idea is to autonomously detect valid pixel correspondences by attempting to use a prediction over a new image.
Our evaluation shows that we outperform other self-supervised RGB-only methods, and approach performance of supervised methods.
arXiv Detail & Related papers (2024-06-18T09:44:56Z)
- Improving the matching of deformable objects by learning to detect keypoints [6.4587163310833855]
We propose a novel learned keypoint detection method to increase the number of correct matches for the task of non-rigid image correspondence.
We train an end-to-end convolutional neural network (CNN) to find keypoint locations that are more appropriate to the considered descriptor.
Experiments demonstrate that our method enhances the Mean Matching Accuracy of numerous descriptors when used in conjunction with our detection method.
We also apply our method to the complex real-world task of object retrieval, where our detector performs on par with the finest keypoint detectors currently available for this task.
arXiv Detail & Related papers (2023-09-01T13:02:19Z)
- Learning-based Relational Object Matching Across Views [63.63338392484501]
We propose a learning-based approach which combines local keypoints with novel object-level features for matching object detections between RGB images.
We train our object-level matching features based on appearance and inter-frame and cross-frame spatial relations between objects in an associative graph neural network.
arXiv Detail & Related papers (2023-05-03T19:36:51Z)
- Pixel-level Correspondence for Self-Supervised Learning from Video [56.24439897867531]
Pixel-level Correspondence (PiCo) is a method for dense contrastive learning from video.
We validate PiCo on standard benchmarks, outperforming self-supervised baselines on multiple dense prediction tasks.
arXiv Detail & Related papers (2022-07-08T12:50:13Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Weakly Supervised Keypoint Discovery [27.750244813890262]
We propose a method for keypoint discovery from a 2D image using image-level supervision.
Motivated by the weakly-supervised learning approach, our method exploits image-level supervision to identify discriminative parts.
Our approach achieves state-of-the-art performance for the task of keypoint estimation on the limited supervision scenarios.
arXiv Detail & Related papers (2021-09-28T01:26:53Z)
- Pixel-Perfect Structure-from-Motion with Featuremetric Refinement [96.73365545609191]
We refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views.
This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors.
Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
arXiv Detail & Related papers (2021-08-18T17:58:55Z)
- Pretrained equivariant features improve unsupervised landmark discovery [69.02115180674885]
We formulate a two-step unsupervised approach that overcomes this challenge by first learning powerful pixel-based features.
Our method produces state-of-the-art results in several challenging landmark detection datasets.
arXiv Detail & Related papers (2021-04-07T05:42:11Z)
- Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation [85.96410825961966]
We argue that the image features extracted at a central point contain limited information for predicting distant keypoints or bounding box boundaries.
To facilitate inference, we propose to instead perform regression from a set of points placed at more advantageous positions.
We apply this proposed framework, called Point-Set Anchors, to object detection, instance segmentation, and human pose estimation.
arXiv Detail & Related papers (2020-07-06T15:59:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.