Looking Beyond Corners: Contrastive Learning of Visual Representations for Keypoint Detection and Description Extraction
- URL: http://arxiv.org/abs/2112.12002v1
- Date: Wed, 22 Dec 2021 16:27:11 GMT
- Title: Looking Beyond Corners: Contrastive Learning of Visual Representations for Keypoint Detection and Description Extraction
- Authors: Henrique Siqueira, Patrick Ruhkamp, Ibrahim Halfaoui, Markus Karmann, Onay Urfalioglu
- Abstract summary: Learnable keypoint detectors and descriptors are beginning to outperform classical hand-crafted feature extraction methods.
Recent studies on self-supervised learning of visual representations have driven the increasing performance of learnable models based on deep networks.
We propose the Correspondence Network (CorrNet) that learns to detect repeatable keypoints and to extract discriminative descriptions.
- Score: 1.5749416770494706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learnable keypoint detectors and descriptors are beginning to outperform
classical hand-crafted feature extraction methods. Recent studies on
self-supervised learning of visual representations have driven the increasing
performance of learnable models based on deep networks. By leveraging
traditional data augmentations and homography transformations, these networks
learn to detect corners under adverse conditions such as extreme illumination
changes. However, their generalization capabilities are limited to corner-like
features detected a priori by classical methods or synthetically generated
data.
In this paper, we propose the Correspondence Network (CorrNet) that learns to
detect repeatable keypoints and to extract discriminative descriptions via
unsupervised contrastive learning under spatial constraints. Our experiments
show that CorrNet is not only able to detect low-level features such as
corners, but also high-level features that represent similar objects present in
a pair of input images through our proposed joint guided backpropagation of
their latent space. Our approach obtains competitive results under viewpoint
changes and achieves state-of-the-art performance under illumination changes.
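
The abstract describes the training objective only at a high level. As a rough illustration (not CorrNet's actual implementation), the sketch below shows one way unsupervised contrastive learning under spatial constraints can be set up in PyTorch: dense descriptors from two views of the same image are matched at known corresponding pixels (e.g., from the homography used for augmentation), while every other sampled location acts as a spatial negative in an InfoNCE loss. The encoder, coordinate tensors, and sampling scheme here are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F

def sample_descriptors(feat_map, coords):
    # feat_map: (B, C, H, W) dense descriptor map; coords: (B, N, 2) integer (y, x) pixels
    B, C, H, W = feat_map.shape
    idx = coords[..., 0] * W + coords[..., 1]                    # (B, N) flattened pixel indices
    flat = feat_map.flatten(2)                                   # (B, C, H*W)
    desc = torch.gather(flat, 2, idx.unsqueeze(1).expand(-1, C, -1))
    return F.normalize(desc.transpose(1, 2), dim=-1)             # (B, N, C), unit-norm descriptors

def spatial_infonce(feat_a, feat_b, coords_a, coords_b, temperature=0.07):
    # Descriptor i in view A should match descriptor i in view B (a known
    # correspondence under the homography); all other sampled locations in
    # view B serve as spatially distinct negatives.
    da = sample_descriptors(feat_a, coords_a)                    # (B, N, C)
    db = sample_descriptors(feat_b, coords_b)                    # (B, N, C)
    logits = torch.einsum('bic,bjc->bij', da, db) / temperature  # (B, N, N) similarity matrix
    target = torch.arange(logits.size(1), device=logits.device)
    target = target.unsqueeze(0).expand(logits.size(0), -1)      # positives lie on the diagonal
    return F.cross_entropy(logits.flatten(0, 1), target.flatten())

# Usage (with a hypothetical fully convolutional `encoder` and correspondences
# `coords_a`/`coords_b` derived from the augmentation's homography):
#   feat_a, feat_b = encoder(img_a), encoder(img_b)
#   loss = spatial_infonce(feat_a, feat_b, coords_a, coords_b)
```

Keypoint selection via the proposed joint guided backpropagation of the latent space is not shown here; the snippet only illustrates a descriptor-level contrastive objective with spatial constraints.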
Related papers
- Efficient Visualization of Neural Networks with Generative Models and Adversarial Perturbations [0.0]
This paper presents a novel approach for deep visualization via a generative network, offering an improvement over existing methods.
Our model simplifies the architecture by reducing the number of networks used, requiring only a generator and a discriminator.
Our model requires less prior training knowledge and uses a non-adversarial training process, where the discriminator acts as a guide.
arXiv Detail & Related papers (2024-09-20T14:59:25Z)
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable Neural Networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that works in different ways in the training and inference processes.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z)
- Automatic Discovery of Visual Circuits [66.99553804855931]
We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept.
We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks.
arXiv Detail & Related papers (2024-04-22T17:00:57Z)
- High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning [54.86882315023791]
We propose an innovative approach called High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning (HDAFL)
HDAFL utilizes multiple convolutional kernels to automatically learn discriminative regions highly correlated with attributes in images.
We also introduce a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes.
arXiv Detail & Related papers (2024-04-07T13:17:47Z)
- Enhancing Deformable Local Features by Jointly Learning to Detect and Describe Keypoints [8.390939268280235]
Local feature extraction is a standard approach in computer vision for tackling important tasks such as image matching and retrieval.
We propose DALF, a novel deformation-aware network for jointly detecting and describing keypoints.
Our approach also enhances the performance of two real-world applications: deformable object retrieval and non-rigid 3D surface registration.
arXiv Detail & Related papers (2023-04-02T18:01:51Z)
- Active Visual Exploration Based on Attention-Map Entropy [13.064016215754163]
We introduce a new technique called Attention-Map Entropy (AME) to determine the most informative observations (a short illustrative sketch of this entropy-based selection appears after this list).
AME does not require additional loss components, which simplifies the training.
We show that such simplified training significantly improves the performance of reconstruction, segmentation and classification on publicly available datasets.
arXiv Detail & Related papers (2023-03-11T17:14:30Z)
- Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes.
We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
arXiv Detail & Related papers (2023-03-03T02:07:40Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- RoRD: Rotation-Robust Descriptors and Orthographic Views for Local Feature Matching [32.10261486751993]
We present a novel framework that combines learning of invariant descriptors through data augmentation and viewpoint projection.
We evaluate the effectiveness of the proposed approach on key tasks including pose estimation and visual place recognition.
arXiv Detail & Related papers (2021-03-15T17:40:25Z)
- Revisiting Edge Detection in Convolutional Neural Networks [3.5281112495479245]
We show that edges cannot be represented properly in the first convolutional layer of a neural network.
We propose edge-detection units and show that they reduce performance loss and generate qualitatively different representations.
arXiv Detail & Related papers (2020-12-25T13:53:04Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
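
Regarding the Attention-Map Entropy entry above, the selection step can be illustrated with a minimal sketch. This is an assumption based on the one-line summary, not the authors' implementation; the tensor shapes and helper names are hypothetical. Candidate glimpse locations are ranked by the Shannon entropy of their attention distributions, and the most uncertain (potentially most informative) one is chosen.

```python
import torch

def attention_entropy(attn):
    # attn: (num_candidates, num_tokens), each row a softmax-normalized
    # attention distribution for one candidate glimpse location.
    eps = 1e-8
    return -(attn * (attn + eps).log()).sum(dim=-1)   # (num_candidates,) Shannon entropy

def select_next_glimpse(attn):
    # Pick the candidate whose attention distribution is most uncertain,
    # i.e., has the highest entropy.
    return attention_entropy(attn).argmax().item()

# Example with a random softmax-normalized attention matrix:
#   attn = torch.softmax(torch.randn(16, 196), dim=-1)
#   next_idx = select_next_glimpse(attn)
```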
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.