Digging Into Self-Supervised Learning of Feature Descriptors
- URL: http://arxiv.org/abs/2110.04773v1
- Date: Sun, 10 Oct 2021 12:22:44 GMT
- Title: Digging Into Self-Supervised Learning of Feature Descriptors
- Authors: Iaroslav Melekhov and Zakaria Laskar and Xiaotian Li and Shuzhe Wang
and Juho Kannala
- Abstract summary: We propose a set of improvements that, combined, lead to powerful feature descriptors.
We show that increasing the search space from in-pair to in-batch for hard negative mining brings consistent improvement.
We demonstrate that a combination of synthetic homography transformation, color augmentation, and photorealistic image stylization produces useful representations.
- Score: 14.47046413243358
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fully-supervised CNN-based approaches for learning local image descriptors
have shown remarkable results in a wide range of geometric tasks. However, most
of them require per-pixel ground-truth keypoint correspondence data which is
difficult to acquire at scale. To address this challenge, recent weakly- and
self-supervised methods learn feature descriptors from relative camera poses
or from synthetic rigid transformations alone, such as homographies. In
this work, we focus on understanding the limitations of existing
self-supervised approaches and propose a set of improvements that, combined, lead
to powerful feature descriptors. We show that increasing the search space from
in-pair to in-batch for hard negative mining brings consistent improvement. To
enhance the discriminativeness of feature descriptors, we propose a
coarse-to-fine method for mining local hard negatives from a wider search space
by using global visual image descriptors. We demonstrate that a combination of
synthetic homography transformation, color augmentation, and photorealistic
image stylization produces useful representations that are viewpoint and
illumination invariant. The feature descriptors learned by the proposed
approach perform competitively and surpass their fully- and weakly-supervised
counterparts on various geometric benchmarks such as image-based localization,
sparse feature matching, and image retrieval.
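The abstract's two key ingredients lend themselves to short sketches. First, a minimal sketch (not the authors' code) of in-batch hard negative mining with a triplet margin loss, assuming PyTorch and L2-normalized descriptors; the names `desc_a`, `desc_b`, and `in_batch_hard_triplet_loss` are illustrative, not from the paper:

```python
import torch
import torch.nn.functional as F

def in_batch_hard_triplet_loss(desc_a, desc_b, margin=1.0):
    """desc_a, desc_b: (N, D) L2-normalized descriptors; row i of desc_a
    corresponds to row i of desc_b (a positive pair)."""
    dist = torch.cdist(desc_a, desc_b)                  # (N, N) pairwise L2 distances
    pos = dist.diag()                                   # distances of matching pairs
    # Mask the positives, then take the hardest (closest) negative for each
    # anchor across the whole batch rather than only within its own pair.
    eye = torch.eye(len(dist), dtype=torch.bool, device=dist.device)
    neg = dist.masked_fill(eye, float("inf"))
    hardest_neg = torch.minimum(neg.min(dim=1).values,  # anchors in desc_a
                                neg.min(dim=0).values)  # anchors in desc_b
    return F.relu(margin + pos - hardest_neg).mean()
```

Enlarging the batch directly enlarges the negative search space, which is the improvement the abstract reports. Second, a hedged sketch of the positive-pair generation: warping with a random synthetic homography plus color augmentation means dense correspondences between the two views are known by construction (photorealistic stylization is omitted here; kornia is an assumed dependency and the parameter values are assumptions):

```python
import kornia.augmentation as K

# Illustrative pipeline only; not the paper's exact augmentation settings.
make_pair_view = K.AugmentationSequential(
    K.RandomPerspective(distortion_scale=0.5, p=1.0),  # synthetic homography
    K.ColorJitter(0.4, 0.4, 0.4, 0.1, p=1.0),          # illumination change
    data_keys=["input"],
)
```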
Related papers
- OsmLocator: locating overlapping scatter marks with a non-training
generative perspective [48.50108853199417]
Locating overlapping marks faces many difficulties, such as lack of texture, little contextual information, hollow shapes, and tiny size.
Here, we formulate it as an optimization problem on clustering-based re-visualization from a non-training generative perspective.
We built a dedicated dataset named SML2023 containing hundreds of scatter images with different markers and various levels of overlap severity, and tested the proposed method against existing methods.
arXiv Detail & Related papers (2023-12-18T12:39:48Z)
- Patch-Wise Self-Supervised Visual Representation Learning: A Fine-Grained Approach [4.9204263448542465]
This study introduces an innovative, fine-grained dimension by integrating patch-level discrimination into self-supervised visual representation learning.
We employ a distinctive photometric patch-level augmentation, where each patch is augmented independently of the other patches within the same view.
We present a simple yet effective patch-matching algorithm to find the corresponding patches across the augmented views.
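A hedged sketch of the patch-level photometric augmentation described above (assumptions, not the paper's code): the image is split into a grid and each patch is color-jittered independently, unlike view-level augmentation where one transform is shared by the whole image. `augment_patches` and the jitter parameters are hypothetical:

```python
import torch
from torchvision.transforms import ColorJitter

def augment_patches(img, patch=32):
    """img: float tensor (C, H, W) with H and W divisible by `patch`."""
    jitter = ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1)
    out = img.clone()
    for y in range(0, img.shape[1], patch):
        for x in range(0, img.shape[2], patch):
            # New photometric parameters are sampled for every patch,
            # making each patch's augmentation independent of its neighbors.
            out[:, y:y+patch, x:x+patch] = jitter(img[:, y:y+patch, x:x+patch])
    return out
```

Because this augmentation is purely photometric, patch positions are unchanged, which is what makes the cross-view patch matching mentioned above tractable.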
arXiv Detail & Related papers (2023-10-28T09:35:30Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- Towards Effective Image Manipulation Detection with Proposal Contrastive Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL uses a two-stream architecture that extracts two types of global features from the RGB and noise views, respectively.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
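A heavily hedged, image-level sketch of the two-stream idea (the paper contrasts proposal-level features; the noise view here is a simple high-pass residual, which may differ from the paper's filter; the encoders are placeholders):

```python
import torch
import torch.nn.functional as F

def noise_view(img):
    """Cheap high-pass residual: the image minus a local average."""
    return img - F.avg_pool2d(img, kernel_size=3, stride=1, padding=1)

def two_stream_contrastive_loss(rgb_encoder, noise_encoder, img, temperature=0.07):
    """InfoNCE-style loss pairing each RGB embedding with its noise-view
    embedding; other items in the batch act as negatives."""
    z_rgb = F.normalize(rgb_encoder(img), dim=1)                  # (N, D)
    z_noise = F.normalize(noise_encoder(noise_view(img)), dim=1)  # (N, D)
    logits = z_rgb @ z_noise.t() / temperature                    # (N, N)
    targets = torch.arange(len(img), device=img.device)           # positives on the diagonal
    return F.cross_entropy(logits, targets)
```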
arXiv Detail & Related papers (2022-10-16T13:30:13Z)
- Adversarial Learning of Hard Positives for Place Recognition [5.142439069733352]
We propose an adversarial method to guide the creation of hard positives for training image retrieval networks.
Our method achieves state-of-the-art recalls on the Pitts250k and Tokyo 24/7 benchmarks.
arXiv Detail & Related papers (2022-05-08T13:54:03Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Region-level Active Learning for Cluttered Scenes [60.93811392293329]
We introduce a new strategy that subsumes previous Image-level and Object-level approaches into a generalized, Region-level approach.
We show that this approach significantly decreases labeling effort and improves rare object search on realistic data with inherent class-imbalance and cluttered scenes.
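One plausible reading of region-level selection, as a hedged sketch (the scoring proxy below, mean predictive entropy per region, is an assumption, not necessarily the paper's criterion; `select_regions` is a hypothetical helper):

```python
import torch
import torch.nn.functional as F

def select_regions(probs, region=64, budget=100):
    """probs: (N, C, H, W) per-pixel class probabilities.
    Returns flat indices of the highest-scoring regions to annotate."""
    ent = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)  # (N, H, W) entropy
    scores = F.avg_pool2d(ent[:, None], region)              # mean entropy per region
    flat = scores.flatten()
    return flat.topk(min(budget, flat.numel())).indices
```

Annotating only the selected regions, rather than whole images or single objects, is what reduces labeling effort in cluttered scenes.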
arXiv Detail & Related papers (2021-08-20T14:02:38Z)
- Deep Transformation-Invariant Clustering [24.23117820167443]
We present an approach that does not rely on abstract features but instead learns to predict image transformations.
This learning process naturally fits into the gradient-based training of K-means and Gaussian mixture models.
We demonstrate that our novel approach yields competitive and highly promising results on standard image clustering benchmarks.
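A toy, hedged sketch of the transformation-invariant K-means objective this summary points at: prototypes are learnable images, a transformation aligns them to each sample before assignment, and gradient descent updates the prototypes (the real method learns to predict the transformations; `transform` is a placeholder):

```python
import torch

K, C, H, W = 10, 1, 28, 28
protos = torch.randn(K, C, H, W, requires_grad=True)  # learnable prototype images
opt = torch.optim.Adam([protos], lr=0.1)

def dti_kmeans_step(batch, transform):
    """batch: (N, C, H, W); transform(protos, batch) -> (N, K, C, H, W)
    prototypes aligned to each sample."""
    aligned = transform(protos, batch)
    err = ((aligned - batch[:, None]) ** 2).flatten(2).mean(-1)  # (N, K) reconstruction error
    loss = err.min(dim=1).values.mean()  # each sample pays for its best prototype
    opt.zero_grad()
    loss.backward()
    opt.step()
    return err.argmin(dim=1)             # cluster assignments
```

With `transform = lambda p, b: p.unsqueeze(0).expand(b.shape[0], -1, -1, -1, -1)` (identity), this collapses to plain gradient-based K-means; the transformation module is what buys the invariance.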
arXiv Detail & Related papers (2020-06-19T13:43:08Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, we design a novel self-guided regression loss in addition to the frequently-used VGG feature-matching loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency.
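A hedged sketch of a parallel dilated-convolution block in the spirit of "dense combinations of dilated convolutions" (the paper's exact block design may differ; `DilatedFusionBlock` is an illustrative name): several dilation rates see the same input and their outputs are fused, enlarging the receptive field without losing resolution:

```python
import torch
import torch.nn as nn

class DilatedFusionBlock(nn.Module):
    def __init__(self, ch, rates=(1, 2, 4, 8)):
        super().__init__()
        # One 3x3 branch per dilation rate; padding=r keeps the spatial size.
        self.branches = nn.ModuleList(
            [nn.Conv2d(ch, ch, 3, padding=r, dilation=r) for r in rates]
        )
        self.fuse = nn.Conv2d(ch * len(rates), ch, 1)  # 1x1 fusion conv

    def forward(self, x):
        feats = [torch.relu(b(x)) for b in self.branches]
        return x + self.fuse(torch.cat(feats, dim=1))  # residual combination
```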
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.