LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity
- URL: http://arxiv.org/abs/2204.02958v1
- Date: Wed, 6 Apr 2022 17:48:18 GMT
- Title: LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity
- Authors: Tejan Karmali, Abhinav Atrishi, Sai Sree Harsha, Susmit Agrawal, Varun Jampani, R. Venkatesh Babu
- Abstract summary: Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
- Score: 49.84167231111667
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we introduce LEAD, an approach to discover landmarks from an
unannotated collection of category-specific images. Existing works in
self-supervised landmark detection are based on learning dense (pixel-level)
feature representations from an image, which are further used to learn
landmarks in a semi-supervised manner. While there have been advances in
self-supervised learning of image features for instance-level tasks like
classification, these methods do not ensure dense equivariant representations.
The property of equivariance is of interest for dense prediction tasks like
landmark estimation. In this work, we introduce an approach to enhance the
learning of dense equivariant representations in a self-supervised fashion. We
follow a two-stage training approach: first, we train a network using the BYOL
objective, which operates at an instance level. The correspondences obtained
through this network are further used to train a dense and compact
representation of the image using a lightweight network. We show that having
such a prior in the feature extractor helps in landmark detection, even with a
drastically limited number of annotations, while also improving generalization
across scale variations.
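The abstract describes the two-stage mechanism but not the exact losses. Below is a minimal PyTorch sketch of one plausible reading of the second stage, where a frozen BYOL-pretrained teacher supervises a lightweight dense student by aligning, per pixel, the distribution of similarities to a shared bank of reference features. The function names, the reference bank `refs`, the temperature, and the KL formulation are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only; names and loss details are assumptions, not LEAD's code.
import torch
import torch.nn.functional as F

def similarity_log_distribution(feats, refs, temperature=0.1):
    """Log-softmax over cosine similarities between each pixel feature
    and a bank of reference features.
    feats: (N, C) pixel features; refs: (K, C) reference features."""
    feats = F.normalize(feats, dim=-1)
    refs = F.normalize(refs, dim=-1)
    logits = feats @ refs.t() / temperature          # (N, K)
    return F.log_softmax(logits, dim=-1)

def dense_distillation_loss(student_map, teacher_map, refs):
    """Align the student's per-pixel similarity distribution with the
    frozen teacher's via KL divergence.
    student_map, teacher_map: (C, H, W) dense feature maps."""
    s = student_map.flatten(1).t()                   # (H*W, C)
    t = teacher_map.flatten(1).t()
    log_p_s = similarity_log_distribution(s, refs)
    with torch.no_grad():
        log_p_t = similarity_log_distribution(t, refs)
    return F.kl_div(log_p_s, log_p_t, log_target=True, reduction="batchmean")
```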
Related papers
- Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks. (A toy sketch of this pretext task appears after this list.)
arXiv Detail & Related papers (2022-12-05T16:24:29Z)
- Self-supervised Contrastive Learning for Cross-domain Hyperspectral Image Representation [26.610588734000316]
This paper introduces a self-supervised learning framework suitable for hyperspectral images that are inherently challenging to annotate.
The proposed framework architecture leverages cross-domain CNN, allowing for learning representations from different hyperspectral images.
The experimental results demonstrate the advantage of the proposed self-supervised representation over models trained from scratch or other transfer learning methods.
arXiv Detail & Related papers (2022-02-08T16:16:45Z)
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets new state-of-the-arts on all these settings, demonstrating well its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
- Learning Representations by Predicting Bags of Visual Words [55.332200948110895]
Self-supervised representation learning aims to learn convnet-based image representations from unlabeled data.
Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions.
arXiv Detail & Related papers (2020-02-27T16:45:25Z)
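As a companion to the first related paper above, here is a toy PyTorch sketch of a relative-location pretext task with masked reference patches. The class name, the mean-pooled context, and the mask ratio are hypothetical simplifications of the paper's transformer-based design, not its actual code.

```python
# Toy sketch; hypothetical names, not the paper's implementation.
import torch
import torch.nn as nn

class RelativeLocationHead(nn.Module):
    """Classify which of the G*G grid positions a query patch came from,
    given partially masked reference patch features."""
    def __init__(self, dim, grid_size, mask_ratio=0.5):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.classifier = nn.Linear(2 * dim, grid_size * grid_size)

    def forward(self, query, refs):
        # query: (B, dim) one patch feature; refs: (B, G*G, dim).
        B, P, _ = refs.shape
        # Masking more reference patches makes the task harder.
        keep = (torch.rand(B, P, device=refs.device) > self.mask_ratio).float()
        context = (refs * keep.unsqueeze(-1)).mean(dim=1)  # pooled visible context
        # Train the returned logits with cross-entropy against the
        # query's true grid index.
        return self.classifier(torch.cat([query, context], dim=-1))
```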