Weakly Supervised Keypoint Discovery
- URL: http://arxiv.org/abs/2109.13423v1
- Date: Tue, 28 Sep 2021 01:26:53 GMT
- Title: Weakly Supervised Keypoint Discovery
- Authors: Serim Ryou and Pietro Perona
- Abstract summary: We propose a method for keypoint discovery from a 2D image using image-level supervision.
Motivated by the weakly-supervised learning approach, our method exploits image-level supervision to identify discriminative parts.
Our approach achieves state-of-the-art performance for keypoint estimation in limited-supervision scenarios.
- Score: 27.750244813890262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a method for keypoint discovery from a 2D image
using image-level supervision. Recent works on unsupervised keypoint discovery
reliably discover keypoints of aligned instances. However, when the target
instances have high viewpoint or appearance variation, the discovered keypoints
do not match the semantic correspondences over different images. Our work aims
to discover keypoints even when the target instances have high viewpoint and
appearance variation by using image-level supervision. Motivated by the
weakly-supervised learning approach, our method exploits image-level
supervision to identify discriminative parts and infer the viewpoint of the
target instance. To discover diverse parts, we adopt a conditional image
generation approach using a pair of images with structural deformation.
Finally, we enforce a viewpoint-based equivariance constraint using the
keypoints from the image-level supervision to resolve the spatial correlation
problem that consistently appears in the images taken from various viewpoints.
Our approach achieves state-of-the-art performance for the task of keypoint
estimation in limited-supervision scenarios. Furthermore, the discovered
keypoints are directly applicable to downstream tasks without requiring any
keypoint labels.
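The equivariance idea mentioned in the abstract can be illustrated with a generic consistency check: if an image undergoes a known geometric transform, the keypoints detected in the transformed image should match the transformed keypoints of the original. The sketch below is only an illustration under stated assumptions. It uses a horizontal flip as the known transform, and the names `flip_keypoints` and `equivariance_loss` are hypothetical. It is not the viewpoint-based constraint of the paper, whose details are not given here.

```python
import numpy as np


def flip_keypoints(kps, width):
    """Map (x, y) keypoints to their positions in a horizontally flipped image.

    Assumes pixel coordinates with x in [0, width - 1].
    """
    flipped = np.asarray(kps, dtype=float).copy()
    flipped[:, 0] = (width - 1) - flipped[:, 0]
    return flipped


def equivariance_loss(kps_original, kps_transformed, transform):
    """Mean squared distance between transform(kps_original) and kps_transformed.

    kps_original:    keypoints detected in the original image, shape (K, 2)
    kps_transformed: keypoints detected in the transformed image, shape (K, 2)
    transform:       callable applying the same known transform to coordinates
    """
    expected = transform(np.asarray(kps_original, dtype=float))
    diff = expected - np.asarray(kps_transformed, dtype=float)
    return float(np.mean(np.sum(diff ** 2, axis=1)))


# Hypothetical detections for a 64-pixel-wide image.
width = 64
kps = np.array([[10.0, 20.0], [40.0, 5.0]])
detected_in_flipped = flip_keypoints(kps, width)  # a perfectly equivariant detector
loss = equivariance_loss(kps, detected_in_flipped, lambda k: flip_keypoints(k, width))
```

A detector that is equivariant to the chosen transform yields a loss of zero; any disagreement between the two sets of detections increases it, which is what makes such a term usable as a training signal.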
Related papers
- Keypoint Promptable Re-Identification [76.31113049256375]
Occluded Person Re-Identification (ReID) is a metric learning task that involves matching occluded individuals based on their appearance.
We introduce Keypoint Promptable ReID (KPR), a novel formulation of the ReID problem that explicitly complements the input bounding box with a set of semantic keypoints.
We release custom keypoint labels for four popular ReID benchmarks. Experiments on person retrieval and pose tracking demonstrate that our method systematically surpasses previous state-of-the-art approaches.
Date: 2024-07-25T15:20:58Z
- Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching [74.75284453828017]
The Open-Vocabulary Keypoint Detection (OVKD) task uses text prompts to identify arbitrary keypoints across any species.
We have developed a novel framework named Open-Vocabulary Keypoint Detection with Semantic-feature Matching (KDSM).
This framework combines vision and language models, creating an interplay between language features and local keypoint visual features.
Date: 2023-10-08T07:42:41Z
- Learning-based Relational Object Matching Across Views [63.63338392484501]
We propose a learning-based approach which combines local keypoints with novel object-level features for matching object detections between RGB images.
We train our object-level matching features based on appearance and inter-frame and cross-frame spatial relations between objects in an associative graph neural network.
Date: 2023-05-03T19:36:51Z
- Pointly-Supervised Panoptic Segmentation [106.68888377104886]
We propose a new approach to applying point-level annotations for weakly-supervised panoptic segmentation.
Instead of the dense pixel-level labels used by fully supervised methods, point-level labels only provide a single point for each target as supervision.
We formulate the problem in an end-to-end framework by simultaneously generating panoptic pseudo-masks from point-level labels and learning from them.
Date: 2022-10-25T12:03:51Z
- TUSK: Task-Agnostic Unsupervised Keypoints [21.777256048659165]
We propose a novel method to learn Task-agnostic, UnSupervised Keypoints (TUSK) which can deal with multiple instances.
Specifically, we encode semantics into the keypoints by teaching them to reconstruct images from a sparse set of keypoints and their descriptors.
This makes our approach amenable to a wider range of tasks than any previous unsupervised keypoint method.
Date: 2022-06-16T21:56:17Z
- Point-Teaching: Weakly Semi-Supervised Object Detection with Point Annotations [81.02347863372364]
We present Point-Teaching, a weakly semi-supervised object detection framework.
Specifically, we propose a Hungarian-based point matching method to generate pseudo labels for point-annotated images.
We also propose a simple yet effective data augmentation, termed point-guided copy-paste, to reduce the impact of unmatched points.
Date: 2022-06-01T07:04:38Z
- Self-Supervised Equivariant Learning for Oriented Keypoint Detection [35.94215211409985]
We introduce a self-supervised learning framework using rotation-equivariant CNNs to learn to detect robust oriented keypoints.
We propose a dense orientation alignment loss, computed on an image pair generated by synthetic transformations, for training a histogram-based orientation map.
Our method outperforms the previous methods on an image matching benchmark and a camera pose estimation benchmark.
Date: 2022-04-19T02:26:07Z
- Attend to Who You Are: Supervising Self-Attention for Keypoint Detection and Instance-Aware Association [40.78849763751773]
This paper presents a new method to solve keypoint detection and instance association using a Transformer.
We propose a novel approach of supervising self-attention for multi-person keypoint detection and instance association.
Date: 2021-11-25T03:41:41Z
- End-to-End Learning of Keypoint Representations for Continuous Control from Images [84.8536730437934]
We show that it is possible to learn efficient keypoint representations end-to-end, without the need for unsupervised pre-training, decoders, or additional losses.
Our proposed architecture consists of a differentiable keypoint extractor that feeds the coordinates directly to a soft actor-critic agent.
Date: 2021-06-15T09:17:06Z
- LatentKeypointGAN: Controlling Images via Latent Keypoints [23.670795505376336]
We introduce LatentKeypointGAN, a two-stage GAN trained end-to-end on the classical GAN objective.
LatentKeypointGAN provides an interpretable latent space that can be used to re-arrange the generated images.
In addition, the explicit generation of keypoints and matching images enables a new, GAN-based method for unsupervised keypoint detection.
Date: 2021-03-29T17:59:10Z
- Semi-supervised Keypoint Localization [12.37129078618206]
We propose to simultaneously learn keypoint heatmaps and pose-invariant keypoint representations in a semi-supervised manner.
Our approach significantly outperforms previous methods on several benchmarks for human and animal body landmark localization.
Date: 2021-01-20T06:23:08Z
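Several entries in the list above refer to keypoint heatmaps and to differentiable keypoint extractors that pass coordinates to a downstream model. One common way to turn a heatmap into coordinates while keeping the operation differentiable is a soft-argmax (the expected pixel location under a spatial softmax). The sketch below shows that generic technique in NumPy; the function name `soft_argmax_2d` and the `temperature` parameter are assumptions made for illustration, and it is not taken from any of the listed papers.

```python
import numpy as np


def soft_argmax_2d(heatmap, temperature=1.0):
    """Return the expected (x, y) location under a softmax over a 2D heatmap.

    heatmap:     array of shape (H, W) holding unnormalized scores
    temperature: assumed scaling factor; smaller values sharpen the softmax
    """
    heatmap = np.asarray(heatmap, dtype=float)
    h, w = heatmap.shape
    logits = heatmap.reshape(-1) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    probs = (probs / probs.sum()).reshape(h, w)
    ys, xs = np.mgrid[0:h, 0:w]  # row (y) and column (x) index grids
    return float((probs * xs).sum()), float((probs * ys).sum())


# Hypothetical heatmap with a single strong peak at row 3, column 5.
heatmap = np.zeros((8, 8))
heatmap[3, 5] = 50.0
x, y = soft_argmax_2d(heatmap)
```

Unlike a hard `argmax`, the result varies smoothly with the heatmap values, so gradients can flow through it in an autodiff framework; the trade-off is that a flat or multi-peaked heatmap pulls the estimate toward the weighted center of the scores.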
This list is automatically generated from the titles and abstracts of the papers on this site.