TUSK: Task-Agnostic Unsupervised Keypoints
- URL: http://arxiv.org/abs/2206.08460v1
- Date: Thu, 16 Jun 2022 21:56:17 GMT
- Title: TUSK: Task-Agnostic Unsupervised Keypoints
- Authors: Yuhe Jin, Weiwei Sun, Jan Hosang, Eduard Trulls, Kwang Moo Yi
- Abstract summary: We propose a novel method to learn Task-agnostic, UnSupervised Keypoints (TUSK) which can deal with multiple instances.
Specifically, we encode semantics into the keypoints by teaching them to reconstruct images from a sparse set of keypoints and their descriptors.
This makes our approach amenable to a wider range of tasks than any previous unsupervised keypoint method.
- Score: 21.777256048659165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing unsupervised methods for keypoint learning rely heavily on the
assumption that a specific keypoint type (e.g. elbow, digit, abstract geometric
shape) appears only once in an image. This greatly limits their applicability,
as each instance must be isolated before applying the method, an issue that is
never discussed or evaluated. We thus propose a novel method to learn
Task-agnostic, UnSupervised Keypoints (TUSK) which can deal with multiple
instances. To achieve this, instead of the commonly-used strategy of detecting
multiple heatmaps, each dedicated to a specific keypoint type, we use a single
heatmap for detection, and enable unsupervised learning of keypoint types
through clustering. Specifically, we encode semantics into the keypoints by
teaching them to reconstruct images from a sparse set of keypoints and their
descriptors, where the descriptors are forced to form distinct clusters in
feature space around learned prototypes. This makes our approach amenable to a
wider range of tasks than any previous unsupervised keypoint method: we show
experiments on multiple-instance detection and classification, object
discovery, and landmark detection (all unsupervised), with performance on par with
the state of the art, while also being able to deal with multiple instances.
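The two ideas in the abstract, detecting from a single heatmap and assigning keypoint types by clustering descriptors around prototypes, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `detect_keypoints` and `assign_types`, the threshold, the neighbourhood radius, and the Euclidean nearest-prototype rule are all assumptions made for illustration, and the sketch covers only inference, not the reconstruction-based training.

```python
import math

def detect_keypoints(heatmap, threshold=0.5, radius=1):
    """Return (row, col) of strict local maxima above `threshold`.

    Hypothetical helper: one heatmap can yield any number of keypoints,
    so several instances of the same type can be detected at once.
    """
    h, w = len(heatmap), len(heatmap[0])
    keypoints = []
    for r in range(h):
        for c in range(w):
            v = heatmap[r][c]
            if v < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - radius), min(h, r + radius + 1))
                for cc in range(max(0, c - radius), min(w, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                keypoints.append((r, c))
    return keypoints

def assign_types(descriptors, prototypes):
    """Assign each descriptor the index of its nearest prototype.

    Assumption: Euclidean distance stands in for whatever similarity
    measure the actual method uses.
    """
    return [
        min(range(len(prototypes)), key=lambda k: math.dist(d, prototypes[k]))
        for d in descriptors
    ]

# Toy example: two peaks in one heatmap, both described as type 1.
heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.8, 0.1],
    [0.0, 0.0, 0.0, 0.1, 0.0],
]
prototypes = [[0.0, 0.0], [1.0, 0.0]]
descriptors = [[0.9, 0.1], [1.1, -0.1]]

print(detect_keypoints(heatmap))              # [(1, 1), (3, 3)]
print(assign_types(descriptors, prototypes))  # [1, 1]
```

In the toy example both detected points map to the same prototype, which is the multiple-instance case that a one-heatmap-per-type design cannot represent.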
Related papers
- Keypoint Promptable Re-Identification [76.31113049256375]
Occluded Person Re-Identification (ReID) is a metric learning task that involves matching occluded individuals based on their appearance.
We introduce Keypoint Promptable ReID (KPR), a novel formulation of the ReID problem that explicitly complements the input bounding box with a set of semantic keypoints.
We release custom keypoint labels for four popular ReID benchmarks. Experiments on person retrieval, but also on pose tracking, demonstrate that our method systematically surpasses previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-25T15:20:58Z)
- Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching [74.75284453828017]
The Open-Vocabulary Keypoint Detection (OVKD) task is designed to use text prompts for identifying arbitrary keypoints across any species.
We have developed a novel framework named Open-Vocabulary Keypoint Detection with Semantic-feature Matching (KDSM).
This framework combines vision and language models, creating an interplay between language features and local keypoint visual features.
arXiv Detail & Related papers (2023-10-08T07:42:41Z)
- DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching [14.837075102089]
Keypoint detection is a pivotal step in 3D reconstruction, whereby sets of (up to) K points are detected in each view of a scene.
Previous learning-based methods typically learn descriptors with keypoints, and treat the keypoint detection as a binary classification task on mutual nearest neighbours.
In this work, we learn keypoints directly from 3D consistency. To this end, we derive a semi-supervised two-view detection objective to expand this set to a desired number of detections.
Results show that our approach, DeDoDe, achieves significant gains on multiple geometry benchmarks.
arXiv Detail & Related papers (2023-08-16T16:37:02Z)
- Point-Teaching: Weakly Semi-Supervised Object Detection with Point Annotations [81.02347863372364]
We present Point-Teaching, a weakly semi-supervised object detection framework.
Specifically, we propose a Hungarian-based point matching method to generate pseudo labels for point annotated images.
We propose a simple-yet-effective data augmentation, termed point-guided copy-paste, to reduce the impact of the unmatched points.
arXiv Detail & Related papers (2022-06-01T07:04:38Z)
- Few-shot Keypoint Detection with Uncertainty Learning for Unseen Species [28.307200505494126]
We propose a versatile Few-shot Keypoint Detection (FSKD) pipeline, which can detect a varying number of keypoints of different kinds.
Our FSKD involves main and auxiliary keypoint representation learning, similarity learning, and keypoint localization.
We show the effectiveness of our FSKD on (i) novel keypoint detection for unseen species, (ii) few-shot Fine-Grained Visual Recognition (FGVR), and (iii) Semantic Alignment (SA) downstream tasks.
arXiv Detail & Related papers (2021-12-12T08:39:47Z)
- Attend to Who You Are: Supervising Self-Attention for Keypoint Detection and Instance-Aware Association [40.78849763751773]
This paper presents a new method to solve keypoint detection and instance association using a Transformer.
We propose a novel approach of supervising self-attention for multi-person keypoint detection and instance association.
arXiv Detail & Related papers (2021-11-25T03:41:41Z)
- Learning to Detect Instance-level Salient Objects Using Complementary Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z)
- Weakly Supervised Keypoint Discovery [27.750244813890262]
We propose a method for keypoint discovery from a 2D image using image-level supervision.
Motivated by the weakly-supervised learning approach, our method exploits image-level supervision to identify discriminative parts.
Our approach achieves state-of-the-art performance on keypoint estimation in limited-supervision scenarios.
arXiv Detail & Related papers (2021-09-28T01:26:53Z)
- Keypoint Autoencoders: Learning Interest Points of Semantics [4.551313396927381]
We propose Keypoint Autoencoder, an unsupervised learning method for detecting keypoints.
We encourage selecting sparse semantic keypoints by enforcing the reconstruction from keypoints to the original point cloud.
A downstream task of classifying shape with sparse keypoints is conducted to demonstrate the distinctiveness of our selected keypoints.
arXiv Detail & Related papers (2020-08-11T03:43:18Z)
- Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation [95.72606536493548]
Multi-person pose estimation is challenging because it localizes body keypoints for multiple persons simultaneously.
We propose a novel differentiable Hierarchical Graph Grouping (HGG) method to learn the graph grouping in the bottom-up multi-person pose estimation task.
arXiv Detail & Related papers (2020-07-23T08:46:22Z)
- A Few-Shot Sequential Approach for Object Counting [63.82757025821265]
We introduce a class attention mechanism that sequentially attends to objects in the image and extracts their relevant features.
The proposed technique is trained on point-level annotations and uses a novel loss function that disentangles class-dependent and class-agnostic aspects of the model.
We present our results on a variety of object-counting/detection datasets, including FSOD and MS COCO.
arXiv Detail & Related papers (2020-07-03T18:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.