Few-shot Keypoint Detection with Uncertainty Learning for Unseen Species
- URL: http://arxiv.org/abs/2112.06183v1
- Date: Sun, 12 Dec 2021 08:39:47 GMT
- Title: Few-shot Keypoint Detection with Uncertainty Learning for Unseen Species
- Authors: Changsheng Lu, Piotr Koniusz
- Abstract summary: We propose a versatile Few-shot Keypoint Detection (FSKD) pipeline, which can detect a varying number of keypoints of different kinds.
Our FSKD involves main and auxiliary keypoint representation learning, similarity learning, and keypoint localization.
We show the effectiveness of our FSKD on (i) novel keypoint detection for unseen species, and (ii) few-shot Fine-Grained Visual Recognition (FGVR) and (iii) Semantic Alignment (SA) downstream tasks.
- Score: 28.307200505494126
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Current non-rigid object keypoint detectors perform well on a chosen kind of
species and body parts, and require a large amount of labelled keypoints for
training. Moreover, their heatmaps, tailored to specific body parts, cannot
recognize novel keypoints (keypoints not labelled for training) on unseen
species. We raise an interesting yet challenging question: how to detect both
base (annotated for training) and novel keypoints for unseen species given a
few annotated samples? Thus, we propose a versatile Few-shot Keypoint Detection
(FSKD) pipeline, which can detect a varying number of keypoints of different
kinds. Our FSKD provides the uncertainty estimation of predicted keypoints.
Specifically, FSKD involves main and auxiliary keypoint representation
learning, similarity learning, and keypoint localization with uncertainty
modeling to tackle the localization noise. Moreover, we model the uncertainty
across groups of keypoints by multivariate Gaussian distribution to exploit
implicit correlations between neighboring keypoints. We show the effectiveness
of our FSKD on (i) novel keypoint detection for unseen species, and (ii)
few-shot Fine-Grained Visual Recognition (FGVR) and (iii) Semantic Alignment
(SA) downstream tasks. For FGVR, detected keypoints improve the classification
accuracy. For SA, we showcase a novel thin-plate-spline warping that uses
estimated keypoint uncertainty under imperfect keypoint correspondences.
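As a rough illustration of what scoring a group of keypoints under a multivariate Gaussian can look like, the sketch below computes the negative log-likelihood of ground-truth coordinates given predicted coordinates and a covariance matrix. This is not the authors' implementation: the function name `gaussian_nll`, the flattened (x, y) layout, and the toy numbers are all assumptions made for this example.

```python
import numpy as np


def gaussian_nll(pred, target, cov):
    """Negative log-likelihood of `target` under N(pred, cov).

    pred, target: flattened (x, y) coordinates of one keypoint group, shape (d,).
    cov: symmetric covariance matrix, shape (d, d); off-diagonal terms
         couple the errors of different coordinates in the group.
    """
    d = pred.shape[0]
    diff = target - pred
    # Solve cov @ z = diff rather than forming the explicit inverse.
    z = np.linalg.solve(cov, diff)
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        # Cheap sanity check only; it does not prove positive definiteness.
        raise ValueError("cov must have a positive determinant")
    return 0.5 * (diff @ z + logdet + d * np.log(2.0 * np.pi))


# Hypothetical group of two neighboring keypoints: (x1, y1, x2, y2).
pred = np.array([10.0, 20.0, 12.0, 21.0])
target = pred + np.array([1.0, 1.0, 1.0, 1.0])  # both keypoints shifted alike

cov_independent = 4.0 * np.eye(4)
# Assumed correlated covariance: eigenvalues 13 and 1, so it is positive definite.
cov_correlated = 4.0 * (0.25 * np.eye(4) + 0.75 * np.ones((4, 4)))

print(gaussian_nll(pred, target, cov_independent))
print(gaussian_nll(pred, target, cov_correlated))
```

A diagonal covariance treats every coordinate error independently, while a full covariance lets the likelihood account for errors that move neighboring keypoints together.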
Related papers
- GMM-IKRS: Gaussian Mixture Models for Interpretable Keypoint Refinement and Scoring [9.322937309882022]
Keypoints come with a score permitting to rank them according to their quality.
While learned keypoints often exhibit better properties than handcrafted ones, their scores are not easily interpretable.
We propose a framework that can refine, and at the same time characterize with an interpretable score, the keypoints extracted by any method.
arXiv Detail & Related papers (2024-08-30T09:39:59Z)
- Keypoint Promptable Re-Identification [76.31113049256375]
Occluded Person Re-Identification (ReID) is a metric learning task that involves matching occluded individuals based on their appearance.
We introduce Keypoint Promptable ReID (KPR), a novel formulation of the ReID problem that explicitly complements the input bounding box with a set of semantic keypoints.
We release custom keypoint labels for four popular ReID benchmarks. Experiments on person retrieval, but also on pose tracking, demonstrate that our method systematically surpasses previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-25T15:20:58Z)
- Meta-Point Learning and Refining for Category-Agnostic Pose Estimation [46.98479393474727]
Category-agnostic pose estimation (CAPE) aims to predict keypoints for arbitrary classes given a few support images annotated with keypoints.
We propose a novel framework for CAPE based on such potential keypoints (named meta-points).
arXiv Detail & Related papers (2024-03-20T14:54:33Z)
- Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching [74.75284453828017]
The Open-Vocabulary Keypoint Detection (OVKD) task is designed to use text prompts for identifying arbitrary keypoints across any species.
We have developed a novel framework named Open-Vocabulary Keypoint Detection with Semantic-feature Matching (KDSM).
This framework combines vision and language models, creating an interplay between language features and local keypoint visual features.
arXiv Detail & Related papers (2023-10-08T07:42:41Z)
- DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching [14.837075102089]
Keypoint detection is a pivotal step in 3D reconstruction, whereby sets of (up to) K points are detected in each view of a scene.
Previous learning-based methods typically learn descriptors with keypoints, and treat the keypoint detection as a binary classification task on mutual nearest neighbours.
In this work, we learn keypoints directly from 3D consistency. To this end, we derive a semi-supervised two-view detection objective to expand this set to a desired number of detections.
Results show that our approach, DeDoDe, achieves significant gains on multiple geometry benchmarks.
arXiv Detail & Related papers (2023-08-16T16:37:02Z)
- SC3K: Self-supervised and Coherent 3D Keypoints Estimation from Rotated, Noisy, and Decimated Point Cloud Data [17.471342278936365]
We propose a new method to infer keypoints from arbitrary object categories in practical scenarios where point cloud data (PCD) are noisy, down-sampled and arbitrarily rotated.
We achieve these desiderata by proposing a new self-supervised training strategy for keypoints estimation.
We compare the keypoints estimated by the proposed approach with those of the state-of-the-art unsupervised approaches.
arXiv Detail & Related papers (2023-08-10T08:10:01Z)
- Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network [52.29330138835208]
Accurately matching local features between a pair of images is a challenging computer vision task.
Previous studies typically use attention based graph neural networks (GNNs) with fully-connected graphs over keypoints within/across images.
We propose MaKeGNN, a sparse attention-based GNN architecture which bypasses non-repeatable keypoints and leverages matchable ones to guide message passing.
arXiv Detail & Related papers (2023-07-04T02:50:44Z)
- Shi-NeSS: Detecting Good and Stable Keypoints with a Neural Stability Score [73.91231776658375]
We build on the principled and localized keypoints provided by the Shi detector and perform their selection using the keypoint stability score regressed by the neural network.
We evaluate Shi-NeSS on HPatches, ScanNet, MegaDepth and IMC-PT, demonstrating state-of-the-art performance and good generalization on downstream tasks.
arXiv Detail & Related papers (2023-07-03T14:50:14Z)
- Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation [79.78017059539526]
We propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based detection framework.
In experiments, we observe that KAPAO is significantly faster and more accurate than previous methods, which suffer greatly from heatmap post-processing.
Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft COCO Keypoints validation set without test-time augmentation.
arXiv Detail & Related papers (2021-11-16T15:36:44Z)
- Keypoint Autoencoders: Learning Interest Points of Semantics [4.551313396927381]
We propose Keypoint Autoencoder, an unsupervised learning method for detecting keypoints.
We encourage selecting sparse semantic keypoints by enforcing the reconstruction from keypoints to the original point cloud.
A downstream task of classifying shape with sparse keypoints is conducted to demonstrate the distinctiveness of our selected keypoints.
arXiv Detail & Related papers (2020-08-11T03:43:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.