Recognition of Freely Selected Keypoints on Human Limbs
- URL: http://arxiv.org/abs/2204.06326v1
- Date: Wed, 13 Apr 2022 11:58:28 GMT
- Title: Recognition of Freely Selected Keypoints on Human Limbs
- Authors: Katja Ludwig, Daniel Kienzle, Rainer Lienhart
- Abstract summary: We use the Vision Transformer architecture to extend the capability of the model to detect arbitrary keypoints on the limbs of persons.
Our approaches achieve similar results to TokenPose on the fixed keypoints and are capable of detecting arbitrary keypoints on the limbs.
- Score: 18.176606453818557
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nearly all Human Pose Estimation (HPE) datasets consist of a fixed set of
keypoints. Standard HPE models trained on such datasets can only detect these
keypoints. If more points are desired, they have to be manually annotated and
the model needs to be retrained. Our approach leverages the Vision Transformer
architecture to extend the capability of the model to detect arbitrary
keypoints on the limbs of persons. We propose two different approaches to
encode the desired keypoints. (1) Each keypoint is defined by its position
along the line between the two enclosing keypoints from the fixed set and its
relative distance between this line and the edge of the limb. (2) Keypoints are
defined as coordinates on a norm pose. Both approaches are based on the
TokenPose architecture, while the keypoint tokens that correspond to the fixed
keypoints are replaced with our novel module. Experiments show that our
approaches achieve similar results to TokenPose on the fixed keypoints and are
capable of detecting arbitrary keypoints on the limbs.
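As a rough illustration of encoding (1) from the abstract, the sketch below places a point on a limb from its position along the line between the two enclosing keypoints and its relative distance from that line toward the limb edge. This is not the authors' implementation: the function name `limb_keypoint`, the parameters `t`, `s`, and `half_width`, and the assumption that the limb half-width at that position is already known (for example from a segmentation mask) are hypothetical and chosen only for illustration.

```python
import math


def limb_keypoint(a, b, t, s, half_width):
    """Place a point on a limb (hypothetical helper, not from the paper).

    a, b        -- (x, y) of the two enclosing fixed keypoints
    t           -- position along the line a -> b, 0.0 at a and 1.0 at b
    s           -- relative distance from the line toward the limb edge,
                   -1.0 and 1.0 being the two edges, 0.0 the line itself
    half_width  -- assumed known distance from the line to the limb edge
    """
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("enclosing keypoints must be distinct")

    # Unit normal to the line a -> b (the line direction rotated by 90 degrees).
    nx, ny = -dy / length, dx / length

    # Point on the line at fraction t, then offset along the normal.
    px, py = ax + t * dx, ay + t * dy
    return (px + s * half_width * nx, py + s * half_width * ny)
```

With `s = 0` the point lies on the line between the two fixed keypoints, and with `t = 0` and `s = 0` it coincides with the first fixed keypoint, so the fixed keypoints are a special case of this parametrisation.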
Related papers
- GMM-IKRS: Gaussian Mixture Models for Interpretable Keypoint Refinement and Scoring [9.322937309882022]
Keypoints come with a score that allows them to be ranked by quality.
While learned keypoints often exhibit better properties than handcrafted ones, their scores are not easily interpretable.
We propose a framework that can refine, and at the same time characterize with an interpretable score, the keypoints extracted by any method.
arXiv Detail & Related papers (2024-08-30T09:39:59Z)
- Meta-Point Learning and Refining for Category-Agnostic Pose Estimation [46.98479393474727]
Category-agnostic pose estimation (CAPE) aims to predict keypoints for arbitrary classes given a few support images annotated with keypoints.
We propose a novel framework for CAPE based on such potential keypoints (named meta-points).
arXiv Detail & Related papers (2024-03-20T14:54:33Z) - DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local
Feature Matching [14.837075102089]
Keypoint detection is a pivotal step in 3D reconstruction, whereby sets of (up to) K points are detected in each view of a scene.
Previous learning-based methods typically learn descriptors with keypoints, and treat the keypoint detection as a binary classification task on mutual nearest neighbours.
In this work, we learn keypoints directly from 3D consistency. To this end, we derive a semi-supervised two-view detection objective to expand this set to a desired number of detections.
Results show that our approach, DeDoDe, achieves significant gains on multiple geometry benchmarks.
arXiv Detail & Related papers (2023-08-16T16:37:02Z) - Pose for Everything: Towards Category-Agnostic Pose Estimation [93.07415325374761]
Category-Agnostic Pose Estimation (CAPE) aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.
A transformer-based Keypoint Interaction Module (KIM) is proposed to capture both the interactions among different keypoints and the relationship between the support and query images.
We also introduce Multi-category Pose (MP-100) dataset, which is a 2D pose dataset of 100 object categories containing over 20K instances and is well-designed for developing CAPE algorithms.
arXiv Detail & Related papers (2022-07-21T09:40:54Z) - End-to-End Learning of Keypoint Representations for Continuous Control
from Images [84.8536730437934]
We show that it is possible to learn efficient keypoint representations end-to-end, without the need for unsupervised pre-training, decoders, or additional losses.
Our proposed architecture consists of a differentiable keypoint extractor that feeds the coordinates directly to a soft actor-critic agent.
arXiv Detail & Related papers (2021-06-15T09:17:06Z) - Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression [81.05772887221333]
We study the dense keypoint regression framework, which has previously been inferior to the keypoint detection and grouping framework.
We present a simple yet effective approach, named disentangled keypoint regression (DEKR).
We empirically show that the proposed direct regression method outperforms keypoint detection and grouping methods.
arXiv Detail & Related papers (2021-04-06T05:54:46Z) - UKPGAN: A General Self-Supervised Keypoint Detector [43.35270822722044]
UKPGAN is a general self-supervised 3D keypoint detector.
Our keypoints align well with human annotated keypoint labels.
Our model is stable under both rigid and non-rigid transformations.
arXiv Detail & Related papers (2020-11-24T09:08:21Z) - Keypoint Autoencoders: Learning Interest Points of Semantics [4.551313396927381]
We propose Keypoint Autoencoder, an unsupervised learning method for detecting keypoints.
We encourage selecting sparse semantic keypoints by enforcing the reconstruction from keypoints to the original point cloud.
A downstream task of classifying shape with sparse keypoints is conducted to demonstrate the distinctiveness of our selected keypoints.
arXiv Detail & Related papers (2020-08-11T03:43:18Z) - Point-Set Anchors for Object Detection, Instance Segmentation and Pose
Estimation [85.96410825961966]
We argue that the image features extracted at a central point contain limited information for predicting distant keypoints or bounding box boundaries.
To facilitate inference, we propose to instead perform regression from a set of points placed at more advantageous positions.
We apply this proposed framework, called Point-Set Anchors, to object detection, instance segmentation, and human pose estimation.
arXiv Detail & Related papers (2020-07-06T15:59:56Z) - Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive
Keypoint Estimates [76.51095823248104]
We present several schemes, rarely or only superficially studied before, for improving keypoint detection and grouping (keypoint regression) performance.
First, we exploit the keypoint heatmaps for pixel-wise keypoint regression, rather than keeping the two separate, to improve keypoint regression.
Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations for handling the scale and orientation variance.
Third, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses.
arXiv Detail & Related papers (2020-06-28T01:14:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.