Rethinking Keypoint Representations: Modeling Keypoints and Poses as
Objects for Multi-Person Human Pose Estimation
- URL: http://arxiv.org/abs/2111.08557v2
- Date: Wed, 17 Nov 2021 12:09:14 GMT
- Authors: William McNally, Kanav Vats, Alexander Wong, John McPhee
- Abstract summary: We propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based detection framework.
In experiments, we observe that KAPAO is significantly faster and more accurate than previous methods, which suffer greatly from heatmap post-processing.
Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft COCO Keypoints validation set without test-time augmentation.
- Score: 79.78017059539526
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In keypoint estimation tasks such as human pose estimation, heatmap-based
regression is the dominant approach despite possessing notable drawbacks:
heatmaps intrinsically suffer from quantization error and require excessive
computation to generate and post-process. Motivated to find a more efficient
solution, we propose a new heatmap-free keypoint estimation method in which
individual keypoints and sets of spatially related keypoints (i.e., poses) are
modeled as objects within a dense single-stage anchor-based detection
framework. Hence, we call our method KAPAO (pronounced "Ka-Pow!") for Keypoints
And Poses As Objects. We apply KAPAO to the problem of single-stage
multi-person human pose estimation by simultaneously detecting human pose
objects and keypoint objects and fusing the detections to exploit the strengths
of both object representations. In experiments, we observe that KAPAO is
significantly faster and more accurate than previous methods, which suffer
greatly from heatmap post-processing. Moreover, the accuracy-speed trade-off is
especially favourable in the practical setting when not using test-time
augmentation. Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft
COCO Keypoints validation set without test-time augmentation while being 2.5x
faster than the next best single-stage model, whose accuracy is 4.0 AP less.
Furthermore, KAPAO excels in the presence of heavy occlusion. On the CrowdPose
test set, KAPAO-L achieves new state-of-the-art accuracy for a single-stage
method with an AP of 68.9.
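The abstract says that pose objects and keypoint objects are fused but does not spell out the rule. The following is a minimal sketch of one plausible fusion step, not the authors' implementation: each keypoint object is matched to the nearest pose keypoint of the same type within a distance threshold, and it replaces that pose keypoint when it is more confident. The class names, the `fuse` function, the `max_dist` value, and the `[x, y, conf]` layout are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class KeypointObject:
    kind: int    # keypoint type index (hypothetical encoding, e.g. 0 = nose)
    x: float
    y: float
    conf: float


@dataclass
class PoseObject:
    keypoints: list  # one [x, y, conf] entry per keypoint type


def fuse(poses, keypoint_objects, max_dist=10.0):
    """Refine pose keypoints with nearby, more confident keypoint objects.

    Assumed rule (not taken from the abstract): match each keypoint object
    to the closest same-type pose keypoint within max_dist pixels.
    """
    for kp in keypoint_objects:
        best_pose, best_dist = None, max_dist
        for pose in poses:
            px, py, _ = pose.keypoints[kp.kind]
            dist = hypot(px - kp.x, py - kp.y)
            if dist < best_dist:
                best_pose, best_dist = pose, dist
        # Overwrite only when a match exists and the keypoint object is more confident.
        if best_pose is not None and kp.conf > best_pose.keypoints[kp.kind][2]:
            best_pose.keypoints[kp.kind] = [kp.x, kp.y, kp.conf]
    return poses


# Toy example: one pose with two keypoints and two candidate keypoint objects.
poses = [PoseObject([[100.0, 50.0, 0.4], [120.0, 80.0, 0.9]])]
candidates = [
    KeypointObject(0, 103.0, 52.0, 0.8),     # close and more confident: used
    KeypointObject(1, 300.0, 300.0, 0.95),   # too far from any pose: ignored
]
fused = fuse(poses, candidates)
print(fused[0].keypoints)
```

In this toy run the first keypoint is replaced by the more confident keypoint object, while the second is left unchanged because its candidate lies outside the distance threshold.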
Related papers
- SHaRPose: Sparse High-Resolution Representation for Human Pose
Estimation [39.936860590417346]
We propose a framework that only uses Sparse High-resolution Representations for human Pose estimation (SHaRPose).
Our model SHaRPose-Base achieves 77.4 AP (+0.5 AP) on the validation set and 76.7 AP (+0.5 AP) on the COCO test-dev set, and runs inference $1.4\times$ faster than ViTPose-Base.
arXiv Detail & Related papers (2023-12-17T16:29:16Z)
- Hybrid model for Single-Stage Multi-Person Pose Estimation [3.592448408054345]
We propose a hybrid model for single-stage multi-person pose estimation, named HybridPose.
It is capable of not only detecting densely placed keypoints, but also filtering out non-existent keypoints in an image.
arXiv Detail & Related papers (2023-05-02T02:55:29Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a pointcloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- MDPose: Real-Time Multi-Person Pose Estimation via Mixture Density Model [27.849059115252008]
We propose a novel framework of single-stage instance-aware pose estimation by modeling the joint distribution of human keypoints.
Our MDPose achieves state-of-the-art performance by successfully learning the high-dimensional joint distribution of human keypoints.
arXiv Detail & Related papers (2023-02-17T08:29:33Z)
- 2D Human Pose Estimation with Explicit Anatomical Keypoints Structure
Constraints [15.124606575017621]
We present a novel 2D human pose estimation method with explicit anatomical keypoints structure constraints.
Our proposed model can be plugged into most existing bottom-up or top-down human pose estimation methods.
Our methods perform favorably against most existing bottom-up and top-down human pose estimation methods.
arXiv Detail & Related papers (2022-12-05T11:01:43Z)
- Bottom-Up 2D Pose Estimation via Dual Anatomical Centers for Small-Scale
Persons [75.86463396561744]
In multi-person 2D pose estimation, the bottom-up methods simultaneously predict poses for all persons.
Our method achieves a 38.4% improvement in bounding box precision and a 39.1% improvement in bounding box recall over the state of the art (SOTA).
For the human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with single-scale testing.
arXiv Detail & Related papers (2022-08-25T10:09:10Z)
- Pose for Everything: Towards Category-Agnostic Pose Estimation [93.07415325374761]
Category-Agnostic Pose Estimation (CAPE) aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.
A transformer-based Keypoint Interaction Module (KIM) is proposed to capture both the interactions among different keypoints and the relationship between the support and query images.
We also introduce the Multi-category Pose (MP-100) dataset, a 2D pose dataset of 100 object categories that contains over 20K instances and is designed for developing CAPE algorithms.
arXiv Detail & Related papers (2022-07-21T09:40:54Z)
- 6D Object Pose Estimation using Keypoints and Part Affinity Fields [24.126513851779936]
The task of 6D object pose estimation from RGB images is an important requirement for autonomous service robots to be able to interact with the real world.
We present a two-step pipeline for estimating the 6 DoF translation and orientation of known objects.
arXiv Detail & Related papers (2021-07-05T14:41:19Z)
- Point-Set Anchors for Object Detection, Instance Segmentation and Pose
Estimation [85.96410825961966]
We argue that the image features extracted at a central point contain limited information for predicting distant keypoints or bounding box boundaries.
To facilitate inference, we propose to instead perform regression from a set of points placed at more advantageous positions.
We apply this proposed framework, called Point-Set Anchors, to object detection, instance segmentation, and human pose estimation.
arXiv Detail & Related papers (2020-07-06T15:59:56Z)
- Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive
Keypoint Estimates [76.51095823248104]
We present several schemes that have rarely or not thoroughly been studied before for improving keypoint detection and grouping (keypoint regression) performance.
First, we exploit the keypoint heatmaps for pixel-wise keypoint regression instead of separating them for improving keypoint regression.
Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations for handling the scale and orientation variance.
Third, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses.
arXiv Detail & Related papers (2020-06-28T01:14:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.