Hybrid model for Single-Stage Multi-Person Pose Estimation
- URL: http://arxiv.org/abs/2305.01167v2
- Date: Mon, 19 Jun 2023 00:58:12 GMT
- Title: Hybrid model for Single-Stage Multi-Person Pose Estimation
- Authors: Jonghyun Kim, Bosang Kim, Hyotae Lee, Jungpyo Kim, Wonhyeok Im,
Lanying Jin, Dowoo Kwon, and Jungho Lee
- Abstract summary: We propose a hybrid model for single-stage multi-person pose estimation, named HybridPose.
It is capable of not only detecting densely placed keypoints, but also filtering the non-existent keypoints in an image.
- Score: 3.592448408054345
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In general, human pose estimation methods are categorized into two approaches
according to their architectures: regression (i.e., heatmap-free) and
heatmap-based methods. The former directly estimates the precise coordinates of
each keypoint using convolutional and fully-connected layers. Although this
approach is able to detect overlapped and dense keypoints, it can produce
unexpected results for keypoints that do not exist in a scene. On the other hand, the
latter is able to filter the non-existent ones out by utilizing predicted
heatmaps for each keypoint. Nevertheless, it suffers from quantization error
when obtaining the keypoint coordinates from its heatmaps. In addition, unlike
the regression approach, it has difficulty distinguishing densely placed keypoints in
an image. To this end, we propose a hybrid model for single-stage multi-person
pose estimation, named HybridPose, which overcomes the drawbacks of
both approaches by maximizing their strengths. Furthermore, we introduce a
self-correlation loss to inject spatial dependencies between keypoint
coordinates and their visibility. Therefore, HybridPose is capable of not only
detecting densely placed keypoints, but also filtering the non-existent
keypoints in an image. Experimental results demonstrate that the proposed
HybridPose provides keypoint visibility without degrading
pose estimation accuracy.
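As a rough illustration of the two failure modes the abstract describes, the Python sketch below (standard library only) decodes a keypoint from a downsampled heatmap by taking its peak cell, which exposes the quantization error, and then gates directly regressed coordinates with per-keypoint visibility scores. The stride, the Gaussian heatmap, the 0.5 threshold, the function names, and all numbers are illustrative assumptions; this is not HybridPose's architecture or its self-correlation loss, which the abstract does not specify.

```python
import math


def gaussian_heatmap(cx, cy, width, height, sigma=1.0):
    """Render a Gaussian peak centred at (cx, cy), given in heatmap cells."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def decode_argmax(heatmap, stride):
    """Heatmap-style decoding: pick the peak cell and map it back to pixels."""
    best_score, best_x, best_y = -1.0, 0, 0
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_score, best_x, best_y = score, x, y
    return best_x * stride, best_y * stride, best_score


def filter_by_visibility(regressed, visibility, threshold=0.5):
    """Keep regressed (x, y) pairs only where visibility passes the threshold."""
    return [xy if v >= threshold else None
            for xy, v in zip(regressed, visibility)]


# Assumed setup: the heatmap is 4x smaller than the input image.
STRIDE = 4
true_xy = (101.0, 57.0)  # hypothetical ground-truth keypoint, in pixels

hm = gaussian_heatmap(true_xy[0] / STRIDE, true_xy[1] / STRIDE, 64, 48)
px, py, peak = decode_argmax(hm, STRIDE)
quantization_error = math.hypot(px - true_xy[0], py - true_xy[1])

# Made-up regression outputs and visibility scores for two keypoints.
regressed = [(101.0, 57.0), (12.5, 40.0)]
visibility = [0.9, 0.1]
kept = filter_by_visibility(regressed, visibility)

print(f"argmax decode: ({px}, {py}), error = {quantization_error:.3f} px")
print(f"kept keypoints: {kept}")
```

With a stride of 4, the decoded point lands on (100, 56) although the keypoint was placed at (101, 57), while the low-visibility keypoint is dropped rather than reported at an arbitrary location.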
Related papers
- GMM-IKRS: Gaussian Mixture Models for Interpretable Keypoint Refinement and Scoring [9.322937309882022]
Keypoints come with a score that allows them to be ranked according to their quality.
While learned keypoints often exhibit better properties than handcrafted ones, their scores are not easily interpretable.
We propose a framework that can refine, and at the same time characterize with an interpretable score, the keypoints extracted by any method.
arXiv Detail & Related papers (2024-08-30T09:39:59Z)
- Poseur: Direct Human Pose Regression with Transformers [119.79232258661995]
We propose a direct, regression-based approach to 2D human pose estimation from single images.
Our framework is end-to-end differentiable, and naturally learns to exploit the dependencies between keypoints.
Ours is the first regression-based approach to perform favorably compared to the best heatmap-based pose estimation methods.
arXiv Detail & Related papers (2022-01-19T04:31:57Z)
- Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation [79.78017059539526]
We propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based detection framework.
In experiments, we observe that KAPAO is significantly faster and more accurate than previous methods, which suffer greatly from heatmap post-processing.
Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft COCO Keypoints validation set without test-time augmentation.
arXiv Detail & Related papers (2021-11-16T15:36:44Z)
- Improving Robustness for Pose Estimation via Stable Heatmap Regression [19.108116394510258]
A heatmap regression method is proposed to alleviate network vulnerability to small perturbations.
A maximum stability training loss is used to reduce the optimization difficulty.
The proposed method achieves a significant advance in robustness over state-of-the-art approaches on two benchmark datasets.
arXiv Detail & Related papers (2021-05-08T03:07:05Z)
- Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression [81.05772887221333]
We study the dense keypoint regression framework that was previously inferior to the keypoint detection and grouping framework.
We present a simple yet effective approach, named disentangled keypoint regression (DEKR).
We empirically show that the proposed direct regression method outperforms keypoint detection and grouping methods.
arXiv Detail & Related papers (2021-04-06T05:54:46Z)
- Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement [54.29252286561449]
We propose a two-stage graph-based and model-agnostic framework, called Graph-PCNN.
In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.
In the second stage, for each guided point, a different visual feature is extracted by the localization.
The relationship between guided points is explored by the graph pose refinement module to get more accurate localization results.
arXiv Detail & Related papers (2020-07-21T04:59:15Z)
- Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates [76.51095823248104]
We present several schemes that have rarely or insufficiently been studied before for improving keypoint detection and grouping (keypoint regression) performance.
First, we exploit the keypoint heatmaps for pixel-wise keypoint regression instead of separating them for improving keypoint regression.
Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations for handling the scale and orientation variance.
Third, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses.
arXiv Detail & Related papers (2020-06-28T01:14:59Z)
- Attentive One-Dimensional Heatmap Regression for Facial Landmark Detection and Tracking [73.35078496883125]
We propose a novel attentive one-dimensional heatmap regression method for facial landmark localization.
First, we predict two groups of 1D heatmaps to represent the marginal distributions of the x and y coordinates.
Second, a co-attention mechanism is adopted to model the inherent spatial patterns existing in x and y coordinates.
Third, based on the 1D heatmap structures, we propose a facial landmark detector capturing spatial patterns for landmark detection on an image.
arXiv Detail & Related papers (2020-04-05T06:51:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.