Related papers: Greedy Offset-Guided Keypoint Grouping for Human Pose Estimation

URL: http://arxiv.org/abs/2107.03098v1
Date: Wed, 7 Jul 2021 09:32:01 GMT
Title: Greedy Offset-Guided Keypoint Grouping for Human Pose Estimation
Authors: Jia Li, Linhua Xiang, Jiwei Chen, Zengfu Wang
Abstract summary: We employ an Hourglass Network to infer all the keypoints from different persons indiscriminately. We greedily group the candidate keypoints into multiple human poses, utilizing the predicted guiding offsets. Our approach is comparable to the state of the art on the challenging COCO dataset under fair conditions.
Score: 31.468003041368814
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a simple yet reliable bottom-up approach to multi-person pose estimation that offers a good trade-off between accuracy and efficiency. Given an image, we employ an Hourglass Network to infer all keypoints from different persons indiscriminately, together with the guiding offsets that connect adjacent keypoints belonging to the same person. We then greedily group the candidate keypoints into multiple human poses (if any) using the predicted guiding offsets; we refer to this process as greedy offset-guided keypoint grouping (GOG). Moreover, we revisit the encoding-decoding method for multi-person keypoint coordinates and reveal some important facts affecting accuracy. Experiments demonstrate clear performance improvements from the introduced components. Our approach is comparable to the state of the art on the challenging COCO dataset under fair conditions. The source code and our pre-trained model are publicly available online.
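
Illustrative sketch (not the authors' released code): the abstract describes grouping candidate keypoints greedily with predicted guiding offsets. The Python below is one minimal way such a step could look, under several assumptions that do not come from the paper: candidates are already extracted as (x, y) points per keypoint type, each source candidate carries one predicted (dx, dy) offset per skeleton edge, the skeleton is a tree listed from the root outward, and matches farther than a hypothetical `max_dist` threshold are rejected. The function name `group_keypoints` and all parameter names are invented for illustration.

```python
import math


def group_keypoints(candidates, offsets, skeleton, max_dist=10.0):
    """Greedily assemble poses from candidate keypoints (illustrative only).

    candidates: dict mapping keypoint type -> list of (x, y) candidates.
    offsets:    dict mapping (src_type, dst_type) -> list of (dx, dy),
                one guiding offset per src candidate (assumed layout).
    skeleton:   list of (src_type, dst_type) edges, root first, tree-shaped.
    max_dist:   hypothetical threshold on the guide-to-candidate distance.
    Returns a list of poses, each a dict mapping keypoint type -> (x, y).
    """
    root = skeleton[0][0]
    # Every root candidate seeds one pose; chosen[i] stores candidate indices.
    poses = [{root: p} for p in candidates.get(root, [])]
    chosen = [{root: i} for i in range(len(poses))]

    for src, dst in skeleton:
        pairs = []
        for pose_id, idx in enumerate(chosen):
            if src not in idx:
                continue  # this pose has no source keypoint to start from
            sx, sy = candidates[src][idx[src]]
            dx, dy = offsets[(src, dst)][idx[src]]
            gx, gy = sx + dx, sy + dy  # location the guiding offset points to
            for dst_id, (x, y) in enumerate(candidates.get(dst, [])):
                dist = math.hypot(x - gx, y - gy)
                if dist <= max_dist:
                    pairs.append((dist, pose_id, dst_id))

        # Greedy step: accept the closest pairs first, one-to-one.
        pairs.sort()
        used_poses, used_dst = set(), set()
        for dist, pose_id, dst_id in pairs:
            if pose_id in used_poses or dst_id in used_dst:
                continue
            used_poses.add(pose_id)
            used_dst.add(dst_id)
            chosen[pose_id][dst] = dst_id
            poses[pose_id][dst] = candidates[dst][dst_id]
    return poses


# Toy example with two people and a single neck -> hip edge.
example_candidates = {"neck": [(10, 10), (50, 10)], "hip": [(52, 40), (11, 41)]}
example_offsets = {("neck", "hip"): [(0, 30), (0, 30)]}
example_poses = group_keypoints(
    example_candidates, example_offsets, [("neck", "hip")]
)
```

In this toy case each neck is paired with the hip nearest to the point its offset indicates. The sketch simply drops unmatched destination candidates and leaves out keypoint scores; how the paper handles those cases is not stated in the abstract.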

Related papers

Hybrid model for Single-Stage Multi-Person Pose Estimation [3.592448408054345]
We propose a hybrid model for single-stage multi-person pose estimation, named HybridPose. It is capable not only of detecting densely placed keypoints, but also of filtering out non-existent keypoints in an image.
arXiv Detail & Related papers (2023-05-02T02:55:29Z)
MDPose: Real-Time Multi-Person Pose Estimation via Mixture Density Model [27.849059115252008]
We propose a novel framework of single-stage instance-aware pose estimation by modeling the joint distribution of human keypoints. Our MDPose achieves state-of-the-art performance by successfully learning the high-dimensional joint distribution of human keypoints.
arXiv Detail & Related papers (2023-02-17T08:29:33Z)
AdaptivePose++: A Powerful Single-Stage Network for Multi-Person Pose Regression [66.39539141222524]
We propose to represent the human parts as adaptive points and introduce a fine-grained body representation method. With the proposed body representation, we deliver a compact single-stage multi-person pose regression network, termed as AdaptivePose. We employ AdaptivePose for both 2D/3D multi-person pose estimation tasks to verify the effectiveness of AdaptivePose.
arXiv Detail & Related papers (2022-10-08T12:54:20Z)
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation [79.78017059539526]
We propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based detection framework. In experiments, we observe that KAPAO is significantly faster and more accurate than previous methods, which suffer greatly from heatmap post-processing. Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft COCO Keypoints validation set without test-time augmentation.
arXiv Detail & Related papers (2021-11-16T15:36:44Z)
Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression [81.05772887221333]
We study the dense keypoint regression framework, which was previously inferior to the keypoint detection and grouping framework. We present a simple yet effective approach, named disentangled keypoint regression (DEKR). We empirically show that the proposed direct regression method outperforms keypoint detection and grouping methods.
arXiv Detail & Related papers (2021-04-06T05:54:46Z)
Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation [95.72606536493548]
Multi-person pose estimation is challenging because it localizes body keypoints for multiple persons simultaneously. We propose a novel differentiable Hierarchical Graph Grouping (HGG) method to learn the graph grouping in bottom-up multi-person pose estimation task.
arXiv Detail & Related papers (2020-07-23T08:46:22Z)
Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates [76.51095823248104]
We present several schemes, rarely or not thoroughly studied before, for improving keypoint detection and grouping (keypoint regression) performance. First, we exploit the keypoint heatmaps for pixel-wise keypoint regression, rather than keeping the two separate, to improve keypoint regression. Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations for handling scale and orientation variance. Third, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses.
arXiv Detail & Related papers (2020-06-28T01:14:59Z)
Joint COCO and Mapillary Workshop at ICCV 2019 Keypoint Detection Challenge Track Technical Report: Distribution-Aware Coordinate Representation for Human Pose Estimation [36.73217430761146]
We focus on the coordinate representation in human pose estimation. We propose a principled distribution-aware decoding method. Taking them together, we formulate a novel Distribution-Aware coordinate Representation for Keypoint (DARK) method.
arXiv Detail & Related papers (2020-03-13T10:22:36Z)
Towards High Performance Human Keypoint Detection [87.1034745775229]
We find that context information plays an important role in reasoning about human body configuration and invisible keypoints. Inspired by this, we propose a cascaded context mixer (CCM), which efficiently integrates spatial and channel context information. To maximize CCM's representation capability, we develop a hard-negative person detection mining strategy and a joint-training strategy. We present several sub-pixel refinement techniques for postprocessing keypoint predictions to improve detection accuracy.
arXiv Detail & Related papers (2020-02-03T02:24:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.