Joint COCO and Mapillary Workshop at ICCV 2019 Keypoint Detection
Challenge Track Technical Report: Distribution-Aware Coordinate
Representation for Human Pose Estimation
- URL: http://arxiv.org/abs/2003.07232v1
- Date: Fri, 13 Mar 2020 10:22:36 GMT
- Title: Joint COCO and Mapillary Workshop at ICCV 2019 Keypoint Detection
Challenge Track Technical Report: Distribution-Aware Coordinate
Representation for Human Pose Estimation
- Authors: Hanbin Dai, Liangbo Zhou, Feng Zhang, Zhengyu Zhang, Hong Hu, Xiatian
Zhu, Mao Ye
- Abstract summary: We focus on the coordinate representation in human pose estimation.
We propose a principled distribution-aware decoding method.
Combining this with an improved coordinate encoding process, we formulate a novel Distribution-Aware coordinate Representation for Keypoint (DARK) method.
- Score: 36.73217430761146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we focus on the coordinate representation in human pose
estimation. While being the standard choice, heatmap-based representation has
not been systematically investigated. We found that the process of coordinate
decoding (i.e. transforming the predicted heatmaps to the coordinates) is
surprisingly significant for human pose estimation performance, which
nevertheless was not recognised before. In light of the discovered importance,
we further probe the design limitations of the standard coordinate decoding
method and propose a principled distribution-aware decoding method. Meanwhile,
we improve the standard coordinate encoding process (i.e. transforming
ground-truth coordinates to heatmaps) by generating accurate heatmap
distributions for unbiased model training. Taking them together, we formulate a
novel Distribution-Aware coordinate Representation for Keypoint (DARK) method.
Serving as a model-agnostic plug-in, DARK significantly improves the
performance of a variety of state-of-the-art human pose estimation models.
Extensive experiments show that DARK yields the best results on the COCO keypoint
detection challenge, validating the usefulness and effectiveness of our novel
coordinate representation idea. The project page containing more details is at
https://ilovepose.github.io/coco
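The encoding and decoding steps described in the abstract can be illustrated with a minimal sketch. It assumes a single-keypoint Gaussian heatmap and recovers the sub-pixel peak from a second-order Taylor expansion of the log-heatmap around the argmax, which is the formulation commonly associated with DARK. The function names (`encode_heatmap`, `decode_argmax`, `decode_distribution_aware`) and all parameters are hypothetical and are not taken from the authors' code. Any smoothing of predicted heatmaps before decoding is omitted here.

```python
import numpy as np


def encode_heatmap(x, y, width, height, sigma=2.0):
    """Render a Gaussian heatmap centred on the sub-pixel location (x, y).

    The centre is NOT rounded to the pixel grid, so the heatmap is an
    unbiased encoding of the ground-truth coordinate.
    """
    xs = np.arange(width, dtype=np.float64)
    ys = np.arange(height, dtype=np.float64)[:, None]
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))


def decode_argmax(heatmap):
    """Baseline decoding: integer location of the maximum, as (x, y)."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return float(col), float(row)


def decode_distribution_aware(heatmap, eps=1e-10):
    """Refine the argmax with one Newton step on the log-heatmap.

    Assumes the heatmap is approximately Gaussian near its peak, so the
    log-heatmap is approximately quadratic there.
    """
    h, w = heatmap.shape
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Finite differences need a full 3x3 neighbourhood around the peak.
    if not (1 <= col < w - 1 and 1 <= row < h - 1):
        return float(col), float(row)

    logp = np.log(np.maximum(heatmap, eps))
    # First derivatives (central differences).
    dx = 0.5 * (logp[row, col + 1] - logp[row, col - 1])
    dy = 0.5 * (logp[row + 1, col] - logp[row - 1, col])
    # Second derivatives.
    dxx = logp[row, col + 1] - 2.0 * logp[row, col] + logp[row, col - 1]
    dyy = logp[row + 1, col] - 2.0 * logp[row, col] + logp[row - 1, col]
    dxy = 0.25 * (logp[row + 1, col + 1] - logp[row + 1, col - 1]
                  - logp[row - 1, col + 1] + logp[row - 1, col - 1])

    hessian = np.array([[dxx, dxy], [dxy, dyy]])
    grad = np.array([dx, dy])
    # Fall back to the argmax if the Hessian is (numerically) singular.
    if abs(np.linalg.det(hessian)) < 1e-12:
        return float(col), float(row)

    offset = -np.linalg.solve(hessian, grad)
    return float(col + offset[0]), float(row + offset[1])


if __name__ == "__main__":
    hm = encode_heatmap(12.3, 20.7, width=48, height=64)
    print("argmax decoding:            ", decode_argmax(hm))
    print("distribution-aware decoding:", decode_distribution_aware(hm))
```

On an ideal Gaussian heatmap the log-heatmap is exactly quadratic, so the refinement recovers the sub-pixel centre up to floating-point error, whereas plain argmax decoding is limited to the pixel grid. Real network outputs are noisier, so the gain in practice depends on how closely the predicted heatmap follows a Gaussian shape.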
Related papers
- MDPose: Real-Time Multi-Person Pose Estimation via Mixture Density Model [27.849059115252008]
We propose a novel framework of single-stage instance-aware pose estimation by modeling the joint distribution of human keypoints.
Our MDPose achieves state-of-the-art performance by successfully learning the high-dimensional joint distribution of human keypoints.
arXiv Detail & Related papers (2023-02-17T08:29:33Z)
- On Coordinate Decoding for Keypoint Estimation Tasks [22.603579615063495]
A series of 2D (and 3D) keypoint estimation tasks are built upon heatmap coordinate representation.
Heatmap coordinate representation allows for learnable and spatially aware encoding and decoding of keypoint coordinates on grids.
arXiv Detail & Related papers (2021-10-19T22:14:48Z)
- Greedy Offset-Guided Keypoint Grouping for Human Pose Estimation [31.468003041368814]
We employ an Hourglass Network to infer all the keypoints from different persons indiscriminately.
We greedily group the candidate keypoints into multiple human poses, utilizing the predicted guiding offsets.
Our approach is comparable to the state of the art on the challenging COCO dataset under fair conditions.
arXiv Detail & Related papers (2021-07-07T09:32:01Z)
- Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement [54.29252286561449]
We propose a two-stage graph-based and model-agnostic framework, called Graph-PCNN.
In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.
In the second stage, for each guided point, a different visual feature is extracted by the localization.
The relationship between guided points is explored by the graph pose refinement module to get more accurate localization results.
arXiv Detail & Related papers (2020-07-21T04:59:15Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
- Train Your Data Processor: Distribution-Aware and Error-Compensation Coordinate Decoding for Human Pose Estimation [14.816632698778049]
We study the heatmap decoding processing with a particular focus on the errors introduced throughout the prediction process.
We then propose Distribution-Aware and Error-Compensation Coordinate Decoding (DAEC).
DAEC learns its decoding strategy from training data and remarkably improves the performance of state-of-the-art human pose estimation models.
arXiv Detail & Related papers (2020-07-12T02:17:29Z)
- Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates [76.51095823248104]
We present several schemes, rarely or insufficiently studied before, for improving keypoint detection and grouping (keypoint regression) performance.
First, we exploit the keypoint heatmaps for pixel-wise keypoint regression, rather than treating the two separately, to improve keypoint regression.
Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations for handling the scale and orientation variance.
Third, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses.
arXiv Detail & Related papers (2020-06-28T01:14:59Z)
- Attentive One-Dimensional Heatmap Regression for Facial Landmark Detection and Tracking [73.35078496883125]
We propose a novel attentive one-dimensional heatmap regression method for facial landmark localization.
First, we predict two groups of 1D heatmaps to represent the marginal distributions of the x and y coordinates.
Second, a co-attention mechanism is adopted to model the inherent spatial patterns existing in x and y coordinates.
Third, based on the 1D heatmap structures, we propose a facial landmark detector capturing spatial patterns for landmark detection on an image.
arXiv Detail & Related papers (2020-04-05T06:51:22Z)
- Learning Delicate Local Representations for Multi-Person Pose Estimation [77.53144055780423]
We propose a novel method called Residual Steps Network (RSN).
RSN aggregates features with the same spatial size (intra-level features) efficiently to obtain delicate local representations.
Our approach won 1st place in the COCO Keypoint Challenge 2019.
arXiv Detail & Related papers (2020-03-09T10:40:49Z)