Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation
- URL: http://arxiv.org/abs/2306.17074v1
- Date: Thu, 29 Jun 2023 16:24:32 GMT
- Title: Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation
- Authors: Zhongwei Qiu, Qiansheng Yang, Jian Wang, Xiyu Wang, Chang Xu, Dongmei Fu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
- Abstract summary: We propose DiffusionPose, a new scheme that formulates 2D human pose estimation as a keypoints heatmaps generation problem from noised heatmaps.
During training, the keypoints are diffused to random distribution by adding noises and the diffusion model learns to recover ground-truth heatmaps from noised heatmaps.
Experiments show the prowess of our scheme with improvements of 1.6, 1.2, and 1.2 mAP on widely-used COCO, CrowdPose, and AI Challenge datasets.
- Score: 71.24808323646167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the mainstream schemes for 2D human pose estimation (HPE) is learning
keypoints heatmaps by a neural network. Existing methods typically improve the
quality of heatmaps by customized architectures, such as high-resolution
representation and vision Transformers. In this paper, we propose
DiffusionPose, a new scheme that formulates 2D HPE as a keypoints
heatmaps generation problem from noised heatmaps. During training, the
keypoints are diffused to random distribution by adding noises and the
diffusion model learns to recover ground-truth heatmaps from noised heatmaps
with respect to conditions constructed by image feature. During inference, the
diffusion model generates heatmaps from initialized heatmaps in a progressive
denoising way. Moreover, we further explore improving the performance of
DiffusionPose with conditions from human structural information. Extensive
experiments show the prowess of our DiffusionPose, with improvements of 1.6,
1.2, and 1.2 mAP on widely-used COCO, CrowdPose, and AI Challenge datasets,
respectively.
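
The training-time noising described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a standard DDPM-style forward process with a linear beta schedule, and the helper names (`gaussian_heatmap`, `make_alpha_bar`, `q_sample`), the heatmap size, and the sigma are all hypothetical choices made for the example. The conditional denoising network and the progressive denoising loop used at inference are not shown.

```python
import numpy as np

def gaussian_heatmap(h, w, cx, cy, sigma=2.0):
    """Ground-truth heatmap: a 2D Gaussian centred on one keypoint (cx, cy)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Noise a clean heatmap x0 to step t in closed form.

    x_t = sqrt(alpha_bar[t]) * x0 + sqrt(1 - alpha_bar[t]) * noise
    """
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()

# Hypothetical 64x48 heatmap with a keypoint at column 20, row 30.
x0 = gaussian_heatmap(64, 48, cx=20, cy=30)

# An early step keeps most of the keypoint structure; the last step is
# close to pure Gaussian noise.
x_early, noise_early = q_sample(x0, 10, alpha_bar, rng)
x_late, noise_late = q_sample(x0, 999, alpha_bar, rng)
```

In a training loop built on this sketch, a network would receive the noised heatmap, the step index, and image-feature conditions, and would be optimised to recover `x0`; how DiffusionPose constructs those conditions is described in the paper itself.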
Related papers
- Salt & Pepper Heatmaps: Diffusion-informed Landmark Detection Strategy [6.276791657895805]
Anatomical Landmark Detection is a process of identifying key areas of an image for clinical measurements.
A machine learning model predicts the locus of a landmark as a probability region represented by a heatmap.
We reformulate automatic Anatomical Landmark Detection as a precise generative modelling task, producing a few-hot pixel heatmap.
arXiv Detail & Related papers (2024-07-12T11:50:39Z)
- DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation [16.32910684198013]
We present DiffPose, a novel diffusion architecture that formulates video-based human pose estimation as a conditional heatmap generation problem.
We show two unique characteristics from DiffPose on pose estimation task: (i) the ability to combine multiple sets of pose estimates to improve prediction accuracy, particularly for challenging joints, and (ii) the ability to adjust the number of iterative steps for feature refinement without retraining the model.
arXiv Detail & Related papers (2023-07-31T14:00:23Z)
- DistilPose: Tokenized Pose Regression with Heatmap Distillation [81.21273854769765]
We propose a novel human pose estimation framework termed DistilPose, which bridges the gaps between heatmap-based and regression-based methods.
DistilPose maximizes the transfer of knowledge from the teacher model (heatmap-based) to the student model (regression-based) through Token-distilling (TDE) and Simulated Heatmaps.
arXiv Detail & Related papers (2023-03-04T16:56:29Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates [76.51095823248104]
We present several schemes that are rarely or unthoroughly studied before for improving keypoint detection and grouping (keypoint regression) performance.
First, we exploit the keypoint heatmaps for pixel-wise keypoint regression instead of separating them for improving keypoint regression.
Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations for handling the scale and orientation variance.
Third, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses.
arXiv Detail & Related papers (2020-06-28T01:14:59Z)
- A Transfer Learning approach to Heatmap Regression for Action Unit intensity estimation [50.261472059743845]
Action Units (AUs) are geometrically-based atomic facial muscle movements.
We propose a novel AU modelling problem that consists of jointly estimating their localisation and intensity.
A Heatmap models whether an AU occurs or not at a given spatial location.
arXiv Detail & Related papers (2020-04-14T16:51:13Z)
- Attentive One-Dimensional Heatmap Regression for Facial Landmark Detection and Tracking [73.35078496883125]
We propose a novel attentive one-dimensional heatmap regression method for facial landmark localization.
First, we predict two groups of 1D heatmaps to represent the marginal distributions of the x and y coordinates.
Second, a co-attention mechanism is adopted to model the inherent spatial patterns existing in x and y coordinates.
Third, based on the 1D heatmap structures, we propose a facial landmark detector capturing spatial patterns for landmark detection on an image.
arXiv Detail & Related papers (2020-04-05T06:51:22Z)