A Coarse-to-Fine Human Pose Estimation Method based on Two-stage Distillation and Progressive Graph Neural Network
- URL: http://arxiv.org/abs/2508.11212v1
- Date: Fri, 15 Aug 2025 04:41:49 GMT
- Title: A Coarse-to-Fine Human Pose Estimation Method based on Two-stage Distillation and Progressive Graph Neural Network
- Authors: Zhangjian Ji, Wenjin Zhang, Shaotong Qiao, Kai Feng, Yuhua Qian
- Abstract summary: We propose a novel coarse-to-fine two-stage knowledge distillation framework for human pose estimation. In the first-stage distillation, we introduce the human joints structure loss to mine the structural information among human joints. In the second-stage distillation, we utilize an Image-Guided Progressive Graph Convolutional Network (IGP-GCN) to refine the initial human pose.
- Score: 13.555932146811658
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Human pose estimation has been widely applied in human-centric understanding and generation, but most existing state-of-the-art human pose estimation methods require heavy computational resources for accurate predictions. To obtain an accurate, robust, yet lightweight human pose estimator, one feasible way is to transfer pose knowledge from a powerful teacher model to a less-parameterized student model by knowledge distillation. However, the traditional knowledge distillation framework does not fully explore the contextual information among human joints. Thus, in this paper, we propose a novel coarse-to-fine two-stage knowledge distillation framework for human pose estimation. In the first-stage distillation, we introduce the human joints structure loss to mine the structural information among human joints so as to transfer high-level semantic knowledge from the teacher model to the student model. In the second-stage distillation, we utilize an Image-Guided Progressive Graph Convolutional Network (IGP-GCN) to refine the initial human pose obtained from the first-stage distillation, and we supervise the training of the IGP-GCN progressively with the final output pose of the teacher model. Extensive experiments on the COCO keypoint and CrowdPose benchmark datasets show that our proposed method performs favorably against many existing state-of-the-art human pose estimation methods; the performance improvement is more significant on the more complex CrowdPose dataset.
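The abstract does not give the form of the human joints structure loss. As a rough illustration only, one hypothetical way to compare the structure of a student pose with that of a teacher pose is to match the pairwise distances between joints. The function names, the use of Euclidean distances, and the mean-squared penalty below are all assumptions for this sketch, not the authors' formulation.

```python
import math

def pairwise_distances(joints):
    """Return the matrix of Euclidean distances between all joint pairs."""
    return [[math.dist(a, b) for b in joints] for a in joints]

def structure_loss(student_joints, teacher_joints):
    """Hypothetical structure loss: mean squared difference between the
    student's and the teacher's pairwise joint-distance matrices."""
    if len(student_joints) != len(teacher_joints):
        raise ValueError("student and teacher must have the same number of joints")
    ds = pairwise_distances(student_joints)
    dt = pairwise_distances(teacher_joints)
    n = len(student_joints)
    total = sum((ds[i][j] - dt[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)

# Toy 2D poses with three joints each (made-up coordinates).
teacher = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
shifted_student = [(1.0, 1.0), (4.0, 5.0), (7.0, 9.0)]    # same shape, translated
distorted_student = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]  # last joint moved

print(structure_loss(shifted_student, teacher))
print(structure_loss(distorted_student, teacher))
```

Because this sketch depends only on distances between joints, a translated copy of the teacher pose incurs no penalty, while a pose whose joints have moved relative to each other does.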
Related papers
- UniSH: Unifying Scene and Human Reconstruction in a Feed-Forward Pass [83.7071371474926]
UniSH is a unified, feed-forward framework for joint metric-scale 3D scene and human reconstruction. Our framework bridges strong, disparate priors from scene reconstruction and HMR. Our model achieves state-of-the-art performance on human-centric scene reconstruction.
arXiv Detail & Related papers (2026-01-03T16:06:27Z)
- A New Teacher-Reviewer-Student Framework for Semi-supervised 2D Human Pose Estimation [33.01458098153753]
We propose a novel semi-supervised 2D human pose estimation method by utilizing a newly designed Teacher-Reviewer-Student framework. Specifically, we first mimic the phenomenon that human beings constantly review previous knowledge for consolidation to design our framework. Secondly, we introduce a Multi-level Feature Learning strategy, which utilizes the outputs from different stages of the backbone to estimate the heatmap to guide network training.
arXiv Detail & Related papers (2025-01-16T14:40:02Z)
- Denoising and Selecting Pseudo-Heatmaps for Semi-Supervised Human Pose Estimation [38.97427474379367]
We introduce a denoising scheme to generate reliable pseudo-heatmaps as targets for learning from unlabeled data.
We select the learning targets from these pseudo-heatmaps guided by the estimated cross-student uncertainty.
Our results show that our model outperforms previous state-of-the-art semi-supervised pose estimators.
arXiv Detail & Related papers (2023-09-29T19:17:30Z)
- Effective Whole-body Pose Estimation with Two-stages Distillation [52.92064408970796]
Whole-body pose estimation localizes the human body, hand, face, and foot keypoints in an image.
We present a two-stage pose Distillation for Whole-body Pose estimators, named DWPose, to improve their effectiveness and efficiency.
arXiv Detail & Related papers (2023-07-29T03:49:28Z)
- Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity [46.10812760258666]
Frequent interactions between individuals are a fundamental challenge for pose estimation algorithms.
We propose a novel pipeline called bottom-up conditioned top-down pose estimation.
We demonstrate the performance and efficiency of our approach on animal and human pose estimation benchmarks.
arXiv Detail & Related papers (2023-06-13T16:14:40Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Bottom-Up 2D Pose Estimation via Dual Anatomical Centers for Small-Scale Persons [75.86463396561744]
In multi-person 2D pose estimation, the bottom-up methods simultaneously predict poses for all persons.
Our method achieves a 38.4% improvement in bounding box precision and a 39.1% improvement in bounding box recall over the state of the art (SOTA).
For the human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with single-scale testing.
arXiv Detail & Related papers (2022-08-25T10:09:10Z)
- Low-resolution Human Pose Estimation [49.531572116079026]
We propose a novel Confidence-Aware Learning (CAL) method for low-resolution pose estimation.
CAL addresses two fundamental limitations of existing offset learning methods: inconsistent training and testing, and decoupled heatmap and offset learning.
Our method significantly outperforms the state-of-the-art methods for low-resolution human pose estimation.
arXiv Detail & Related papers (2021-09-19T09:13:57Z)
- Online Knowledge Distillation for Efficient Pose Estimation [37.81478634850458]
We investigate a novel Online Knowledge Distillation framework by distilling Human Pose structure knowledge in a one-stage manner.
OKDHP trains a single multi-branch network and acquires the predicted heatmaps from each branch.
The pixel-wise Kullback-Leibler divergence is utilized to minimize the discrepancy between the target heatmaps and the predicted ones.
arXiv Detail & Related papers (2021-08-04T14:49:44Z)
- Data-Efficient Ranking Distillation for Image Retrieval [15.88955427198763]
Recent approaches tackle this issue using knowledge distillation to transfer knowledge from a deeper and heavier architecture to a much smaller network.
In this paper we address knowledge distillation for metric learning problems.
Unlike previous approaches, our proposed method jointly addresses the following constraints: i) limited queries to the teacher model, ii) a black-box teacher model with access only to the final output representation, and iii) a small fraction of the original training data without any ground-truth labels.
arXiv Detail & Related papers (2020-07-10T10:59:16Z)
- Cascaded deep monocular 3D human pose estimation with evolutionary training data [76.3478675752847]
Deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation.
This paper proposes a novel data augmentation method that scales to massive amounts of training data.
Our method synthesizes unseen 3D human skeletons based on a hierarchical human representation and heuristics inspired by prior knowledge.
arXiv Detail & Related papers (2020-06-14T03:09:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
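The Online Knowledge Distillation entry in the list above mentions a pixel-wise Kullback-Leibler divergence between target and predicted heatmaps. A minimal sketch of such a measure follows. It assumes each heatmap is turned into a probability distribution over its pixels with a softmax and that the divergence is taken as KL(target || predicted); both choices, and the function names, are assumptions for illustration rather than details confirmed by that paper's summary.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of numbers."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    s = sum(exps)
    return [e / s for e in exps]

def heatmap_kl(target, predicted):
    """KL(target || predicted) over the pixels of one heatmap.

    Both heatmaps are lists of rows and are normalized with a softmax
    before comparison (an assumption of this sketch).
    """
    flat_t = [v for row in target for v in row]
    flat_p = [v for row in predicted for v in row]
    if len(flat_t) != len(flat_p):
        raise ValueError("heatmaps must have the same number of pixels")
    p = softmax(flat_t)
    q = softmax(flat_p)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy 2x2 heatmaps (made-up values): the prediction peaks at the wrong pixel.
target_map = [[0.0, 2.0], [0.0, 0.0]]
predicted_map = [[2.0, 0.0], [0.0, 0.0]]

print(heatmap_kl(target_map, target_map))
print(heatmap_kl(target_map, predicted_map))
```

The divergence is zero when the two heatmaps agree and grows as the predicted peak drifts away from the target peak.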