A comprehensive framework for occluded human pose estimation
- URL: http://arxiv.org/abs/2401.00155v2
- Date: Tue, 9 Jan 2024 07:22:13 GMT
- Title: A comprehensive framework for occluded human pose estimation
- Authors: Linhao Xu, Lin Zhao, Xinxin Sun, Di Wang, Guangyu Li, Kedong Yan
- Abstract summary: Occlusion presents a significant challenge in human pose estimation.
We propose DAG (Data, Attention, Graph) to address the performance degradation caused by occlusion in human pose estimation.
We also present the Feature-Guided Multi-Hop GCN (FGMP-GCN) to fully explore the prior knowledge of body structure and improve pose estimation results.
- Score: 10.92234109536279
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Occlusion presents a significant challenge in human pose estimation. The
challenges posed by occlusion can be attributed to the following factors: 1)
Data: The collection and annotation of occluded human pose samples are
relatively challenging. 2) Feature: Occlusion can cause feature confusion due
to the high similarity between the target person and interfering individuals.
3) Inference: Robust inference becomes challenging due to the loss of complete
body structural information. The existing methods designed for occluded human
pose estimation usually focus on addressing only one of these factors. In this
paper, we propose a comprehensive framework DAG (Data, Attention, Graph) to
address the performance degradation caused by occlusion. Specifically, we
introduce a mask-joints-with-instance-paste data augmentation technique to
simulate occlusion scenarios. Additionally, an Adaptive Discriminative
Attention Module (ADAM) is proposed to effectively enhance the features of
target individuals. Furthermore, we present the Feature-Guided Multi-Hop GCN
(FGMP-GCN) to fully explore the prior knowledge of body structure and improve
pose estimation results. Through extensive experiments conducted on three
benchmark datasets for occluded human pose estimation, we demonstrate that the
proposed method outperforms existing methods. Code and data will be publicly
available.
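The abstract does not say how FGMP-GCN defines or weights its hops, so the following is only a minimal sketch of the general idea behind multi-hop neighborhoods on a skeleton graph, not the authors' implementation. The five-joint toy skeleton and the names `build_adjacency` and `k_hop_neighbors` are hypothetical and chosen purely for illustration.

```python
from collections import deque

# Hypothetical toy skeleton (not the paper's joint set).
# 0=neck, 1=shoulder, 2=elbow, 3=wrist, 4=head (attached to the neck).
EDGES = [(0, 1), (1, 2), (2, 3), (0, 4)]


def build_adjacency(edges, num_joints):
    """Return an undirected adjacency list for the skeleton."""
    adj = [[] for _ in range(num_joints)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def k_hop_neighbors(adj, source, max_hops):
    """Group joints by shortest-path distance (1..max_hops) from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the requested hop count
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    hops = {h: [] for h in range(1, max_hops + 1)}
    for joint, d in dist.items():
        if d > 0:
            hops[d].append(joint)
    return {h: sorted(joints) for h, joints in hops.items()}


adj = build_adjacency(EDGES, num_joints=5)
wrist_hops = k_hop_neighbors(adj, source=3, max_hops=3)
print(wrist_hops)  # {1: [2], 2: [1], 3: [0]}
```

In this toy setup, an occluded wrist could in principle draw on the elbow (1 hop), shoulder (2 hops), and neck (3 hops). A multi-hop graph network would aggregate features from each of these groups rather than from direct neighbors alone.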
Related papers
- Occluded Human Pose Estimation based on Limb Joint Augmentation [14.36131862057872]
We propose an occluded human pose estimation framework based on limb joint augmentation to enhance the generalization ability of the pose estimation model on the occluded human bodies.
To further enhance the localization ability of the model, this paper constructs a dynamic structure loss function based on limb graphs to explore the distribution of occluded joints.
arXiv Detail & Related papers (2024-10-13T15:48:24Z)
- DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery [71.6345505427213]
DPMesh is an innovative framework for occluded human mesh recovery.
It capitalizes on the profound diffusion prior about object structure and spatial relationships embedded in a pre-trained text-to-image diffusion model.
arXiv Detail & Related papers (2024-04-01T18:59:13Z)
- 3D Human Pose Analysis via Diffusion Synthesis [65.268245109828]
PADS represents the first diffusion-based framework for tackling general 3D human pose analysis within the inverse problem framework.
Its performance has been validated on different benchmarks, signaling the adaptability and robustness of this pipeline.
arXiv Detail & Related papers (2024-01-17T02:59:34Z)
- Feature Completion Transformer for Occluded Person Re-identification [25.159974510754992]
Occluded person re-identification (Re-ID) is a challenging problem due to the destruction of occluders.
We propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.
FCFormer achieves superior performance and outperforms the state-of-the-art methods by significant margins on occluded datasets.
arXiv Detail & Related papers (2023-03-03T01:12:57Z)
- Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation [33.86986028882488]
Occlusion poses a great threat to monocular multi-person 3D human pose estimation due to large variability in terms of the shape, appearance, and position of occluders.
Existing methods try to handle occlusion with pose priors/constraints, data augmentation, or implicit reasoning.
We develop a method to explicitly model this process that significantly improves bottom-up multi-person human pose estimation.
arXiv Detail & Related papers (2022-07-29T22:12:50Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Generative Partial Visual-Tactile Fused Object Clustering [81.17645983141773]
We propose a Generative Partial Visual-Tactile Fused (i.e., GPVTF) framework for object clustering.
A conditional cross-modal clustering generative adversarial network is then developed to synthesize one modality conditioning on the other modality.
To this end, two pseudo-label-based KL-divergence losses are employed to update the corresponding modality-specific encoders.
arXiv Detail & Related papers (2020-12-28T02:37:03Z)
- AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild [77.43884383743872]
We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
arXiv Detail & Related papers (2020-10-26T03:19:46Z)
- Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry [62.29762409558553]
Epipolar constraints are at the core of feature matching and depth estimation in multi-person 3D human pose estimation methods.
Despite the satisfactory performance of this formulation in sparser crowd scenes, its effectiveness is frequently challenged under denser crowd circumstances.
In this paper, we depart from the multi-person 3D pose estimation formulation, and instead reformulate it as crowd pose estimation.
arXiv Detail & Related papers (2020-07-21T17:59:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.