3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information
- URL: http://arxiv.org/abs/2406.01196v1
- Date: Mon, 3 Jun 2024 10:59:00 GMT
- Title: 3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information
- Authors: Sihan Wen, Xiantan Zhu, Zhiming Tan
- Abstract summary: A novel Semantic Graph Attention Network can benefit from the ability of self-attention to capture global context.
A Body Part Decoder assists in extracting and refining the information related to specific segments of the body.
A Geometry Loss imposes a critical constraint on the structural skeleton of the body, ensuring that the model's predictions adhere to the natural limits of human posture.
- Score: 2.457872341625575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, a plethora of diverse methods have been proposed for 3D pose estimation. Among these, self-attention mechanisms and graph convolutions have both proven to be effective and practical. Recognizing the strengths of these two techniques, we have developed a novel Semantic Graph Attention Network that benefits from the ability of self-attention to capture global context while also using graph convolutions to handle the local connectivity and structural constraints of the skeleton. We also design a Body Part Decoder that assists in extracting and refining the information related to specific segments of the body. Furthermore, our approach incorporates Distance Information, enhancing our model's capability to comprehend and accurately predict spatial relationships. Finally, we introduce a Geometry Loss that imposes a critical constraint on the structural skeleton of the body, ensuring that the model's predictions adhere to the natural limits of human posture. The experimental results validate the effectiveness of our approach, demonstrating that every element within the system is essential for improving pose estimation outcomes. Compared with the state of the art, the proposed work not only meets but exceeds the existing benchmarks.
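The abstract does not give the exact form of the Geometry Loss or of the Distance Information, so the following is only a minimal sketch under stated assumptions: the loss is taken to be the mean absolute difference between predicted and reference bone lengths, and the distance information is taken to be the matrix of pairwise Euclidean distances between keypoints. The five-joint skeleton `BONES` and the function names are hypothetical and do not come from the paper.

```python
import math

# Hypothetical 5-joint skeleton as (parent, child) index pairs.
# This topology is an illustration only, not the paper's whole-body skeleton.
BONES = [(0, 1), (1, 2), (0, 3), (3, 4)]


def bone_lengths(joints, bones=BONES):
    """Return the Euclidean length of each bone for a list of 3D joints."""
    return [math.dist(joints[p], joints[c]) for p, c in bones]


def geometry_loss(pred, target, bones=BONES):
    """Assumed loss: mean absolute difference between bone lengths."""
    pred_len = bone_lengths(pred, bones)
    target_len = bone_lengths(target, bones)
    return sum(abs(a - b) for a, b in zip(pred_len, target_len)) / len(bones)


def pairwise_distances(keypoints):
    """Assumed distance feature: J x J matrix of keypoint distances."""
    return [[math.dist(a, b) for b in keypoints] for a in keypoints]


if __name__ == "__main__":
    # Reference pose in which every bone has unit length.
    target = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (0, 1, 0), (0, 2, 0)]
    # Prediction that stretches the bone between joints 1 and 2.
    pred = [(0, 0, 0), (1, 0, 0), (3, 0, 0), (0, 1, 0), (0, 2, 0)]
    print(geometry_loss(pred, target))
    print(pairwise_distances(target)[0])
```

A penalty of this kind is zero whenever the predicted skeleton preserves the reference bone lengths, regardless of the pose itself, which is one simple way to discourage anatomically implausible predictions.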
Related papers
- Hand-object reconstruction via interaction-aware graph attention mechanism [25.396356108313178]
Estimating the poses of both a hand and an object has become an important area of research.
We propose a graph-based refinement method that incorporates an interaction-aware graph-attention mechanism.
Experiments demonstrate the effectiveness of our proposed method with notable improvements in the realm of physical plausibility.
arXiv Detail & Related papers (2024-09-26T08:23:04Z)
- Towards Robust and Expressive Whole-body Human Pose and Shape Estimation [51.457517178632756]
Whole-body pose and shape estimation aims to jointly predict different behaviors of the entire human body from a monocular image.
Existing methods often exhibit degraded performance under the complexity of in-the-wild scenarios.
We propose a novel framework to enhance the robustness of whole-body pose and shape estimation.
arXiv Detail & Related papers (2023-12-14T08:17:42Z)
- Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images [78.56114271538061]
We introduce an explicit point-based human reconstruction framework called HaP.
Our approach is characterized by fully explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space.
Our results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z)
- Spatio-temporal MLP-graph network for 3D human pose estimation [8.267311047244881]
Graph convolutional networks and their variants have shown significant promise in 3D human pose estimation.
We introduce a new weighted Jacobi feature rule obtained through graph filtering with implicit propagation fairing.
We also employ adjacency modulation with the aim of learning meaningful correlations beyond those defined between body joints.
arXiv Detail & Related papers (2023-08-29T14:00:55Z)
- 2D Human Pose Estimation: A Survey [16.56050212383859]
Human pose estimation aims at localizing human anatomical keypoints or body parts in the input data.
Deep learning techniques allow learning feature representations directly from the data.
In this paper, we review the recent achievements of 2D human pose estimation methods and present a comprehensive survey.
arXiv Detail & Related papers (2022-04-15T08:09:43Z)
- On Triangulation as a Form of Self-Supervision for 3D Human Pose Estimation [57.766049538913926]
Supervised approaches to 3D pose estimation from single images are remarkably effective when labeled data is abundant.
Much of the recent attention has shifted towards semi- and/or weakly supervised learning.
We propose to impose multi-view geometrical constraints by means of a differentiable triangulation and to use it as a form of self-supervision during training when no labels are available.
arXiv Detail & Related papers (2022-03-29T19:11:54Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z)
- Enhanced 3D Human Pose Estimation from Videos by using Attention-Based Neural Network with Dilated Convolutions [12.900524511984798]
We show a systematic design for how conventional networks and other forms of constraints can be incorporated into the attention framework.
We achieve this by adapting temporal receptive field via a multi-scale structure of dilated convolutions.
Our method achieves state-of-the-art performance and outperforms existing methods by reducing the mean per joint position error to 33.4 mm on the Human3.6M dataset.
arXiv Detail & Related papers (2021-03-04T17:26:51Z)
- Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions.
Our proposed model employs three consecutive differentiable transformations: forward-kinematics, camera-projection, and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z)
- Structured Landmark Detection via Topology-Adapting Deep Graph Learning [75.20602712947016]
We present a new topology-adapting deep graph learning approach for accurate anatomical facial and medical landmark detection.
The proposed method constructs graph signals leveraging both local image features and global shape features.
Experiments are conducted on three public facial image datasets (WFLW, 300W, and COFW-68) as well as three real-world X-ray medical datasets (Cephalometric (public), Hand, and Pelvis).
arXiv Detail & Related papers (2020-04-17T11:55:03Z)
- Unifying Graph Embedding Features with Graph Convolutional Networks for Skeleton-based Action Recognition [18.001693718043292]
We propose a novel framework, which unifies 15 graph embedding features into the graph convolutional network for human action recognition.
Our model is validated by three large-scale datasets, namely NTU-RGB+D, Kinetics and SYSU-3D.
arXiv Detail & Related papers (2020-03-06T02:31:26Z)