Conditional Directed Graph Convolution for 3D Human Pose Estimation
- URL: http://arxiv.org/abs/2107.07797v1
- Date: Fri, 16 Jul 2021 09:50:40 GMT
- Title: Conditional Directed Graph Convolution for 3D Human Pose Estimation
- Authors: Wenbo Hu, Changgong Zhang, Fangneng Zhan, Lei Zhang, Tien-Tsin Wong
- Abstract summary: Graph convolutional networks have significantly improved 3D human pose estimation by representing the human skeleton as an undirected graph.
This paper proposes to represent the human skeleton as a directed graph with the joints as nodes and bones as edges that are directed from parent joints to child joints.
- Score: 23.376538132362498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph convolutional networks have significantly improved 3D human pose
estimation by representing the human skeleton as an undirected graph. However,
this representation fails to reflect the articulated characteristic of human
skeletons as the hierarchical orders among the joints are not explicitly
presented. In this paper, we propose to represent the human skeleton as a
directed graph with the joints as nodes and bones as edges that are directed
from parent joints to child joints. By so doing, the directions of edges can
explicitly reflect the hierarchical relationships among the nodes. Based on
this representation, we adopt the spatial-temporal directed graph convolution
(ST-DGConv) to extract features from 2D poses represented in a temporal
sequence of directed graphs. We further propose a spatial-temporal conditional
directed graph convolution (ST-CondDGConv) to leverage varying non-local
dependence for different poses by conditioning the graph topology on input
poses. Altogether, we form a U-shaped network with ST-DGConv and ST-CondDGConv
layers, named U-shaped Conditional Directed Graph Convolutional Network
(U-CondDGCN), for 3D human pose estimation from monocular videos. To evaluate
the effectiveness of our U-CondDGCN, we conducted extensive experiments on two
challenging large-scale benchmarks: Human3.6M and MPI-INF-3DHP. Both
quantitative and qualitative results show that our method achieves top
performance. Also, ablation studies show that directed graphs can better
exploit the hierarchy of articulated human skeletons than undirected graphs,
and the conditional connections can yield adaptive graph topologies for
different kinds of poses.
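The directed-graph representation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the five-joint toy skeleton, the joint names, and the helper functions `directed_bones`, `incidence_matrices`, and `depth` are all assumptions made for illustration. It only shows how joints become nodes, how bones become edges directed from parent to child, and how the hierarchy level of each joint follows from those directions.

```python
# Hypothetical 5-joint toy skeleton. The joint names and parent indices are
# illustrative assumptions, not the skeleton used in the paper.
JOINTS = ["pelvis", "spine", "head", "l_hip", "r_hip"]
PARENTS = [-1, 0, 1, 0, 0]  # PARENTS[j] is the parent of joint j; -1 marks the root


def directed_bones(parents):
    """Return bones as (parent, child) edges, directed from parent to child."""
    return [(p, c) for c, p in enumerate(parents) if p >= 0]


def incidence_matrices(parents):
    """Build source and target incidence matrices of shape (num_joints, num_bones).

    source[j][b] == 1 if bone b leaves joint j (j is its parent end).
    target[j][b] == 1 if bone b enters joint j (j is its child end).
    """
    bones = directed_bones(parents)
    n = len(parents)
    source = [[0] * len(bones) for _ in range(n)]
    target = [[0] * len(bones) for _ in range(n)]
    for b, (p, c) in enumerate(bones):
        source[p][b] = 1
        target[c][b] = 1
    return bones, source, target


def depth(parents, j):
    """Hierarchy level of joint j: the number of bones between it and the root."""
    d = 0
    while parents[j] >= 0:
        j = parents[j]
        d += 1
    return d


bones, source, target = incidence_matrices(PARENTS)
depths = [depth(PARENTS, j) for j in range(len(JOINTS))]
print(bones)   # [(0, 1), (1, 2), (0, 3), (0, 4)]
print(depths)  # [0, 1, 2, 1, 1]
```

An undirected adjacency matrix would treat the edge between `pelvis` and `spine` the same in both directions. Keeping separate source and target matrices preserves which end of each bone is the parent, and that hierarchical order is the information the abstract says an undirected graph fails to present.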
Related papers
- Dynamic Dense Graph Convolutional Network for Skeleton-based Human Motion Prediction [14.825185477750479]
This paper presents a Dynamic Dense Graph Convolutional Network (DD-GCN) which constructs a dense graph and implements an integrated dynamic message passing.
Based on the dense graph, we propose a dynamic message passing framework that learns dynamically from data to generate distinctive messages.
Experiments on the Human3.6M and CMU Mocap benchmark datasets verify the effectiveness of our DD-GCN.
arXiv Detail & Related papers (2023-11-29T07:25:49Z)
- Iterative Graph Filtering Network for 3D Human Pose Estimation [5.177947445379688]
Graph convolutional networks (GCNs) have proven to be an effective approach for 3D human pose estimation.
In this paper, we introduce an iterative graph filtering framework for 3D human pose estimation.
Our approach builds upon the idea of iteratively solving graph filtering with Laplacian regularization.
arXiv Detail & Related papers (2023-07-29T20:46:44Z)
- Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction [120.08257447708503]
Graph convolutional network based methods that model the body-joints' relations have recently shown great promise in 3D skeleton-based human motion prediction.
We propose a novel skeleton-parted graph scattering network (SPGSN).
SPGSN outperforms state-of-the-art methods by remarkable margins of 13.8%, 9.3% and 2.7% in terms of 3D mean per joint position error (MPJPE) on Human3.6M, CMU Mocap and 3DPW datasets, respectively.
arXiv Detail & Related papers (2022-07-31T05:51:39Z)
- Hierarchical Graph Networks for 3D Human Pose Estimation [50.600944798627786]
Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton.
We argue that this skeletal topology is too sparse to reflect the body structure and suffers from a serious 2D-to-3D ambiguity problem.
We propose a novel graph convolution network architecture, Hierarchical Graph Networks, to overcome these weaknesses.
arXiv Detail & Related papers (2021-11-23T15:09:03Z)
- Joint 3D Human Shape Recovery from A Single Image with Bilayer-Graph [35.375489948345404]
We propose a dual-scale graph approach to estimate the 3D human shape and pose from images.
We use a coarse graph, derived from a dense graph, to estimate the human's 3D pose, and the dense graph to estimate the 3D shape.
We train our model end-to-end and show that we can achieve state-of-the-art results for several evaluation datasets.
arXiv Detail & Related papers (2021-10-16T05:04:02Z)
- Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages: person localization and pose estimation.
We propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z)
- 3D Human Pose Regression using Graph Convolutional Network [68.8204255655161]
We propose a graph convolutional network named PoseGraphNet for 3D human pose regression from 2D poses.
Our model's performance is close to the state of the art, but with far fewer parameters.
arXiv Detail & Related papers (2021-05-21T14:41:31Z)
- An Adversarial Human Pose Estimation Network Injected with Graph Structure [75.08618278188209]
In this paper, we design a novel generative adversarial network (GAN) to improve the localization accuracy of visible joints when some joints are invisible.
The network consists of two simple but efficient modules: Cascade Feature Network (CFN) and Graph Structure Network (GSN).
arXiv Detail & Related papers (2021-03-29T12:07:08Z)
- Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition [57.98278794950759]
Graph Convolutional Networks (GCNs) have already demonstrated their powerful ability to model irregular data.
We present a novel spatial-temporal GCN architecture which is defined via the Poincaré geometry.
We evaluate our method on two of the largest current 3D datasets.
arXiv Detail & Related papers (2020-07-30T18:23:18Z)
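
The SPGSN entry above reports its gains in mean per joint position error (MPJPE). As a rough illustration of that metric, the sketch below averages the Euclidean distance between predicted and ground-truth 3D joint positions for a single pose. The function name `mpjpe` and the toy coordinates are assumptions for illustration; published evaluation protocols also average over frames and may align the poses first, which this sketch does not do.

```python
import math


def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth joints.

    pred and gt are equal-length sequences of (x, y, z) joint positions
    for one pose, in the same units (commonly millimetres).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Toy two-joint pose: the first joint is exact, the second is off by 5 units.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]
print(mpjpe(pred, gt))  # 2.5
```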