Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
- URL: http://arxiv.org/abs/2109.05885v1
- Date: Mon, 13 Sep 2021 11:44:07 GMT
- Title: Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
- Authors: Size Wu, Sheng Jin, Wentao Liu, Lei Bai, Chen Qian, Dong Liu, Wanli
Ouyang
- Abstract summary: We decompose the task into two stages, i.e., person localization and pose estimation.
We propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on the CMU Panoptic and Shelf datasets.
- Score: 79.70127290464514
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies the task of estimating the 3D human poses of multiple
persons from multiple calibrated camera views. Following the top-down paradigm,
we decompose the task into two stages, i.e., person localization and pose
estimation. Both stages are processed in a coarse-to-fine manner, and we propose
three task-specific graph neural networks for effective message passing. For 3D
person localization, we first use the Multi-view Matching Graph Module (MMG) to
learn the cross-view association and recover coarse human proposals. The Center
Refinement Graph Module (CRG) further refines the results via flexible
point-based prediction. For 3D pose estimation, the Pose Regression Graph
Module (PRG) learns both the multi-view geometry and structural relations
between human joints. Our approach achieves state-of-the-art performance on the
CMU Panoptic and Shelf datasets with significantly lower computational complexity.
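To make the two-stage idea concrete, below is a minimal sketch (not the authors' MMG/CRG/PRG implementation) of how message passing over a cross-view graph of 2D person detections can produce features for associating the same person across calibrated views; the aggregation scheme, function names, and the greedy matching step are illustrative assumptions.

```python
import numpy as np

def cross_view_message_passing(node_feats, cross_view_edges, num_rounds=2):
    """Toy message passing over a graph whose nodes are per-view 2D person
    detections and whose edges connect candidate matches across views."""
    feats = node_feats.copy()
    neighbors = {i: [] for i in range(len(feats))}
    for i, j in cross_view_edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(num_rounds):
        updated = feats.copy()
        for i, nbrs in neighbors.items():
            if nbrs:
                # Blend each detection's feature with the mean of its
                # cross-view neighbours' features.
                updated[i] = 0.5 * (feats[i] + feats[nbrs].mean(axis=0))
        feats = updated
    return feats

def match_across_views(feats_a, feats_b):
    """Greedy cross-view association from cosine similarity of features."""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T  # (num_detections_a, num_detections_b) affinity matrix
    return [(i, int(sim[i].argmax())) for i in range(sim.shape[0])]

# Toy usage: 2 detections in view A (nodes 0-1), 2 in view B (nodes 2-3).
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 16))
feats = cross_view_message_passing(feats, [(0, 2), (0, 3), (1, 2), (1, 3)])
print(match_across_views(feats[:2], feats[2:]))
```

A real system would score edges with epipolar or appearance cues and resolve matches with a global assignment rather than a per-row argmax.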
Related papers
- Multi-hop graph transformer network for 3D human pose estimation [4.696083734269233]
We introduce a multi-hop graph transformer network designed for 2D-to-3D human pose estimation in videos.
The proposed network architecture consists of a graph attention block composed of stacked layers of multi-head self-attention and graph convolution with a learnable adjacency matrix.
The integration of dilated convolutional layers enhances the model's ability to handle spatial generalizations required for accurate localization of the human body joints.
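As a rough illustration of the described block (not the paper's code), the sketch below pairs multi-head self-attention over the joints with a graph convolution driven by a learnable adjacency matrix; the weight initialization, feature sizes, and residual combination are assumptions, and the multi-hop and dilated-convolution components are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_attention_block(x, adj_learned, num_heads=4):
    """Forward pass of a toy block pairing multi-head self-attention over the
    joints with a graph convolution that uses a learnable adjacency matrix.

    x:           (J, D) per-joint features for one frame (J joints).
    adj_learned: (J, J) learnable adjacency (here just a parameter array).
    """
    J, D = x.shape
    d_head = D // num_heads
    # Random projection weights stand in for learned parameters.
    Wq, Wk, Wv = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))
    q = (x @ Wq).reshape(J, num_heads, d_head)
    k = (x @ Wk).reshape(J, num_heads, d_head)
    v = (x @ Wv).reshape(J, num_heads, d_head)
    # Scaled dot-product attention per head: scores have shape (heads, J, J).
    scores = np.einsum('ihd,jhd->hij', q, k) / np.sqrt(d_head)
    attn_out = np.einsum('hij,jhd->ihd', softmax(scores), v).reshape(J, D)
    # Graph convolution with the row-normalised learnable adjacency.
    A = softmax(adj_learned, axis=-1)
    Wg = rng.standard_normal((D, D)) * 0.1
    gcn_out = A @ x @ Wg
    # Residual combination of the two branches.
    return x + attn_out + gcn_out

x = rng.standard_normal((17, 64))          # 17 COCO-style joints
adj = rng.standard_normal((17, 17))        # learnable adjacency (toy init)
print(graph_attention_block(x, adj).shape) # (17, 64)
```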
arXiv Detail & Related papers (2024-05-05T21:29:20Z)
- Iterative Graph Filtering Network for 3D Human Pose Estimation [5.177947445379688]
Graph convolutional networks (GCNs) have proven to be an effective approach for 3D human pose estimation.
In this paper, we introduce an iterative graph filtering framework for 3D human pose estimation.
Our approach builds upon the idea of iteratively solving graph filtering with Laplacian regularization.
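A minimal numerical sketch of the underlying idea, assuming the standard Laplacian-regularized objective ||x - y||^2 + lam * x^T L x, whose minimizer (I + lam*L)^(-1) y can be approached by a simple fixed-point iteration; this illustrates the filtering view only, not the paper's network.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^(-1/2) A D^(-1/2)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def laplacian_regularized_filter(y, adj, lam=0.3, num_iters=50):
    """Iteratively approximate x* = (I + lam*L)^(-1) y, the minimizer of
    ||x - y||^2 + lam * x^T L x, using the fixed-point update x <- y - lam*L*x.
    With the normalized Laplacian (eigenvalues in [0, 2]) this converges
    whenever lam < 0.5."""
    L = normalized_laplacian(adj)
    x = y.copy()
    for _ in range(num_iters):
        x = y - lam * (L @ x)
    return x

# Toy skeleton: 5 joints on a chain, noisy 3D coordinates to be smoothed.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
noisy = np.random.default_rng(0).standard_normal((5, 3))
smoothed = laplacian_regularized_filter(noisy, adj)
closed_form = np.linalg.solve(np.eye(5) + 0.3 * normalized_laplacian(adj), noisy)
print(np.allclose(smoothed, closed_form, atol=1e-6))  # True
```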
arXiv Detail & Related papers (2023-07-29T20:46:44Z)
- Direct Multi-view Multi-person 3D Pose Estimation [138.48139701871213]
We present Multi-view Pose transformer (MvP) for estimating multi-person 3D poses from multi-view images.
MvP directly regresses the multi-person 3D poses in a clean and efficient way, without relying on intermediate tasks.
We show experimentally that our MvP model outperforms the state-of-the-art methods on several benchmarks while being much more efficient.
arXiv Detail & Related papers (2021-11-07T13:09:20Z)
- 3D Human Pose Regression using Graph Convolutional Network [68.8204255655161]
We propose a graph convolutional network named PoseGraphNet for 3D human pose regression from 2D poses.
Our model's performance is close to the state-of-the-art, but with much fewer parameters.
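For intuition about GCN-based 2D-to-3D regression (a generic sketch, not PoseGraphNet itself), the toy forward pass below propagates 2D joint coordinates over a normalized skeleton adjacency and maps them to 3D estimates; the layer sizes and random weights are placeholders for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

def gcn_layer(x, adj_norm, weight):
    """One graph-convolution layer: neighborhood aggregation + linear map + ReLU."""
    return np.maximum(adj_norm @ x @ weight, 0.0)

def lift_2d_to_3d(joints_2d, adj):
    """Toy 2D-to-3D lifting with two GCN layers and a linear output head.
    joints_2d: (J, 2) image-space joint coordinates; returns (J, 3)."""
    # Row-normalized adjacency with self-loops.
    A = adj + np.eye(len(adj))
    A = A / A.sum(axis=1, keepdims=True)
    w1 = rng.standard_normal((2, 64)) * 0.1
    w2 = rng.standard_normal((64, 64)) * 0.1
    w_out = rng.standard_normal((64, 3)) * 0.1
    h = gcn_layer(joints_2d, A, w1)
    h = gcn_layer(h, A, w2)
    return A @ h @ w_out  # (J, 3) root-relative 3D joint estimates

# Hypothetical 5-joint skeleton given by its edge list.
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
adj = np.zeros((5, 5))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
print(lift_2d_to_3d(rng.standard_normal((5, 2)), adj).shape)  # (5, 3)
```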
arXiv Detail & Related papers (2021-05-21T14:41:31Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
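The following is a simplified, hypothetical sketch of the plane-sweep idea for a single joint: back-project the reference-view detection onto a set of depth planes, reproject each hypothesis into another calibrated view, and keep the depth with the smallest reprojection error. The camera conventions and brute-force search are assumptions, not the paper's implementation.

```python
import numpy as np

def backproject(pix, depth, K, R, t):
    """Back-project pixel (u, v) at a given depth to a 3D world point.
    Camera model: x_cam = R @ X_world + t,  pixel ~ K @ x_cam."""
    ray_cam = np.linalg.inv(K) @ np.array([pix[0], pix[1], 1.0])
    return R.T @ (depth * ray_cam - t)

def project(X_world, K, R, t):
    """Project a 3D world point into a camera; returns (u, v)."""
    uv = K @ (R @ X_world + t)
    return uv[:2] / uv[2]

def plane_sweep_depth(pix_ref, cams_ref, pix_target, cams_target,
                      depths=np.linspace(1.0, 8.0, 64)):
    """Score candidate depth planes for a joint detected at pix_ref in the
    reference view by projecting each hypothesis into a second view and
    measuring the distance to the 2D detection pix_target there."""
    K_r, R_r, t_r = cams_ref
    K_t, R_t, t_t = cams_target
    errors = []
    for d in depths:
        X = backproject(pix_ref, d, K_r, R_r, t_r)
        errors.append(np.linalg.norm(project(X, K_t, R_t, t_t) - pix_target))
    return depths[int(np.argmin(errors))]

# Toy setup: two cameras sharing intrinsics, second one shifted along x.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
cam0 = (K, np.eye(3), np.zeros(3))
cam1 = (K, np.eye(3), np.array([-0.5, 0.0, 0.0]))
X_true = np.array([0.2, -0.1, 3.0])
d = plane_sweep_depth(project(X_true, *cam0), cam0, project(X_true, *cam1), cam1)
print(round(float(d), 2))  # close to the true depth of 3.0
```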
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
- Graph Stacked Hourglass Networks for 3D Human Pose Estimation [1.0660480034605242]
We propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks.
The proposed architecture consists of repeated encoder-decoder blocks, in which graph-structured features are processed across three different scales of human skeletal representations.
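To illustrate processing across skeletal scales (a toy sketch, not the paper's architecture), the snippet below pools COCO-style joint features into a handful of coarser body-part features and unpools them back; the joint-to-part grouping is an illustrative assumption.

```python
import numpy as np

# Hypothetical mapping from 17 COCO joints to 5 coarser body parts;
# rows are parts, columns are joints.
JOINT_TO_PART = np.zeros((5, 17))
PART_MEMBERS = {0: [0, 1, 2, 3, 4],                 # head (nose, eyes, ears)
                1: [5, 7, 9], 2: [6, 8, 10],        # left / right arm
                3: [11, 13, 15], 4: [12, 14, 16]}   # left / right leg
for part, joints in PART_MEMBERS.items():
    JOINT_TO_PART[part, joints] = 1.0

def pool_joints_to_parts(joint_feats):
    """Average joint features into part features (coarser skeletal scale)."""
    counts = JOINT_TO_PART.sum(axis=1, keepdims=True)
    return (JOINT_TO_PART @ joint_feats) / counts

def unpool_parts_to_joints(part_feats):
    """Broadcast part features back to their member joints (finer scale)."""
    return JOINT_TO_PART.T @ part_feats

feats = np.random.default_rng(0).standard_normal((17, 32))
parts = pool_joints_to_parts(feats)          # (5, 32) coarse representation
print(unpool_parts_to_joints(parts).shape)   # (17, 32)
```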
arXiv Detail & Related papers (2021-03-30T14:25:43Z)
- Self-Supervised Multi-View Synchronization Learning for 3D Pose Estimation [39.334995719523]
Current methods cast monocular 3D human pose estimation as a learning problem by training neural networks on large data sets of images and corresponding skeleton poses.
We propose an approach that can exploit small annotated data sets by fine-tuning networks pre-trained via self-supervised learning on (large) unlabeled data sets.
We demonstrate the effectiveness of the synchronization task on the Human3.6M data set and achieve state-of-the-art results in 3D human pose estimation.
arXiv Detail & Related papers (2020-10-13T08:01:24Z)
- Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation [52.94078950641959]
We present a deployment-friendly, fast bottom-up framework for multi-person 3D human pose estimation.
We adopt a novel neural representation of multi-person 3D pose which unifies the position of person instances with their corresponding 3D pose representation.
We propose a practical deployment paradigm where paired 2D or 3D pose annotations are unavailable.
arXiv Detail & Related papers (2020-08-04T07:54:25Z)
- Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach [76.10879433430466]
We propose to estimate 3D human pose from multi-view images and a few IMUs attached to the person's limbs.
It operates by first detecting 2D poses from the two signals and then lifting them to 3D space.
The simple two-step approach reduces the error of the state-of-the-art by a large margin on a public dataset.
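The paper's lifting step relies on its own geometric model; as a generic illustration of lifting multi-view 2D detections to 3D, here is a standard direct-linear-transform (DLT) triangulation sketch for one joint, with the toy cameras as the only assumptions.

```python
import numpy as np

def triangulate(points_2d, projections):
    """Linear (DLT) triangulation of one joint from several calibrated views.

    points_2d:   list of (u, v) observations of the same joint.
    projections: list of 3x4 camera projection matrices P = K [R | t].
    Returns the 3D point minimizing the algebraic reprojection error.
    """
    rows = []
    for (u, v), P in zip(points_2d, projections):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # Solution is the right singular vector with the smallest singular value.
    X_h = np.linalg.svd(A)[2][-1]
    return X_h[:3] / X_h[3]

# Toy check with two cameras and a known 3D point.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P1 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
X = np.array([0.3, -0.2, 2.5, 1.0])
obs = [(P @ X)[:2] / (P @ X)[2] for P in (P0, P1)]
print(np.allclose(triangulate(obs, [P0, P1]), X[:3]))  # True
```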
arXiv Detail & Related papers (2020-03-25T00:26:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.