ZoomNAS: Searching for Whole-body Human Pose Estimation in the Wild
- URL: http://arxiv.org/abs/2208.11547v1
- Date: Tue, 23 Aug 2022 16:33:57 GMT
- Title: ZoomNAS: Searching for Whole-body Human Pose Estimation in the Wild
- Authors: Lumin Xu, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
- Abstract summary: We propose a single-network approach, termed ZoomNet, to take into account the hierarchical structure of the full human body.
We also propose a neural architecture search framework, termed ZoomNAS, to promote both the accuracy and efficiency of whole-body pose estimation.
To train and evaluate ZoomNAS, we introduce the first large-scale 2D human whole-body dataset, namely COCO-WholeBody V1.0.
- Score: 97.0025378036642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the task of 2D whole-body human pose estimation,
which aims to localize dense landmarks on the entire human body including body,
feet, face, and hands. We propose a single-network approach, termed ZoomNet, to
take into account the hierarchical structure of the full human body and solve
the scale variation of different body parts. We further propose a neural
architecture search framework, termed ZoomNAS, to promote both the accuracy and
efficiency of whole-body pose estimation. ZoomNAS jointly searches the model
architecture and the connections between different sub-modules, and
automatically allocates computational complexity for searched sub-modules. To
train and evaluate ZoomNAS, we introduce the first large-scale 2D human
whole-body dataset, namely COCO-WholeBody V1.0, which annotates 133 keypoints
for in-the-wild images. Extensive experiments demonstrate the effectiveness of
ZoomNAS and the significance of COCO-WholeBody V1.0.
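The 133-keypoint layout mentioned in the abstract can be illustrated with a small sketch. The abstract only gives the total and the part names (body, feet, face, hands); the per-part counts and their ordering below are an assumption taken from the publicly documented COCO-WholeBody annotation convention, and the helper names (`part_slices`, `split_keypoints`) are hypothetical, not part of ZoomNet or ZoomNAS.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed per-part keypoint counts (COCO-WholeBody convention; not stated in the abstract).
PART_COUNTS: Dict[str, int] = {
    "body": 17,
    "feet": 6,
    "face": 68,
    "left_hand": 21,
    "right_hand": 21,
}


def part_slices(counts: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Map each part name to a half-open [start, end) index range."""
    slices: Dict[str, Tuple[int, int]] = {}
    start = 0
    for name, n in counts.items():
        slices[name] = (start, start + n)
        start += n
    return slices


SLICES = part_slices(PART_COUNTS)
TOTAL_KEYPOINTS = sum(PART_COUNTS.values())


def split_keypoints(keypoints: Sequence) -> Dict[str, List]:
    """Split a flat whole-body keypoint sequence into per-part lists."""
    if len(keypoints) != TOTAL_KEYPOINTS:
        raise ValueError(
            f"expected {TOTAL_KEYPOINTS} keypoints, got {len(keypoints)}"
        )
    return {name: list(keypoints[a:b]) for name, (a, b) in SLICES.items()}
```

Splitting a flat prediction this way makes the scale variation across parts easy to see: the face and hand groups hold most of the landmarks but typically cover far fewer pixels than the body group.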
Related papers
- WHAC: World-grounded Humans and Cameras [37.877565981937586]
We aim to recover expressive parametric human models (i.e., SMPL-X) and corresponding camera poses jointly.
We introduce a novel framework, referred to as WHAC, to facilitate world-grounded expressive human pose and shape estimation.
We present a new synthetic dataset, WHAC-A-Mole, which includes accurately annotated humans and cameras.
arXiv Detail & Related papers (2024-03-19T17:58:02Z)
- MUC: Mixture of Uncalibrated Cameras for Robust 3D Human Body Reconstruction [12.942635715952525]
Multiple cameras can provide comprehensive multi-view video coverage of a person.
Previous studies have overlooked the challenges posed by self-occlusion under multiple views.
We introduce a method to reconstruct the 3D human body from multiple uncalibrated camera views.
arXiv Detail & Related papers (2024-03-08T05:03:25Z)
- Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured f-AHOI dataset consisting of 16.2 hours of versatile interactions.
CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process.
By learning the geometrical relationships in HOI, we devise the very first model that leverages human pose estimation.
arXiv Detail & Related papers (2022-12-20T19:50:54Z)
- Higher-Order Implicit Fairing Networks for 3D Human Pose Estimation [1.1501261942096426]
We introduce a higher-order graph convolutional framework with initial residual connections for 2D-to-3D pose estimation.
Our model is able to capture the long-range dependencies between body joints.
Experiments and ablation studies conducted on two standard benchmarks demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-11-01T13:48:55Z)
- Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages, i.e. person localization and pose estimation.
We also propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z)
- Pose-based Modular Network for Human-Object Interaction Detection [5.6397911482914385]
We contribute a Pose-based Modular Network (PMN) which explores the absolute pose features and relative spatial pose features to improve HOI detection.
To evaluate our proposed method, we combine the module with the state-of-the-art model named VS-GATs and obtain significant improvement on two public benchmarks.
arXiv Detail & Related papers (2020-08-05T10:56:09Z)
- Whole-Body Human Pose Estimation in the Wild [88.09875133989155]
COCO-WholeBody extends the COCO dataset with whole-body annotations.
It is the first benchmark that has manual annotations on the entire human body.
A single-network model, named ZoomNet, is devised to take into account the hierarchical structure of the full human body.
arXiv Detail & Related papers (2020-07-23T08:35:26Z)
- HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization [83.57863764231655]
We propose the Human Depth Estimation Network (HDNet), an end-to-end framework for absolute root joint localization.
A skeleton-based Graph Neural Network (GNN) is utilized to propagate features among joints.
We evaluate our HDNet on the root joint localization and root-relative 3D pose estimation tasks with two benchmark datasets.
arXiv Detail & Related papers (2020-07-17T12:44:23Z)
- Anatomy-aware 3D Human Pose Estimation with Bone-based Pose Decomposition [92.99291528676021]
Instead of directly regressing the 3D joint locations, we decompose the task into bone direction prediction and bone length prediction.
Our motivation is the fact that the bone lengths of a human skeleton remain consistent across time.
Our full model outperforms the previous best results on Human3.6M and MPI-INF-3DHP datasets.
arXiv Detail & Related papers (2020-02-24T15:49:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.