X-HRNet: Towards Lightweight Human Pose Estimation with Spatially
Unidimensional Self-Attention
- URL: http://arxiv.org/abs/2310.08042v1
- Date: Thu, 12 Oct 2023 05:33:25 GMT
- Title: X-HRNet: Towards Lightweight Human Pose Estimation with Spatially
Unidimensional Self-Attention
- Authors: Yixuan Zhou, Xuanhan Wang, Xing Xu, Lei Zhao, Jingkuan Song
- Abstract summary: In particular, predominant pose estimation methods estimate human joints by 2D single-peak heatmaps.
We introduce a lightweight and powerful alternative, Spatially Unidimensional Self-Attention (SUSA), to the pointwise (1x1) convolution.
Our SUSA reduces the computational complexity of the pointwise (1x1) convolution by 96% without sacrificing accuracy.
- Score: 63.64944381130373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-resolution representation is necessary for human pose estimation to
achieve high performance, and the ensuing problem is high computational
complexity. In particular, predominant pose estimation methods estimate human
joints by 2D single-peak heatmaps. Each 2D heatmap can be horizontally and
vertically projected to and reconstructed by a pair of 1D heat vectors.
Inspired by this observation, we introduce a lightweight and powerful
alternative, Spatially Unidimensional Self-Attention (SUSA), to the pointwise
(1x1) convolution that is the main computational bottleneck in the depthwise
separable 3x3 convolution. Our SUSA reduces the computational complexity of the
pointwise (1x1) convolution by 96% without sacrificing accuracy. Furthermore,
we use the SUSA as the main module to build our lightweight pose estimation
backbone X-HRNet, where `X' represents the estimated cross-shape attention
vectors. Extensive experiments on the COCO benchmark demonstrate the
superiority of our X-HRNet, and comprehensive ablation studies show the
effectiveness of the SUSA modules. The code is publicly available at
https://github.com/cool-xuan/x-hrnet.
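The abstract's key observation, that a 2D single-peak heatmap can be projected onto a pair of 1D heat vectors and reconstructed from them, can be checked with a short NumPy sketch. The Gaussian heatmap builder, the max-projection along each axis, and the outer-product reconstruction below are illustrative assumptions for this page only; in the paper the idea is realized as the SUSA attention module rather than an explicit reconstruction.

```python
import numpy as np

def gaussian_heatmap(h, w, cy, cx, sigma=2.0):
    """Build a 2D single-peak (Gaussian) heatmap, the usual encoding of one joint."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def project_to_1d(heatmap):
    """Project the 2D heatmap onto its two axes, giving a pair of 1D heat vectors."""
    vec_y = heatmap.max(axis=1)  # vertical profile (one value per row)
    vec_x = heatmap.max(axis=0)  # horizontal profile (one value per column)
    return vec_y, vec_x

def reconstruct_from_1d(vec_y, vec_x):
    """Reconstruct a 2D heatmap from the two 1D vectors via an outer product."""
    return np.outer(vec_y, vec_x)

if __name__ == "__main__":
    hm = gaussian_heatmap(64, 48, cy=20, cx=30)
    vy, vx = project_to_1d(hm)
    rec = reconstruct_from_1d(vy, vx)
    # For a separable single-peak Gaussian the outer product reconstructs the
    # heatmap exactly, and the joint location is recovered from the two 1D argmaxes.
    print("peak (row, col):", int(vy.argmax()), int(vx.argmax()))
    print("max reconstruction error:", float(np.abs(hm - rec).max()))
```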
Related papers
- DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection [24.964926464973026]
Voxel-based methods have achieved promising results for multi-person 3D pose estimation from multiple cameras.
We present Faster VoxelPose to address the challenge by re-projecting the feature volume to the three two-dimensional coordinate planes.
The method is free from costly 3D-CNNs and improves the speed of VoxelPose by ten times (a sketch of the plane re-projection step follows this list).
arXiv Detail & Related papers (2022-07-22T09:10:01Z)
- Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation [35.765304656180355]
We study efficient architecture design for real-time multi-person pose estimation on edge devices.
We find that high-resolution multi-branch designs become redundant in the low-computation regime; inspired by this finding, we design LitePose, an efficient single-branch architecture for pose estimation.
We introduce two simple approaches to enhance the capacity of LitePose, including Fusion Deconv Head and Large Kernel Convs.
arXiv Detail & Related papers (2022-05-03T02:08:04Z)
- Higher-Order Implicit Fairing Networks for 3D Human Pose Estimation [1.1501261942096426]
We introduce a higher-order graph convolutional framework with initial residual connections for 2D-to-3D pose estimation.
Our model is able to capture the long-range dependencies between body joints.
Experiments and ablation studies conducted on two standard benchmarks demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-11-01T13:48:55Z)
- HandVoxNet++: 3D Hand Shape and Pose Estimation using Voxel-Based Neural Networks [71.09275975580009]
HandVoxNet++ is a voxel-based deep network with 3D and graph convolutions trained in a fully supervised manner.
HandVoxNet++ relies on two hand shape representations. The first one is the 3D voxelized grid of hand shape, which does not preserve the mesh topology; the second is the hand surface, which does.
We combine the advantages of both representations by aligning the hand surface to the voxelized hand shape, either with a new neural Graph-Convolutions-based Mesh Registration (GCN-MeshReg) or with the classical segment-wise Non-Rigid Gravitational Approach (NRGA++).
arXiv Detail & Related papers (2021-07-02T17:59:54Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
- Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry [62.29762409558553]
Epipolar constraints are at the core of feature matching and depth estimation in multi-person 3D human pose estimation methods.
Although this formulation performs satisfactorily in sparser crowd scenes, its effectiveness is frequently challenged in denser crowds.
In this paper, we depart from the multi-person 3D pose estimation formulation, and instead reformulate it as crowd pose estimation.
arXiv Detail & Related papers (2020-07-21T17:59:36Z)
- HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization [83.57863764231655]
We propose the Human Depth Estimation Network (HDNet), an end-to-end framework for absolute root joint localization.
A skeleton-based Graph Neural Network (GNN) is utilized to propagate features among joints.
We evaluate our HDNet on the root joint localization and root-relative 3D pose estimation tasks with two benchmark datasets.
arXiv Detail & Related papers (2020-07-17T12:44:23Z)
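Following up on the Faster VoxelPose entry above: a minimal NumPy sketch of re-projecting a voxel feature volume onto the three two-dimensional coordinate planes, the step that lets 2D networks replace costly 3D-CNNs. The (C, X, Y, Z) layout and the max-pooling used here are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def reproject_volume(feature_volume):
    """Collapse a (C, X, Y, Z) voxel feature volume onto the xy, xz, and yz planes
    by pooling along the remaining axis, so 2D CNNs can process each plane."""
    plane_xy = feature_volume.max(axis=3)  # pool over Z -> (C, X, Y)
    plane_xz = feature_volume.max(axis=2)  # pool over Y -> (C, X, Z)
    plane_yz = feature_volume.max(axis=1)  # pool over X -> (C, Y, Z)
    return plane_xy, plane_xz, plane_yz

if __name__ == "__main__":
    vol = np.random.rand(32, 80, 80, 20)   # hypothetical channels x X x Y x Z grid
    xy, xz, yz = reproject_volume(vol)
    print(xy.shape, xz.shape, yz.shape)    # (32, 80, 80) (32, 80, 20) (32, 80, 20)
```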