Related papers: HTNet: Human Topology Aware Network for 3D Human Pose Estimation

URL: http://arxiv.org/abs/2302.09790v1
Date: Mon, 20 Feb 2023 06:31:29 GMT
Title: HTNet: Human Topology Aware Network for 3D Human Pose Estimation
Authors: Jialun Cai, Hong Liu, Runwei Ding, Wenhao Li, Jianbing Wu, Miaoju Ban
Abstract summary: 3D human pose estimation errors would propagate along the human body topology and accumulate at the end joints of limbs. We design an Intra-Part Constraint module that utilizes the parent nodes as the reference to build topological constraints for end joints at the part level. We propose a novel Human Topology aware Network (HTNet), which adopts a channel-split progressive strategy to sequentially learn the structural priors of the human topology.
Score: 12.120648336697592
License: http://creativecommons.org/licenses/by/4.0/
Abstract: 3D human pose estimation errors would propagate along the human body topology and accumulate at the end joints of limbs. Inspired by the backtracking mechanism in automatic control systems, we design an Intra-Part Constraint module that utilizes the parent nodes as the reference to build topological constraints for end joints at the part level. Further considering the hierarchy of the human topology, joint-level and body-level dependencies are captured via graph convolutional networks and self-attentions, respectively. Based on these designs, we propose a novel Human Topology aware Network (HTNet), which adopts a channel-split progressive strategy to sequentially learn the structural priors of the human topology from multiple semantic levels: joint, part, and body. Extensive experiments show that the proposed method improves the estimation accuracy by 18.7% on the end joints of limbs and achieves state-of-the-art results on Human3.6M and MPI-INF-3DHP datasets. Code is available at https://github.com/vefalun/HTNet.
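The abstract states that joint-level dependencies are captured with graph convolutional networks. As a rough illustration of that general idea only, and not HTNet's actual implementation (see the authors' repository for that), the sketch below applies one standard graph-convolution step, ReLU(Â H W) with a symmetrically normalized adjacency Â, to a toy skeleton. The 5-joint skeleton, the 2D coordinates, and the weight matrix are all made-up assumptions chosen for readability.

```python
import math


def normalized_adjacency(num_joints, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    # Start from the identity so every joint keeps its own features (self-loops).
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def graph_conv(features, adj_norm, weight):
    """One graph-convolution layer: ReLU(adj_norm @ features @ weight)."""
    mixed = matmul(adj_norm, features)   # each joint averages over its neighbours
    out = matmul(mixed, weight)          # shared linear map applied to every joint
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical toy skeleton: joint 0 is the root, with two 2-bone limbs.
EDGES = [(0, 1), (1, 2), (0, 3), (3, 4)]
ADJ = normalized_adjacency(5, EDGES)

# Made-up 2D joint coordinates (input features) and a made-up 2x3 weight matrix.
POSE_2D = [[0.0, 0.0], [0.1, -0.4], [0.1, -0.8], [-0.1, -0.4], [-0.1, -0.8]]
WEIGHT = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]

HIDDEN = graph_conv(POSE_2D, ADJ, WEIGHT)  # 5 joints x 3 hidden channels
```

In this toy graph the end joints (2 and 4) each have a single neighbour, their parent, so one layer only passes them information from that parent.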

Related papers

GRACE: Estimating Geometry-level 3D Human-Scene Contact from 2D Images [54.602947113980655]
Estimating the geometry level of human-scene contact aims to ground specific contact surface points at 3D human geometries. GRACE (Geometry-level Reasoning for 3D Human-scene Contact Estimation) is a new paradigm for 3D human contact estimation. It incorporates a point cloud encoder-decoder architecture along with a hierarchical feature extraction and fusion module.
arXiv Detail & Related papers (2025-05-10T09:25:46Z)
NanoHTNet: Nano Human Topology Network for Efficient 3D Human Pose Estimation [24.059039655555807]
3D human pose estimation (HPE) is limited by resource-constrained edge devices. We propose a Nano Human Topology Network (NanoHTNet) to capture explicit features. We also propose PoseCLR to align 2D poses from diverse viewpoints in a proxy task.
arXiv Detail & Related papers (2025-01-27T04:16:42Z)
StackFLOW: Monocular Human-Object Reconstruction by Stacked Normalizing Flow with Offset [56.71580976007712]
We propose to use the Human-Object Offset between anchors which are densely sampled from the surface of human mesh and object mesh to represent human-object spatial relation. Based on this representation, we propose Stacked Normalizing Flow (StackFLOW) to infer the posterior distribution of human-object spatial relations from the image. During the optimization stage, we finetune the human body pose and object 6D pose by maximizing the likelihood of samples.
arXiv Detail & Related papers (2024-07-30T04:57:21Z)
Disentangled Diffusion-Based 3D Human Pose Estimation with Hierarchical Spatial and Temporal Denoiser [8.759087891756069]
A Disentangled Diffusion-based 3D Human Pose Estimation method with Hierarchical Spatial and Temporal Denoiser is proposed, termed DDHPose. We disentangle the 3D pose and diffuse the bone length and bone direction during the forward process of the diffusion model. For the reverse process, we propose Hierarchical Spatial and Temporal Denoiser to improve the hierarchical modeling of each joint.
arXiv Detail & Related papers (2024-03-07T12:20:13Z)
Iterative Graph Filtering Network for 3D Human Pose Estimation [5.177947445379688]
Graph convolutional networks (GCNs) have proven to be an effective approach for 3D human pose estimation. In this paper, we introduce an iterative graph filtering framework for 3D human pose estimation. Our approach builds upon the idea of iteratively solving graph filtering with Laplacian regularization.
arXiv Detail & Related papers (2023-07-29T20:46:44Z)
Back to MLP: A Simple Baseline for Human Motion Prediction [59.18776744541904]
This paper tackles the problem of human motion prediction, which consists of forecasting future body poses from historically observed sequences. We show that the performance of existing approaches can be surpassed by a light-weight, MLP-based architecture with only 0.14M parameters. An exhaustive evaluation on Human3.6M, AMASS and 3DPW datasets shows that our method, which we dub siMLPe, consistently outperforms all other approaches.
arXiv Detail & Related papers (2022-07-04T16:35:58Z)
Hierarchical Graph Networks for 3D Human Pose Estimation [50.600944798627786]
Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton. We argue that this skeletal topology is too sparse to reflect the body structure and suffers from a serious 2D-to-3D ambiguity problem. We propose a novel graph convolution network architecture, Hierarchical Graph Networks, to overcome these weaknesses.
arXiv Detail & Related papers (2021-11-23T15:09:03Z)
Higher-Order Implicit Fairing Networks for 3D Human Pose Estimation [1.1501261942096426]
We introduce a higher-order graph convolutional framework with initial residual connections for 2D-to-3D pose estimation. Our model is able to capture the long-range dependencies between body joints. Experiments and ablation studies conducted on two standard benchmarks demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-11-01T13:48:55Z)
Learning Transferable Kinematic Dictionary for 3D Human Pose and Shape Reconstruction [15.586347115568973]
We propose a kinematic dictionary, which explicitly regularizes the solution space of relative 3D rotations of human joints. Our method achieves end-to-end 3D reconstruction without the need of using any shape annotations during the training of neural networks. The proposed method achieves competitive results on large-scale datasets including Human3.6M, MPI-INF-3DHP, and LSP.
arXiv Detail & Related papers (2021-04-02T09:24:29Z)
HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation [54.23770284299979]
This paper introduces a novel form of supervision - Hierarchical Multi-person Ordinal Relations (HMOR). HMOR encodes interaction information as the ordinal relations of depths and angles hierarchically. An integrated top-down model is designed to leverage these ordinal relations in the learning process. The proposed method significantly outperforms state-of-the-art methods on publicly available multi-person 3D pose datasets.
arXiv Detail & Related papers (2020-08-01T07:53:27Z)
HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization [83.57863764231655]
We propose the Human Depth Estimation Network (HDNet), an end-to-end framework for absolute root joint localization. A skeleton-based Graph Neural Network (GNN) is utilized to propagate features among joints. We evaluate our HDNet on the root joint localization and root-relative 3D pose estimation tasks with two benchmark datasets.
arXiv Detail & Related papers (2020-07-17T12:44:23Z)
Anatomy-aware 3D Human Pose Estimation with Bone-based Pose Decomposition [92.99291528676021]
Instead of directly regressing the 3D joint locations, we decompose the task into bone direction prediction and bone length prediction. Our motivation is the fact that the bone lengths of a human skeleton remain consistent across time. Our full model outperforms the previous best results on Human3.6M and MPI-INF-3DHP datasets.
arXiv Detail & Related papers (2020-02-24T15:49:37Z)
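Two entries in the list above (the anatomy-aware bone-based decomposition and DDHPose) describe splitting a 3D pose into bone lengths and bone directions. The sketch below illustrates that representation on a toy skeleton: it decomposes joint positions into per-bone length and unit direction, then rebuilds the joints from the root. It is a generic illustration, not code from either paper. The parent table and coordinates are made up, and the sketch assumes joint 0 is the root, every parent index is smaller than its child's, and no bone has zero length.

```python
import math

# Hypothetical toy skeleton: PARENTS[j] is the parent of joint j, -1 marks the root.
PARENTS = [-1, 0, 1, 0, 3]


def decompose(joints, parents):
    """Split a pose into per-bone lengths and unit direction vectors."""
    lengths, directions = [], []
    for j, p in enumerate(parents):
        if p < 0:
            continue  # the root has no incoming bone
        bone = [c - q for c, q in zip(joints[j], joints[p])]
        length = math.sqrt(sum(v * v for v in bone))
        lengths.append(length)
        directions.append([v / length for v in bone])  # assumes length > 0
    return lengths, directions


def reconstruct(root, lengths, directions, parents):
    """Rebuild joint positions by walking from the root along each bone."""
    joints = [None] * len(parents)
    joints[0] = list(root)
    b = 0  # index into the bone lists, in the same order decompose() produced them
    for j, p in enumerate(parents):
        if p < 0:
            continue
        joints[j] = [q + lengths[b] * d for q, d in zip(joints[p], directions[b])]
        b += 1
    return joints


# Made-up 3D joint coordinates; the first bone is a 3-4-5 triangle scaled by 0.1.
POSE_3D = [
    [0.0, 0.0, 0.0],
    [0.0, -0.3, 0.4],
    [0.0, -0.3, 0.9],
    [0.2, 0.0, 0.0],
    [0.2, -0.5, 0.0],
]

LENGTHS, DIRECTIONS = decompose(POSE_3D, PARENTS)
REBUILT = reconstruct(POSE_3D[0], LENGTHS, DIRECTIONS, PARENTS)
```

Because the lengths and directions are stored separately, a method can constrain or predict them independently (for example, keeping bone lengths fixed across frames) before recombining them as in `reconstruct`.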

This list is automatically generated from the titles and abstracts of the papers on this site.