Human Pose as Compositional Tokens
- URL: http://arxiv.org/abs/2303.11638v1
- Date: Tue, 21 Mar 2023 07:14:18 GMT
- Title: Human Pose as Compositional Tokens
- Authors: Zigang Geng, Chunyu Wang, Yixuan Wei, Ze Liu, Houqiang Li, and Han Hu
- Abstract summary: We present a structured representation, named Pose as Compositional Tokens (PCT), to explore the joint dependency.
It represents a pose by M discrete tokens with each characterizing a sub-structure with several interdependent joints.
A pre-learned decoder network is used to recover the pose from the tokens without further post-processing.
- Score: 88.28348144244131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose is typically represented by a coordinate vector of body joints or
their heatmap embeddings. While easy for data processing, unrealistic pose
estimates are admitted due to the lack of dependency modeling between the body
joints. In this paper, we present a structured representation, named Pose as
Compositional Tokens (PCT), to explore the joint dependency. It represents a
pose by M discrete tokens with each characterizing a sub-structure with several
interdependent joints. The compositional design enables it to achieve a small
reconstruction error at a low cost. Then we cast pose estimation as a
classification task. In particular, we learn a classifier to predict the
categories of the M tokens from an image. A pre-learned decoder network is used
to recover the pose from the tokens without further post-processing. We show
that it achieves pose estimation results better than or comparable to existing
methods in general scenarios, yet continues to work well when occlusion occurs,
which is ubiquitous in practice. The code and models are publicly available at
https://github.com/Gengzigang/PCT.
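
The inference path described in the abstract (classify M tokens, then decode them into a pose) can be illustrated with a minimal sketch. This is not the authors' implementation: the sizes `M`, `V`, `D`, `K`, the random codebook, and the single linear layer standing in for the pre-learned decoder are all assumptions chosen for illustration, and the logits are random numbers rather than the output of an image classifier.

```python
import random


def argmax(values):
    """Return the index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def decode_pose(logits, codebook, weight, bias):
    """Turn per-token class scores into joint coordinates.

    logits:   M rows of V scores, one row per token (hypothetical classifier output)
    codebook: V rows of D features, one row per token category
    weight:   2*K rows of M*D weights for a stand-in linear decoder
    bias:     2*K biases for the stand-in linear decoder
    """
    # 1) Classification: pick the most likely category for each token.
    token_ids = [argmax(row) for row in logits]
    # 2) Look up each token's feature vector and concatenate them.
    flat = [x for t in token_ids for x in codebook[t]]
    # 3) Stand-in decoder: one linear map producing (x, y) for each joint.
    coords = [sum(f * w for f, w in zip(flat, row)) + b
              for row, b in zip(weight, bias)]
    joints = [(coords[i], coords[i + 1]) for i in range(0, len(coords), 2)]
    return token_ids, joints


# Illustrative sizes only (assumptions, not values from the paper).
rng = random.Random(0)
M, V, D, K = 4, 16, 8, 17  # tokens, codebook size, feature dim, joints
logits = [[rng.gauss(0, 1) for _ in range(V)] for _ in range(M)]
codebook = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(V)]
weight = [[rng.gauss(0, 1) for _ in range(M * D)] for _ in range(2 * K)]
bias = [0.0] * (2 * K)

token_ids, joints = decode_pose(logits, codebook, weight, bias)
print(len(token_ids), len(joints))  # 4 17
```

Because the pose is read directly from the decoder output, the sketch needs no heatmap post-processing step, which mirrors the claim in the abstract.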
Related papers
- DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z)
- Learning from Abstract Images: on the Importance of Occlusion in a Minimalist Encoding of Human Poses [0.0]
2D-to-3D representation suffers from poor performance in cross-dataset benchmarks.
We propose a novel representation using 3D information while encoding it.
The result allows us to predict poses that are completely independent of camera viewpoint.
arXiv Detail & Related papers (2023-07-19T10:45:49Z)
- Generalizable Pose Estimation Using Implicit Scene Representations [4.124185654280966]
6-DoF pose estimation is an essential component of robotic manipulation pipelines.
We address the generalization capability of pose estimation using models that contain enough information about the object to render it in different poses.
Our final evaluation shows a significant improvement in inference performance and speed compared to existing approaches.
arXiv Detail & Related papers (2023-05-26T20:42:52Z)
- Category-Level Pose Retrieval with Contrastive Features Learnt with Occlusion Augmentation [31.73423009695285]
We propose an approach to category-level pose estimation using a contrastive loss with a dynamic margin and a continuous pose-label space.
Our approach achieves state-of-the-art performance on PASCAL3D and OccludedPASCAL3D, as well as high-quality results on KITTI3D.
arXiv Detail & Related papers (2022-08-12T10:04:08Z)
- Pose for Everything: Towards Category-Agnostic Pose Estimation [93.07415325374761]
Category-Agnostic Pose Estimation (CAPE) aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.
A transformer-based Keypoint Interaction Module (KIM) is proposed to capture both the interactions among different keypoints and the relationship between the support and query images.
We also introduce the Multi-category Pose (MP-100) dataset, a 2D pose dataset of 100 object categories containing over 20K instances, designed for developing CAPE algorithms.
arXiv Detail & Related papers (2022-07-21T09:40:54Z)
- Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing [71.19528222206088]
We propose a novel Decoupled Multi-task Learning with Cyclical Self-Regulation for face parsing.
Specifically, DML-CSR designs a multi-task model which comprises face parsing, binary edge, and category edge detection.
Our method achieves the new state-of-the-art performance on the Helen, CelebA-HQ, and LapaMask datasets.
arXiv Detail & Related papers (2022-03-28T02:12:30Z)
- Hierarchical Neural Implicit Pose Network for Animation and Motion Retargeting [66.69067601079706]
HIPNet is a neural implicit pose network trained on multiple subjects across many poses.
We employ a hierarchical skeleton-based representation to learn a signed distance function on a canonical unposed space.
We achieve state-of-the-art results on various single-subject and multi-subject benchmarks.
arXiv Detail & Related papers (2021-12-02T03:25:46Z)
- Nonparametric Structure Regularization Machine for 2D Hand Pose Estimation [21.250031729596085]
Hand pose estimation is more challenging than body pose estimation due to severe articulation, self-occlusion and high dexterity of the hand.
We propose a novel Nonparametric Structure Regularization Machine (NSRM) for 2D hand pose estimation, adopting a cascade multi-task architecture to learn hand structure and keypoint representations jointly.
arXiv Detail & Related papers (2020-01-24T03:27:32Z)