PoseDet: Fast Multi-Person Pose Estimation Using Pose Embedding
- URL: http://arxiv.org/abs/2107.10466v1
- Date: Thu, 22 Jul 2021 05:54:00 GMT
- Title: PoseDet: Fast Multi-Person Pose Estimation Using Pose Embedding
- Authors: Chenyu Tian, Ran Yu, Xinyuan Zhao, Weihao Xia, Yujiu Yang, Haoqian Wang
- Abstract summary: This paper presents a novel framework PoseDet (Estimating Pose by Detection) to localize and associate body joints simultaneously.
We also propose the keypoint-aware pose embedding to represent an object in terms of the locations of its keypoints.
This simple framework achieves an unprecedented speed and a competitive accuracy on the COCO benchmark compared with state-of-the-art methods.
- Score: 16.57620683425904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current methods of multi-person pose estimation typically treat the
localization and the association of body joints separately. It is convenient
but inefficient, leading to additional computation and a waste of time. This
paper, however, presents a novel framework PoseDet (Estimating Pose by
Detection) to localize and associate body joints simultaneously at higher
inference speed. Moreover, we propose the keypoint-aware pose embedding to
represent an object in terms of the locations of its keypoints. The proposed
pose embedding contains semantic and geometric information, allowing us to
access discriminative and informative features efficiently. It is utilized for
candidate classification and body joint localization in PoseDet, leading to
robust predictions of various poses. This simple framework achieves an
unprecedented speed and a competitive accuracy on the COCO benchmark compared
with state-of-the-art methods. Extensive experiments on the CrowdPose benchmark
show the robustness in the crowd scenes. Source code is available.
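The abstract says the pose embedding represents an object "in terms of the locations of its keypoints". One way to picture that idea is to gather backbone features at each keypoint location and concatenate them into a single vector. The sketch below does this with bilinear sampling in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the names `bilinear_sample` and `pose_embedding`, the `[channel][row][col]` feature layout, and the use of simple concatenation are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

# Assumed layout for this sketch: fmap[channel][row][col]
FeatureMap = List[List[List[float]]]


def bilinear_sample(fmap: FeatureMap, x: float, y: float) -> List[float]:
    """Sample every channel of fmap at the sub-pixel location (x, y)."""
    height, width = len(fmap[0]), len(fmap[0][0])
    # Clamp so that points slightly outside the map still return a value.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = x - x0, y - y0
    return [
        (1 - wy) * ((1 - wx) * ch[y0][x0] + wx * ch[y0][x1])
        + wy * ((1 - wx) * ch[y1][x0] + wx * ch[y1][x1])
        for ch in fmap
    ]


def pose_embedding(
    fmap: FeatureMap, keypoints: Sequence[Tuple[float, float]]
) -> List[float]:
    """Concatenate the features sampled at each (x, y) keypoint.

    Hypothetical illustration: the result has K * C entries for
    K keypoints and C feature channels.
    """
    embedding: List[float] = []
    for x, y in keypoints:
        embedding.extend(bilinear_sample(fmap, x, y))
    return embedding


if __name__ == "__main__":
    # Two channels on a 2x2 grid, two keypoints.
    fmap = [
        [[0.0, 1.0], [2.0, 3.0]],
        [[10.0, 10.0], [10.0, 10.0]],
    ]
    print(pose_embedding(fmap, [(0.5, 0.5), (1.0, 0.0)]))
```

Because the vector is built only from features at the keypoints, it carries both appearance (what is at each joint) and geometry (where the joints are), which is the property the abstract attributes to the pose embedding.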
Related papers
- SRPose: Two-view Relative Pose Estimation with Sparse Keypoints [51.49105161103385]
SRPose is a sparse keypoint-based framework for two-view relative pose estimation in camera-to-world and object-to-camera scenarios.
It achieves competitive or superior performance compared to state-of-the-art methods in terms of accuracy and speed.
It is robust to different image sizes and camera intrinsics, and can be deployed with low computing resources.
arXiv Detail & Related papers (2024-07-11T05:46:35Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a pointcloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video [16.32910684198013]
We present a novel multi-frame human pose estimation framework, which employs temporal differences across frames to model dynamic contexts.
Specifically, we design multi-stage entangled learning sequences, conditioned on multi-stage differences, to derive informative motion representation sequences.
Our method ranks No.1 in the Crowd Pose Estimation in Complex Events Challenge on the HiEve benchmark.
arXiv Detail & Related papers (2023-03-15T09:29:03Z)
- AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time [47.19339667836196]
We present AlphaPose, a system that can perform accurate whole-body pose estimation and tracking jointly while running in realtime.
We show a significant improvement over current state-of-the-art methods in both speed and accuracy on COCO-wholebody, COCO, PoseTrack, and our proposed Halpe-FullBody pose estimation dataset.
arXiv Detail & Related papers (2022-11-07T09:15:38Z)
- Temporal Feature Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation [38.571715193347366]
We present a novel hierarchical alignment framework for multi-frame human pose estimation.
We rank No.1 in the Multi-frame Person Pose Estimation Challenge on benchmark dataset PoseTrack 2017, and obtain state-of-the-art performance on benchmarks Sub-JHMDB and Pose-Track 2018.
arXiv Detail & Related papers (2022-03-29T04:29:16Z)
- Improving Robustness and Accuracy via Relative Information Encoding in 3D Human Pose Estimation [59.94032196768748]
We propose a relative information encoding method that yields positional and temporal enhanced representations.
Our method outperforms state-of-the-art methods on two public datasets.
arXiv Detail & Related papers (2021-07-29T14:12:19Z)
- Deep Dual Consecutive Network for Human Pose Estimation [44.41818683253614]
We propose a novel multi-frame human pose estimation framework, leveraging abundant temporal cues between video frames to facilitate keypoint detection.
Our method ranks No.1 in the Multi-frame Person Pose Estimation Challenge on the large-scale benchmark datasets PoseTrack 2017 and PoseTrack 2018.
arXiv Detail & Related papers (2021-03-12T13:11:27Z)
- OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association [90.39247595214998]
Image-based perception tasks can be formulated as detecting, associating and tracking semantic keypoints, e.g. human body pose estimation and tracking.
We present a general framework that jointly detects and forms spatio-temporal keypoint associations in a single stage.
We also show that our method generalizes to any class of keypoints such as car and animal parts to provide a holistic perception framework.
arXiv Detail & Related papers (2021-03-03T14:44:14Z)
- Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis [88.80710311624101]
We propose a semi-automated approach to generate reference poses based on feature matching between renderings of a 3D model and real images via learned features.
We significantly improve the nighttime reference poses of the popular Aachen Day-Night dataset, showing that state-of-the-art visual localization methods perform better (up to 47%) than predicted by the original reference poses.
arXiv Detail & Related papers (2020-05-11T15:13:07Z)
- UniPose: Unified Human Pose Estimation in Single Images and Videos [3.04585143845864]
We propose a unified framework for human pose estimation, based on our "Waterfall" Atrous Spatial Pooling architecture.
UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage.
Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation.
arXiv Detail & Related papers (2020-01-22T15:59:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.