UniPose: Unified Human Pose Estimation in Single Images and Videos
- URL: http://arxiv.org/abs/2001.08095v1
- Date: Wed, 22 Jan 2020 15:59:42 GMT
- Title: UniPose: Unified Human Pose Estimation in Single Images and Videos
- Authors: Bruno Artacho and Andreas Savakis
- Abstract summary: We propose a unified framework for human pose estimation, based on our "Waterfall" Atrous Spatial Pooling architecture.
UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage.
Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation.
- Score: 3.04585143845864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose UniPose, a unified framework for human pose estimation, based on
our "Waterfall" Atrous Spatial Pooling architecture, that achieves
state-of-the-art results on several pose estimation metrics. Current pose
estimation methods utilizing standard CNN architectures heavily rely on
statistical postprocessing or predefined anchor poses for joint localization.
UniPose incorporates contextual segmentation and joint localization to estimate
the human pose in a single stage, with high accuracy, without relying on
statistical postprocessing methods. The Waterfall module in UniPose leverages
the efficiency of progressive filtering in the cascade architecture, while
maintaining multi-scale fields-of-view comparable to spatial pyramid
configurations. Additionally, our method is extended to UniPose-LSTM for
multi-frame processing and achieves state-of-the-art results for temporal pose
estimation in video. Our results on multiple datasets demonstrate that UniPose,
with a ResNet backbone and Waterfall module, is a robust and efficient
architecture for pose estimation, obtaining state-of-the-art results in
single-person pose detection for both single images and videos.
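The abstract's contrast between a cascade ("waterfall") and a spatial pyramid can be illustrated with simple receptive-field arithmetic. The sketch below is not the authors' implementation: the function names are hypothetical, and the 3x3 kernel size and the dilation rates (6, 12, 18, 24) are assumptions chosen for illustration, since the abstract does not state them. It only shows how chaining atrous convolutions with stride 1 widens the field-of-view of each successive branch, while parallel branches each see the input directly.

```python
def atrous_receptive_field(kernel_size, rate):
    """Receptive field (pixels along one axis) of one atrous convolution."""
    return (kernel_size - 1) * rate + 1


def parallel_fields(kernel_size, rates):
    """Spatial-pyramid style: every branch filters the input directly."""
    return [atrous_receptive_field(kernel_size, r) for r in rates]


def waterfall_fields(kernel_size, rates):
    """Cascade style: each branch filters the previous branch's output.

    Every intermediate output is kept, so there is one field-of-view per
    branch. With stride 1, each stacked layer adds (kernel_size - 1) * rate.
    """
    fields = []
    rf = 1
    for r in rates:
        rf += (kernel_size - 1) * r
        fields.append(rf)
    return fields


# Assumed values for illustration only; not taken from the abstract.
KERNEL_SIZE = 3
RATES = (6, 12, 18, 24)

print(parallel_fields(KERNEL_SIZE, RATES))   # [13, 25, 37, 49]
print(waterfall_fields(KERNEL_SIZE, RATES))  # [13, 37, 73, 121]
```

Under these assumptions both layouts yield four distinct fields-of-view, which is the multi-scale property the abstract attributes to the Waterfall module; the cascade reaches larger fields because each branch reuses the filtering done by the one before it.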
Related papers
- UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image [86.7128543480229]
We present a novel approach and benchmark, termed UNOPose, for unseen one-reference-based object pose estimation.
Building upon a coarse-to-fine paradigm, UNOPose constructs an SE(3)-invariant reference frame to standardize object representation.
We recalibrate the weight of each correspondence based on its predicted likelihood of being within the overlapping region.
(2024-11-25T05:36:00Z)
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
(2023-12-13T18:28:09Z)
- YOLOPose V2: Understanding and Improving Transformer-based 6D Pose Estimation [36.067414358144816]
YOLOPose is a Transformer-based multi-object 6D pose estimation method.
We employ a learnable orientation estimation module to predict the orientation from the keypoints.
Our method is suitable for real-time applications and achieves results comparable to state-of-the-art methods.
(2023-07-21T12:53:54Z)
- RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild [73.1276968007689]
We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object.
We show that our approach outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories.
(2022-08-11T17:59:59Z)
- BAPose: Bottom-Up Pose Estimation with Disentangled Waterfall Representations [3.8073142980733]
BAPose is a novel framework that achieves state-of-the-art results for multi-person pose estimation.
Our results on the challenging COCO and CrowdPose datasets demonstrate that BAPose is an efficient and robust framework.
(2021-12-20T18:07:09Z)
- Direct Multi-view Multi-person 3D Pose Estimation [138.48139701871213]
We present Multi-view Pose transformer (MvP) for estimating multi-person 3D poses from multi-view images.
MvP directly regresses the multi-person 3D poses in a clean and efficient way, without relying on intermediate tasks.
We show experimentally that our MvP model outperforms the state-of-the-art methods on several benchmarks while being much more efficient.
(2021-11-07T13:09:20Z)
- PoseDet: Fast Multi-Person Pose Estimation Using Pose Embedding [16.57620683425904]
This paper presents a novel framework PoseDet (Estimating Pose by Detection) to localize and associate body joints simultaneously.
We also propose the keypoint-aware pose embedding to represent an object in terms of the locations of its keypoints.
This simple framework achieves an unprecedented speed and a competitive accuracy on the COCO benchmark compared with state-of-the-art methods.
(2021-07-22T05:54:00Z)
- OmniPose: A Multi-Scale Framework for Multi-Person Pose Estimation [3.8073142980733]
We propose a single-pass, end-to-end trainable framework that achieves state-of-the-art results for multi-person pose estimation.
Our results on multiple datasets demonstrate that OmniPose is a robust and efficient architecture for multi-person pose estimation.
(2021-03-18T11:30:31Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
(2021-02-04T14:26:42Z)
- Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints [80.60538408386016]
Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry.
We propose an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection.
(2020-07-29T21:41:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.