UniPose: Unified Human Pose Estimation in Single Images and Videos
- URL: http://arxiv.org/abs/2001.08095v1
- Date: Wed, 22 Jan 2020 15:59:42 GMT
- Title: UniPose: Unified Human Pose Estimation in Single Images and Videos
- Authors: Bruno Artacho and Andreas Savakis
- Abstract summary: We propose a unified framework for human pose estimation, based on our "Waterfall" Atrous Spatial Pooling architecture.
UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage.
Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation.
- Score: 3.04585143845864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose UniPose, a unified framework for human pose estimation, based on
our "Waterfall" Atrous Spatial Pooling architecture, that achieves
state-of-art-results on several pose estimation metrics. Current pose
estimation methods utilizing standard CNN architectures heavily rely on
statistical postprocessing or predefined anchor poses for joint localization.
UniPose incorporates contextual segmentation and joint localization to estimate
the human pose in a single stage, with high accuracy, without relying on
statistical postprocessing methods. The Waterfall module in UniPose leverages
the efficiency of progressive filtering in the cascade architecture, while
maintaining multi-scale fields-of-view comparable to spatial pyramid
configurations. Additionally, our method is extended to UniPose-LSTM for
multi-frame processing and achieves state-of-the-art results for temporal pose
estimation in video. Our results on multiple datasets demonstrate that UniPose,
with a ResNet backbone and Waterfall module, is a robust and efficient
architecture for pose estimation, obtaining state-of-the-art results in
single-person pose detection for both single images and videos.
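To illustrate the cascaded atrous filtering described in the abstract, below is a minimal PyTorch sketch of a Waterfall-style atrous pooling block: each branch filters the previous branch's output (progressive filtering, as in a cascade) while increasing the dilation rate, and all branch outputs plus a global-pooling branch are fused, giving multi-scale fields-of-view comparable to a spatial pyramid. The class name, channel counts, and dilation rates are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WaterfallAtrousPooling(nn.Module):
    """Sketch of a Waterfall-style atrous pooling block (illustrative only).

    Branches are chained (cascade) but every intermediate output is kept and
    concatenated (waterfall), so the fused feature map covers multiple
    fields-of-view without running independent pyramid branches.
    """

    def __init__(self, in_channels=2048, branch_channels=256, rates=(6, 12, 18, 24)):
        super().__init__()
        self.branches = nn.ModuleList()
        channels = in_channels
        for rate in rates:
            self.branches.append(nn.Sequential(
                nn.Conv2d(channels, branch_channels, kernel_size=3,
                          padding=rate, dilation=rate, bias=False),
                nn.BatchNorm2d(branch_channels),
                nn.ReLU(inplace=True),
            ))
            channels = branch_channels  # next branch filters the previous output
        # Global average pooling branch, as in spatial-pyramid designs
        self.global_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, branch_channels, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Conv2d(branch_channels * (len(rates) + 1),
                                 branch_channels, kernel_size=1)

    def forward(self, x):
        outputs = []
        feat = x
        for branch in self.branches:
            feat = branch(feat)   # cascade: feed the previous branch forward
            outputs.append(feat)  # waterfall: keep every intermediate output
        pooled = self.global_pool(x)
        pooled = F.interpolate(pooled, size=x.shape[2:],
                               mode="bilinear", align_corners=False)
        outputs.append(pooled)
        return self.project(torch.cat(outputs, dim=1))


# Example: backbone features (e.g. from a ResNet) -> fused multi-scale features
if __name__ == "__main__":
    wasp = WaterfallAtrousPooling(in_channels=2048)
    feats = torch.randn(1, 2048, 23, 23)  # dummy backbone feature map
    print(wasp(feats).shape)              # torch.Size([1, 256, 23, 23])
```

In the paper's single-stage setting, features like these would feed a decoder that predicts joint heatmaps directly, avoiding the statistical postprocessing step; the decoder itself is omitted here.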
Related papers
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z)
- YOLOPose V2: Understanding and Improving Transformer-based 6D Pose Estimation [36.067414358144816]
YOLOPose is a Transformer-based multi-object 6D pose estimation method.
We employ a learnable orientation estimation module to predict the orientation from the keypoints.
Our method is suitable for real-time applications and achieves results comparable to state-of-the-art methods.
arXiv Detail & Related papers (2023-07-21T12:53:54Z) - PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate model free one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a pointcloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild [73.1276968007689]
We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object.
We show that our approach outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories.
arXiv Detail & Related papers (2022-08-11T17:59:59Z)
- BAPose: Bottom-Up Pose Estimation with Disentangled Waterfall Representations [3.8073142980733]
BAPose is a novel framework that achieves state-of-the-art results for multi-person pose estimation.
Our results on the challenging COCO and CrowdPose datasets demonstrate that BAPose is an efficient and robust framework.
arXiv Detail & Related papers (2021-12-20T18:07:09Z)
- Direct Multi-view Multi-person 3D Pose Estimation [138.48139701871213]
We present Multi-view Pose transformer (MvP) for estimating multi-person 3D poses from multi-view images.
MvP directly regresses the multi-person 3D poses in a clean and efficient way, without relying on intermediate tasks.
We show experimentally that our MvP model outperforms the state-of-the-art methods on several benchmarks while being much more efficient.
arXiv Detail & Related papers (2021-11-07T13:09:20Z)
- PoseDet: Fast Multi-Person Pose Estimation Using Pose Embedding [16.57620683425904]
This paper presents a novel framework PoseDet (Estimating Pose by Detection) to localize and associate body joints simultaneously.
We also propose the keypoint-aware pose embedding to represent an object in terms of the locations of its keypoints.
This simple framework achieves an unprecedented speed and a competitive accuracy on the COCO benchmark compared with state-of-the-art methods.
arXiv Detail & Related papers (2021-07-22T05:54:00Z)
- OmniPose: A Multi-Scale Framework for Multi-Person Pose Estimation [3.8073142980733]
We propose a single-pass, end-to-end trainable framework that achieves state-of-the-art results for multi-person pose estimation.
Our results on multiple datasets demonstrate that OmniPose is a robust and efficient architecture for multi-person pose estimation.
arXiv Detail & Related papers (2021-03-18T11:30:31Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
- Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints [80.60538408386016]
Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry.
We propose an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection.
arXiv Detail & Related papers (2020-07-29T21:41:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed abstracts) and is not responsible for any consequences of its use.