Pose Prior Learner: Unsupervised Categorical Prior Learning for Pose Estimation
- URL: http://arxiv.org/abs/2410.03858v2
- Date: Thu, 20 Feb 2025 04:13:04 GMT
- Title: Pose Prior Learner: Unsupervised Categorical Prior Learning for Pose Estimation
- Authors: Ziyu Wang, Shuangpeng Han, Mengmi Zhang
- Abstract summary: A prior represents a set of beliefs or assumptions about a system.
In this paper, we introduce the challenge of unsupervised categorical prior learning in pose estimation.
We propose a novel method, named Pose Prior Learner (PPL), to learn a general pose prior for any object category.
- Score: 6.359236783105098
- Abstract: A prior represents a set of beliefs or assumptions about a system, aiding inference and decision-making. In this paper, we introduce the challenge of unsupervised categorical prior learning in pose estimation, where AI models learn a general pose prior for an object category from images in a self-supervised manner. Although priors are effective in estimating pose, acquiring them can be difficult. We propose a novel method, named Pose Prior Learner (PPL), to learn a general pose prior for any object category. PPL uses a hierarchical memory to store compositional parts of prototypical poses, from which we distill a general pose prior. This prior improves pose estimation accuracy through template transformation and image reconstruction. PPL learns meaningful pose priors without any additional human annotations or interventions, outperforming competitive baselines on both human and animal pose estimation datasets. Notably, our experimental results reveal the effectiveness of PPL using learned prototypical poses for pose estimation on occluded images. Through iterative inference, PPL leverages the pose prior to refine estimated poses, regressing them to any prototypical poses stored in memory. Our code, model, and data will be publicly available.
Related papers
- GRPose: Learning Graph Relations for Human Image Generation with Pose Priors [21.91374799527015]
We propose a framework that delves into the graph relations of pose priors to provide control information for human image generation.
The main idea is to establish a graph topological structure between the pose priors and latent representation of diffusion models.
A pose perception loss is introduced based on a pretrained pose estimation network to minimize the pose differences.
arXiv Detail & Related papers (2024-08-29T13:58:34Z)
- Learning a Category-level Object Pose Estimator without Pose Annotations [37.03715008347576]
We propose to learn a category-level 3D object pose estimator without pose annotations.
Instead of using manually annotated images, we leverage diffusion models to generate a set of images under controlled pose differences.
We show that our method is capable of category-level object pose estimation in a single-shot setting.
arXiv Detail & Related papers (2024-04-08T15:59:29Z) - Understanding Pose and Appearance Disentanglement in 3D Human Pose
Estimation [72.50214227616728]
Several methods have proposed to learn image representations in a self-supervised fashion so as to disentangle the appearance information from the pose one.
We study disentanglement from the perspective of the self-supervised network, via diverse image synthesis experiments.
We design an adversarial strategy focusing on generating natural appearance changes of the subject, and against which we could expect a disentangled network to be robust.
arXiv Detail & Related papers (2023-09-20T22:22:21Z) - Generalizable Pose Estimation Using Implicit Scene Representations [4.124185654280966]
6-DoF pose estimation is an essential component of robotic manipulation pipelines.
We address the generalization capability of pose estimation using models that contain enough information about an object to render it in different poses.
Our final evaluation shows a significant improvement in inference performance and speed compared to existing approaches.
arXiv Detail & Related papers (2023-05-26T20:42:52Z) - PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate model free one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a pointcloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z) - Few-View Object Reconstruction with Unknown Categories and Camera Poses [80.0820650171476]
This work explores reconstructing general real-world objects from a few images without known camera poses or object categories.
The crux of our work is solving two fundamental 3D vision problems -- shape reconstruction and pose estimation.
Our method FORGE predicts 3D features from each view and leverages them in conjunction with the input images to establish cross-view correspondence.
arXiv Detail & Related papers (2022-12-08T18:59:02Z) - Learning Dynamics via Graph Neural Networks for Human Pose Estimation
and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in current fame.
Specifically, we derive this prediction of dynamics through a graph neural network(GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z) - Towards Accurate Human Pose Estimation in Videos of Crowded Scenes [134.60638597115872]
We focus on improving human pose estimation in videos of crowded scenes from the perspectives of exploiting temporal context and collecting new data.
For each frame, we propagate the historical poses forward from the previous frames and the future poses backward from the subsequent frames to the current frame, leading to stable and accurate human pose estimation in videos.
In this way, our model achieves the best performance on 7 out of 13 videos and 56.33 average w_AP on the test dataset of the HIE challenge.
arXiv Detail & Related papers (2020-10-16T13:19:11Z) - Camera Pose Matters: Improving Depth Prediction by Mitigating Pose
Distribution Bias [12.354076490479516]
We propose two novel techniques that exploit the camera pose during training and prediction.
First, we introduce a simple perspective-aware data augmentation that synthesizes new training examples with more diverse views.
Second, we propose a conditional model that exploits the per-image camera pose as prior knowledge by encoding it as a part of the input.
arXiv Detail & Related papers (2020-07-08T04:14:17Z) - Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image
Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z) - MirrorNet: A Deep Bayesian Approach to Reflective 2D Pose Estimation
from Human Images [42.27703025887059]
The main problems with the standard supervised approach are that it often yields anatomically implausible poses.
We propose a semi-supervised method that can make effective use of images with and without pose annotations.
The results of experiments show that the proposed reflective architecture makes estimated poses anatomically plausible.
arXiv Detail & Related papers (2020-04-08T05:02:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.