Efficient, Self-Supervised Human Pose Estimation with Inductive Prior
Tuning
- URL: http://arxiv.org/abs/2311.02815v1
- Date: Mon, 6 Nov 2023 01:19:57 GMT
- Title: Efficient, Self-Supervised Human Pose Estimation with Inductive Prior
Tuning
- Authors: Nobline Yoo, Olga Russakovsky
- Abstract summary: We analyze the relationship between reconstruction quality and pose estimation accuracy.
We develop a model pipeline that outperforms the baseline, using less than one-third the amount of training data.
We show that a combination of well-engineered reconstruction losses and inductive priors can help coordinate pose learning alongside reconstruction.
- Score: 30.256493625913127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of 2D human pose estimation (HPE) is to localize anatomical
landmarks, given an image of a person in a pose. SOTA techniques make use of
thousands of labeled figures (finetuning transformers or training deep CNNs),
acquired using labor-intensive crowdsourcing. On the other hand,
self-supervised methods re-frame the HPE task as a reconstruction problem,
enabling them to leverage the vast amount of unlabeled visual data, though at
the present cost of accuracy. In this work, we explore ways to improve
self-supervised HPE. We (1) analyze the relationship between reconstruction
quality and pose estimation accuracy, (2) develop a model pipeline that
outperforms the baseline which inspired our work, using less than one-third the
amount of training data, and (3) offer a new metric suitable for
self-supervised settings that measures the consistency of predicted body part
length proportions. We show that a combination of well-engineered
reconstruction losses and inductive priors can help coordinate pose learning
alongside reconstruction in a self-supervised paradigm.
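The proposed metric measures the consistency of predicted body-part length proportions. The paper's exact definition is not given here, but a minimal sketch of one plausible formulation follows: compute limb lengths from predicted keypoints, normalize by a reference limb, and measure how much those ratios vary across images (lower variation means more anatomically consistent predictions). The skeleton topology (`LIMBS`), the reference limb, and the function names are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical skeleton: pairs of keypoint indices forming limbs
# (illustrative only; the paper's skeleton definition may differ).
LIMBS = [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5)]
REFERENCE_LIMB = 0  # normalize all lengths by the first limb

def limb_lengths(keypoints):
    """keypoints: (num_joints, 2) array of predicted 2D coordinates."""
    return np.array([np.linalg.norm(keypoints[a] - keypoints[b])
                     for a, b in LIMBS])

def proportion_consistency(predictions):
    """predictions: (num_images, num_joints, 2) predicted keypoints.

    Returns the mean standard deviation of limb-length ratios across
    images; lower values indicate more consistent body proportions.
    """
    ratios = []
    for kp in predictions:
        lengths = limb_lengths(kp)
        ratios.append(lengths / (lengths[REFERENCE_LIMB] + 1e-8))
    return float(np.mean(np.std(np.array(ratios), axis=0)))
```

Note that this formulation is scale-invariant: uniformly rescaling a pose leaves the ratios, and hence the score, unchanged, which is what makes such a metric usable without ground-truth labels.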
Related papers
- Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing [42.371736670824575]
We propose a novel framework that combines visual pose estimators with bioimpedance sensing to capture the 3D pose of people by taking self-contact into account.
We validate our approach using a new dataset of synchronized RGB video, bioimpedance measurements, and 3D motion capture.
We also present a miniature wearable bioimpedance sensor that enables efficient large-scale collection of contact-aware training data.
arXiv Detail & Related papers (2025-12-04T14:45:38Z)
- 3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information [2.457872341625575]
A novel Semantic Graph Attention Network can benefit from the ability of self-attention to capture global context.
A Body Part Decoder assists in extracting and refining the information related to specific segments of the body.
A Geometry Loss makes a critical constraint on the structural skeleton of the body, ensuring that the model's predictions adhere to the natural limits of human posture.
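A geometry loss of this kind typically penalizes predictions that violate skeletal structure. The paper's exact loss is not reproduced here; as one common hypothetical instance, the sketch below penalizes left/right bone-length asymmetry, a standard structural prior on the human skeleton. The limb pairings (`SYMMETRIC_LIMBS`) and function name are assumptions for illustration.

```python
import numpy as np

# Hypothetical left/right limb pairs (keypoint index pairs) whose
# lengths should match under bilateral symmetry; indices illustrative.
SYMMETRIC_LIMBS = [((0, 1), (2, 3)), ((1, 4), (3, 5))]

def symmetry_geometry_loss(joints):
    """joints: (num_joints, 3) predicted 3D joint positions.

    Sums the absolute difference between each left limb's length and
    its right counterpart's; zero for a perfectly symmetric skeleton.
    """
    total = 0.0
    for (la, lb), (ra, rb) in SYMMETRIC_LIMBS:
        left = np.linalg.norm(joints[la] - joints[lb])
        right = np.linalg.norm(joints[ra] - joints[rb])
        total += abs(left - right)
    return total
```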
arXiv Detail & Related papers (2024-06-03T10:59:00Z)
- Non-Local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation [63.199549837604444]
3D human pose estimation approaches leverage different forms of strong (2D/3D pose) or weak (multi-view or depth) paired supervision.
We cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target.
We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-05T03:52:57Z)
- PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision [102.48681650013698]
Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervision to guide the learning.
We propose a novel self-supervised approach that allows us to explicitly generate 2D-3D pose pairs for augmenting supervision.
This is made possible via introducing a reinforcement-learning-based imitator, which is learned jointly with a pose estimator alongside a pose hallucinator.
arXiv Detail & Related papers (2022-03-29T14:45:53Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Higher-Order Implicit Fairing Networks for 3D Human Pose Estimation [1.1501261942096426]
We introduce a higher-order graph convolutional framework with initial residual connections for 2D-to-3D pose estimation.
Our model is able to capture the long-range dependencies between body joints.
Experiments and ablation studies conducted on two standard benchmarks demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-11-01T13:48:55Z)
- THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers [67.8628917474705]
THUNDR is a transformer-based deep neural network methodology to reconstruct the 3d pose and shape of people.
We show state-of-the-art results on Human3.6M and 3DPW, for both the fully-supervised and the self-supervised models.
We observe very solid 3d reconstruction performance for difficult human poses collected in the wild.
arXiv Detail & Related papers (2021-06-17T09:09:24Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions.
Our proposed model employs three consecutive differentiable transformations: forward-kinematics, camera-projection, and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.