Self-supervision on Unlabelled OR Data for Multi-person 2D/3D Human Pose Estimation
- URL: http://arxiv.org/abs/2007.08354v2
- Date: Fri, 20 Aug 2021 10:53:38 GMT
- Title: Self-supervision on Unlabelled OR Data for Multi-person 2D/3D Human Pose Estimation
- Authors: Vinkle Srivastav, Afshin Gangi, Nicolas Padoy
- Abstract summary: 2D/3D human pose estimation is needed to develop novel intelligent tools for the operating room.
We propose to use knowledge distillation in a teacher/student framework to harness the knowledge present in a large-scale non-annotated dataset.
The easily deployable network trained using this effective self-supervision strategy performs on par with the teacher network on MVOR+, an extension of the public MVOR dataset.
- Score: 2.8802646903517957
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: 2D/3D human pose estimation is needed to develop novel intelligent tools for
the operating room that can analyze and support the clinical activities. The
lack of annotated data and the complexity of state-of-the-art pose estimation
approaches limit, however, the deployment of such techniques inside the OR. In
this work, we propose to use knowledge distillation in a teacher/student
framework to harness the knowledge present in a large-scale non-annotated
dataset and in an accurate but complex multi-stage teacher network to train a
lightweight network for joint 2D/3D pose estimation. The teacher network also
exploits the unlabeled data to generate both hard and soft labels useful in
improving the student predictions. The easily deployable network trained using
this effective self-supervision strategy performs on par with the teacher
network on \emph{MVOR+}, an extension of the public MVOR dataset where all
persons have been fully annotated, thus providing a viable solution for
real-time 2D/3D human pose estimation in the OR.
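The abstract describes the training signal only at a high level: on unlabelled OR frames, the frozen teacher provides both soft labels (its full 2D heatmaps) and hard labels (its confident keypoint locations), and the lightweight student is trained to match both while also distilling the teacher's 3D output. Below is a minimal PyTorch-style sketch of such a distillation loss, assuming dictionary outputs with keys heatmaps_2d and depth_3d, a confidence threshold, and equal loss weights; these names and details are illustrative assumptions, not the paper's actual interface.

```python
import torch
import torch.nn.functional as F


def soft_argmax_2d(heatmaps):
    """Differentiable expected (x, y) coordinates from per-joint heatmaps of shape (B, J, H, W)."""
    B, J, H, W = heatmaps.shape
    probs = F.softmax(heatmaps.reshape(B, J, -1), dim=-1).reshape(B, J, H, W)
    xs = torch.arange(W, dtype=probs.dtype, device=probs.device)
    ys = torch.arange(H, dtype=probs.dtype, device=probs.device)
    exp_x = (probs.sum(dim=2) * xs).sum(dim=-1)   # marginal over rows, then E[x]
    exp_y = (probs.sum(dim=3) * ys).sum(dim=-1)   # marginal over columns, then E[y]
    return torch.stack((exp_x, exp_y), dim=-1)    # (B, J, 2)


def distillation_loss(student_out, teacher_out, w_soft=0.5, w_hard=0.5, conf_thresh=0.5):
    """Illustrative teacher/student loss on unlabelled frames (keys and weights are assumptions)."""
    t_hm = teacher_out['heatmaps_2d'].detach()     # teacher is frozen
    s_hm = student_out['heatmaps_2d']

    # Soft labels: regress the student's 2D heatmaps towards the teacher's.
    soft_loss = F.mse_loss(s_hm, t_hm)

    # Hard labels: teacher argmax keypoints, kept only where the teacher is confident.
    B, J, H, W = t_hm.shape
    conf, idx = t_hm.reshape(B, J, -1).max(dim=-1)
    keep = (conf > conf_thresh).float().unsqueeze(-1)                        # (B, J, 1)
    teacher_xy = torch.stack(
        (idx % W, torch.div(idx, W, rounding_mode='floor')), dim=-1).float()  # (B, J, 2)

    student_xy = soft_argmax_2d(s_hm)                                        # differentiable
    hard_loss = (keep * F.smooth_l1_loss(student_xy, teacher_xy,
                                         reduction='none')).mean()

    # 3D branch: also distil the teacher's per-joint depth predictions.
    depth_loss = F.mse_loss(student_out['depth_3d'],
                            teacher_out['depth_3d'].detach())

    return w_soft * soft_loss + w_hard * hard_loss + depth_loss
```

The soft-argmax keeps the hard-label term differentiable with respect to the student's heatmaps; in a training loop, teacher and student would be run on the same unlabelled frame and the loss backpropagated through the student only. The exact form of the hard/soft supervision and the 3D targets in the paper may differ from this sketch.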
Related papers
- Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance [11.090775523892074]
We introduce a novel semi-supervised framework to alleviate the dependency on densely annotated data.
Our approach leverages 2D foundation models to generate essential 3D scene geometric and semantic cues.
Our method achieves up to 85% of the fully-supervised performance using only 10% labeled data.
arXiv Detail & Related papers (2024-08-21T12:13:18Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly demonstrate our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection [55.210991151015534]
We present a novel Dual-Perspective Knowledge Enrichment approach named DPKE for semi-supervised 3D object detection.
Our DPKE enriches the knowledge of limited training data, particularly unlabeled data, from two perspectives: data-perspective and feature-perspective.
arXiv Detail & Related papers (2024-01-10T08:56:07Z)
- S$^2$Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning [70.72037296392642]
We propose a novel semi-supervised framework that allows us to learn contact from monocular images.
Specifically, we leverage visual and geometric consistency constraints in large-scale datasets for generating pseudo-labels.
We show the benefits of using a contact map that governs hand-object interactions to produce more accurate reconstructions.
arXiv Detail & Related papers (2022-08-01T14:05:23Z)
- KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D Correspondences [77.56222946832237]
We present a novel framework to detect the densepose of multiple people in an image.
The proposed method, which we refer to as Knowledge Transfer Network (KTN), tackles two main problems.
It simultaneously maintains feature resolution and suppresses background pixels, and this strategy results in a substantial increase in accuracy.
arXiv Detail & Related papers (2022-06-21T03:11:37Z)
- Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation [28.83582658618296]
We propose a novel method based on a teacher-student learning framework for 3D human pose estimation.
Our method reduces the 3D joint prediction error by 11.4% compared to state-of-the-art unsupervised methods.
arXiv Detail & Related papers (2020-12-17T05:32:44Z)
- Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation [52.94078950641959]
We present a deployment-friendly, fast bottom-up framework for multi-person 3D human pose estimation.
We adopt a novel neural representation of multi-person 3D pose which unifies the position of person instances with their corresponding 3D pose representation.
We propose a practical deployment paradigm where paired 2D or 3D pose annotations are unavailable.
arXiv Detail & Related papers (2020-08-04T07:54:25Z)
- 3D Human Pose Estimation using Spatio-Temporal Networks with Explicit Occlusion Training [40.933783830017035]
Estimating 3D poses from a monocular video is still a challenging task, despite the significant progress that has been made in recent years.
We introduce a spatio-temporal video network for robust 3D human pose estimation.
We apply multi-scale spatial features for 2D joint or keypoint prediction in each individual frame, and multi-stride temporal convolutional networks (TCNs) to estimate 3D joints or keypoints.
arXiv Detail & Related papers (2020-04-07T09:12:12Z)
- Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild [101.70320427145388]
We propose a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data.
We evaluate our proposed approach on two large-scale datasets.
arXiv Detail & Related papers (2020-03-17T08:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.