KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D
Correspondences
- URL: http://arxiv.org/abs/2206.10090v1
- Date: Tue, 21 Jun 2022 03:11:37 GMT
- Title: KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D
Correspondences
- Authors: Xuanhan Wang, Lianli Gao, Yixuan Zhou, Jingkuan Song, Meng Wang
- Abstract summary: We present a novel framework to detect the densepose of multiple people in an image.
The proposed method, which we refer to as the Knowledge Transfer Network (KTN), tackles two main problems.
It simultaneously maintains feature resolution and suppresses background pixels, a strategy that yields a substantial increase in accuracy.
- Score: 77.56222946832237
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Human densepose estimation, which aims to establish dense
correspondences between 2D pixels of the human body and a 3D human body
template, is a key technique in enabling machines to understand people in
images. It remains challenging in practical scenarios, where real-world
scenes are complex and only partial annotations are available, leading to
incomplete or false estimations. In this work, we present a novel framework
to detect the densepose of multiple people in an image. The proposed method,
which we refer to as the Knowledge Transfer Network (KTN), tackles two main
problems: 1) how to refine the image representation to alleviate incomplete
estimations, and 2) how to reduce false estimations caused by low-quality
training labels (i.e., limited annotations and class-imbalanced labels).
Unlike existing works that directly propagate the pyramidal features of
regions for densepose estimation, the KTN refines the pyramidal
representation, simultaneously maintaining feature resolution and
suppressing background pixels, and this strategy results in a substantial
increase in accuracy. Moreover, the KTN enhances 3D-based body parsing with
external knowledge, casting 2D-based body parsers trained on abundant
annotations as a 3D-based body parser through a structural body knowledge
graph. In this way, it significantly reduces the adverse effects of
low-quality annotations. The effectiveness of the KTN is demonstrated by its
superior performance over state-of-the-art methods on the DensePose-COCO
dataset. Extensive ablation studies and experimental results on
representative tasks (e.g., human body segmentation, human part segmentation
and keypoint detection) and two popular densepose estimation pipelines
(i.e., RCNN and fully convolutional frameworks) further indicate the
generalizability of the proposed method.
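To make the two components above concrete, here is a minimal PyTorch-style sketch, not code released with the paper: a pyramid-refinement module that upsamples every level to full resolution and multiplies the fused features by a predicted foreground mask (maintaining resolution while suppressing background), plus a toy knowledge-transfer step that maps 2D body-part logits to 3D surface-part logits through a part-relation matrix standing in for the structural body knowledge graph. The class and function names, shapes, and part counts are all hypothetical assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskGuidedPyramidRefinement(nn.Module):
    """Fuse pyramid levels at full resolution and suppress background pixels.
    Hypothetical stand-in for KTN's pyramidal-representation refinement."""

    def __init__(self, channels: int = 256, num_levels: int = 4):
        super().__init__()
        self.fuse = nn.Conv2d(channels * num_levels, channels, kernel_size=1)
        self.mask_head = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
        )

    def forward(self, pyramid):
        # pyramid: list of (B, C, H_i, W_i) tensors, finest level first.
        target = pyramid[0].shape[-2:]
        ups = [F.interpolate(p, size=target, mode="bilinear",
                             align_corners=False) for p in pyramid]
        fused = self.fuse(torch.cat(ups, dim=1))   # full-resolution features
        fg = torch.sigmoid(self.mask_head(fused))  # foreground probability map
        return fused * fg                          # background suppressed


def transfer_2d_to_3d(logits_2d, part_graph):
    """Map 2D body-part logits to 3D surface-part logits.
    part_graph is a (P3, P2) matrix of 2D-part -> 3D-part relation weights,
    a toy stand-in for the structural body knowledge graph."""
    b, p2, h, w = logits_2d.shape
    flat = logits_2d.reshape(b, p2, h * w)
    out = torch.einsum("qp,bpn->bqn", part_graph, flat)
    return out.reshape(b, -1, h, w)


if __name__ == "__main__":
    feats = [torch.randn(2, 256, 64, 48), torch.randn(2, 256, 32, 24),
             torch.randn(2, 256, 16, 12), torch.randn(2, 256, 8, 6)]
    refined = MaskGuidedPyramidRefinement()(feats)  # (2, 256, 64, 48)
    graph = torch.rand(24, 14)                      # 24 3D parts, 14 2D parts (assumed)
    graph = graph / graph.sum(dim=1, keepdim=True)  # row-normalize weights
    logits3d = transfer_2d_to_3d(torch.randn(2, 14, 64, 48), graph)
    print(refined.shape, logits3d.shape)            # sanity check
```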
Related papers
- Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views [10.944692719150071]
We propose a novel 3D brain segmentation approach using complementary 2D diffusion models.
Our goal is to achieve reliable segmentation quality without requiring complete labels for each individual subject.
arXiv Detail & Related papers (2024-07-17T06:14:53Z) - 2D Feature Distillation for Weakly- and Semi-Supervised 3D Semantic
Segmentation [92.17700318483745]
We propose an image-guidance network (IGNet) that builds upon the idea of distilling high-level feature information from a domain-adapted, synthetically trained 2D semantic segmentation network.
IGNet achieves state-of-the-art results for weakly-supervised LiDAR semantic segmentation on ScribbleKITTI, reaching up to 98% of the relative performance of fully supervised training with only 8% labeled points.
arXiv Detail & Related papers (2023-11-27T07:57:29Z) - PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that robustly estimates 3D pose by learning orientations only.
PONet estimates the 3D orientation of body limbs, taking advantage of local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z) - Weakly-supervised Cross-view 3D Human Pose Estimation [16.045255544594625]
We propose a simple yet effective pipeline for weakly-supervised cross-view 3D human pose estimation.
Our method can achieve state-of-the-art performance in a weakly-supervised manner.
We evaluate our method on the standard benchmark dataset, Human3.6M.
arXiv Detail & Related papers (2021-05-23T08:16:25Z) - Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection that incorporates the depth difference into the projection function to derive per-joint scale variants (a perspective-scaling sketch follows this list).
arXiv Detail & Related papers (2020-10-27T03:31:35Z) - Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people from an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z) - Learning 3D Human Shape and Pose from Dense Body Parts [117.46290013548533]
We propose a Decompose-and-aggregate Network (DaNet) to learn 3D human shape and pose from dense correspondences of body parts.
Messages from local streams are aggregated to enhance robust prediction of the rotation-based poses.
Our method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW.
arXiv Detail & Related papers (2019-12-31T15:09:51Z)