3D Human Shape and Pose from a Single Low-Resolution Image with
Self-Supervised Learning
- URL: http://arxiv.org/abs/2007.13666v2
- Date: Sun, 9 Aug 2020 17:22:43 GMT
- Title: 3D Human Shape and Pose from a Single Low-Resolution Image with
Self-Supervised Learning
- Authors: Xiangyu Xu, Hao Chen, Francesc Moreno-Noguer, Laszlo A. Jeni, Fernando
De la Torre
- Abstract summary: Existing deep learning methods for 3D human shape and pose estimation rely on relatively high-resolution input images.
We propose RSC-Net, which consists of a Resolution-aware network, a Self-supervision loss, and a Contrastive learning scheme.
We show that both of these new training losses provide robustness when learning 3D shape and pose in a weakly-supervised manner.
- Score: 105.49950571267715
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D human shape and pose estimation from monocular images has been an active
area of research in computer vision, having a substantial impact on the
development of new applications, from activity recognition to creating virtual
avatars. Existing deep learning methods for 3D human shape and pose estimation
rely on relatively high-resolution input images; however, high-resolution
visual content is often unavailable in practical scenarios such as video
surveillance and sports broadcasting. Low-resolution images in real scenarios
vary widely in size, and a model trained at one resolution typically does not
degrade gracefully across resolutions. Two common workarounds are to apply
super-resolution techniques to the input images, which may introduce visual
artifacts, or to train a separate model for each resolution, which is
impractical in many realistic applications. To address these issues, this
paper proposes a novel algorithm called RSC-Net, which consists of a
Resolution-aware network, a Self-supervision loss, and a Contrastive learning
scheme. The proposed network is able to learn the 3D body shape and pose across
different resolutions with a single model. The self-supervision loss encourages
scale-consistency of the output, and the contrastive learning scheme enforces
scale-consistency of the deep features. We show that both of these training
losses provide robustness when learning 3D shape and pose in a
weakly-supervised manner. Extensive experiments demonstrate that RSC-Net
consistently outperforms state-of-the-art methods on challenging
low-resolution images.
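The abstract names two training signals: a self-supervision loss that ties the predictions for low-resolution inputs to the prediction for the high-resolution input, and a contrastive loss that pulls together deep features of the same image at different resolutions. No code accompanies this page, so the following is only a minimal PyTorch sketch of how such losses might be written; the `model` interface, the scale set, the temperature, and the use of InfoNCE over the batch are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def scale_consistency_losses(model, image, scales=(1.0, 0.5, 0.25), temperature=0.1):
    """Illustrative sketch of the two RSC-Net-style training signals:
    a self-supervision loss on the outputs and a contrastive loss on
    the deep features, both across resolutions of the same image.

    `model` is assumed (hypothetically) to map an image of shape
    (B, C, H, W) to (params, feature): 3D shape/pose parameters and
    an embedding vector.
    """
    params, feats = [], []
    for s in scales:
        # Downsample the same image to simulate a lower resolution.
        x = image if s == 1.0 else F.interpolate(
            image, scale_factor=s, mode='bilinear', align_corners=False)
        p, f = model(x)
        params.append(p)
        feats.append(F.normalize(f, dim=-1))

    # Self-supervision: outputs at lower resolutions should match the
    # (detached) highest-resolution prediction.
    target = params[0].detach()
    self_sup = sum(F.mse_loss(p, target) for p in params[1:])

    # Contrastive term: features of the same image at different
    # resolutions are positives; other images in the batch serve as
    # negatives (InfoNCE over the batch dimension).
    contrastive = 0.0
    anchor = feats[0]                              # (B, D)
    for f in feats[1:]:
        logits = anchor @ f.t() / temperature      # (B, B) similarities
        labels = torch.arange(anchor.size(0), device=anchor.device)
        contrastive = contrastive + F.cross_entropy(logits, labels)

    return self_sup, contrastive
```

In a full training loop these two terms would presumably be weighted and combined with the usual weakly-supervised reprojection losses on keypoints and silhouettes.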
Related papers
- Markerless Multi-view 3D Human Pose Estimation: a survey [0.49157446832511503]
3D human pose estimation aims to reconstruct the human skeleton of all individuals in a scene by detecting several body joints.
No method is yet capable of solving all the challenges associated with 3D pose reconstruction.
Further research is still required to develop an approach that quickly infers a highly accurate 3D pose at acceptable computational cost.
arXiv Detail & Related papers (2024-07-04T10:44:35Z)
- 3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation [4.415086501328683]
We address the problem in a new and practical setting: 3D shape is exploited during training, while testing remains purely image-based.
We propose a novel contrastive knowledge distillation framework that effectively transfers 3D-augmented image representations from a multi-modal model to an image-based model (see the sketch after this related-papers list).
We report state-of-the-art results, outperforming existing category-agnostic image-based methods by a large margin.
arXiv Detail & Related papers (2022-06-02T16:46:18Z)
- 3D Human Pose, Shape and Texture from Low-Resolution Images and Videos [107.36352212367179]
We propose RSC-Net, which consists of a Resolution-aware network, a Self-supervision loss, and a Contrastive learning scheme.
The proposed method learns 3D body pose and shape across different resolutions with a single model.
We extend the RSC-Net to handle low-resolution videos and apply it to reconstruct textured 3D pedestrians from low-resolution input.
arXiv Detail & Related papers (2021-03-11T06:52:12Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people from an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- Learning Pose-invariant 3D Object Reconstruction from Single-view Images [61.98279201609436]
In this paper, we explore a more realistic setup of learning 3D shape from only single-view images.
The major difficulty lies in the insufficient constraints provided by single-view images.
We propose an effective adversarial domain confusion method to learn pose-disentangled compact shape space.
arXiv Detail & Related papers (2020-04-03T02:47:35Z)
- Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild [101.70320427145388]
We propose a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data.
We evaluate our proposed approach on two large-scale datasets.
arXiv Detail & Related papers (2020-03-17T08:47:16Z)
- Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations [73.11883464562895]
We propose a new architecture that facilitates unsupervised, or lightly supervised, learning.
We demonstrate the method by learning 3D human pose and shape from unpaired and unannotated images.
While we present results for modeling humans, our formulation is general and can be applied to other vision problems.
arXiv Detail & Related papers (2020-01-06T14:54:00Z)
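The contrastive knowledge distillation entry above transfers a 3D-augmented multi-modal representation to an image-only student. As a rough, hypothetical illustration of that idea (not the authors' code), contrastive distillation is often written as an InfoNCE loss that aligns each student image embedding with the frozen teacher embedding of the same instance:

```python
import torch
import torch.nn.functional as F

def contrastive_distillation_loss(student_feats, teacher_feats, temperature=0.07):
    """InfoNCE-style sketch of contrastive knowledge distillation:
    each image's student embedding should be closest to the frozen
    multi-modal teacher embedding of the same instance.

    student_feats, teacher_feats: (B, D) tensors; the teacher is
    detached so no gradients flow into it.
    """
    s = F.normalize(student_feats, dim=-1)
    t = F.normalize(teacher_feats.detach(), dim=-1)
    logits = s @ t.t() / temperature          # (B, B) similarities
    labels = torch.arange(s.size(0), device=s.device)
    return F.cross_entropy(logits, labels)
```

Detaching the teacher ensures that only the image-based student is updated, which matches the stated setting of image-only testing.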