Body Size and Depth Disambiguation in Multi-Person Reconstruction from
Single Images
- URL: http://arxiv.org/abs/2111.01884v1
- Date: Tue, 2 Nov 2021 20:42:41 GMT
- Title: Body Size and Depth Disambiguation in Multi-Person Reconstruction from
Single Images
- Authors: Nicolas Ugrinovic, Adria Ruiz, Antonio Agudo, Alberto Sanfeliu,
Francesc Moreno-Noguer
- Abstract summary: We address the problem of multi-person 3D body pose and shape estimation from a single image.
We devise a novel optimization scheme that learns the appropriate body scale and relative camera pose, by enforcing the feet of all people to remain on the ground floor.
A thorough evaluation on MuPoTS-3D and 3DPW datasets demonstrates that our approach is able to robustly estimate the body translation and shape of multiple people while retrieving their spatial arrangement.
- Score: 44.96633481495911
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We address the problem of multi-person 3D body pose and shape estimation from
a single image. While this problem can be addressed by applying single-person
approaches multiple times for the same scene, recent works have shown the
advantages of building upon deep architectures that simultaneously reason about
all people in the scene in a holistic manner by enforcing, e.g., depth order
constraints or minimizing interpenetration among reconstructed bodies. However,
existing approaches are still unable to capture the size variability of people
caused by the inherent body scale and depth ambiguity. In this work, we tackle
this challenge by devising a novel optimization scheme that learns the
appropriate body scale and relative camera pose, by enforcing the feet of all
people to remain on the ground floor. A thorough evaluation on MuPoTS-3D and
3DPW datasets demonstrates that our approach is able to robustly estimate the
body translation and shape of multiple people while retrieving their spatial
arrangement, consistently improving on the current state of the art, especially
in scenes with people of very different heights.
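As an illustration of the ground-floor constraint above, one can penalize the squared point-to-plane distance of every detected foot joint to an estimated ground plane. The function name and plane parameterization below are assumptions for this sketch, not the authors' exact optimization scheme:

```python
import numpy as np

def ground_floor_loss(feet_xyz, plane_n, plane_d):
    """Mean squared distance of foot joints to the plane n·x + d = 0.

    feet_xyz: (N, 3) array of foot-joint positions in camera coordinates.
    plane_n:  (3,) unit normal of the estimated ground plane.
    plane_d:  scalar plane offset.
    """
    dist = feet_xyz @ plane_n + plane_d  # signed point-to-plane distances
    return float(np.mean(dist ** 2))

# Feet exactly on the plane z = 0 incur zero penalty.
flat = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 0.0]])
print(ground_floor_loss(flat, np.array([0.0, 0.0, 1.0]), 0.0))  # 0.0
```

In a setting like the paper's, minimizing such a term jointly over per-person scale and translation (and the plane itself) helps resolve the size/depth ambiguity, since a small distant person and a large nearby person no longer yield the same feet-on-floor configuration.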
Related papers
- AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos [31.904839609743448]
Existing multi-view methods often face challenges in estimating the 3D pose and shape of multiple closely interacting people.
We propose a novel method leveraging the personalized implicit neural avatar of each individual as a prior.
Our experimental results demonstrate state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2024-08-04T18:41:35Z)
- Scene-Aware 3D Multi-Human Motion Capture from a Single Camera [83.06768487435818]
We consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera.
We leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks.
In particular, we estimate the scene depth and unique person scale from normalized disparity predictions using the 2D body joints and joint angles.
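A toy sketch of this kind of depth/scale recovery under a pinhole model: a bone of known metric length gives an anchor depth per person, and a least-squares affine fit maps normalized (scale- and shift-invariant) disparity to metric inverse depth. All names and the exact formulation are illustrative assumptions, not the paper's method:

```python
import numpy as np

def person_depth_from_bone(f, bone_len_3d, bone_len_2d):
    # Pinhole approximation: a roughly fronto-parallel bone of metric
    # length L imaged with pixel length l lies at depth z ≈ f * L / l.
    return f * bone_len_3d / bone_len_2d

def align_disparity(disp_samples, depth_anchors):
    # Normalized disparity d is only defined up to an affine map, so
    # solve a*d + b ≈ 1/z in least squares to recover metric scale.
    A = np.stack([disp_samples, np.ones_like(disp_samples)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, 1.0 / depth_anchors, rcond=None)
    return a, b
```

With the anchor depths from `person_depth_from_bone` as `depth_anchors`, the fitted `(a, b)` turns the whole normalized disparity map into metric depth for the scene.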
arXiv Detail & Related papers (2023-01-12T18:01:28Z)
- Single-view 3D Body and Cloth Reconstruction under Complex Poses [37.86174829271747]
We extend existing implicit function-based models to deal with images of humans with arbitrary poses and self-occluded limbs.
We learn an implicit function that maps the input image to a 3D body shape with a low level of detail.
We then learn a displacement map, conditioned on the smoothed surface, which encodes the high-frequency details of the clothes and body.
arXiv Detail & Related papers (2022-05-09T07:34:06Z)
- Dual networks based 3D Multi-Person Pose Estimation from Monocular Video [42.01876518017639]
Multi-person 3D pose estimation is more challenging than single-person pose estimation.
Existing top-down and bottom-up approaches to pose estimation suffer from detection errors.
We propose the integration of top-down and bottom-up approaches to exploit their strengths.
arXiv Detail & Related papers (2022-05-02T08:53:38Z)
- SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation [46.85865451812981]
We propose a novel system that first regresses a set of 2.5D representations of body parts and then reconstructs the 3D absolute poses based on these 2.5D representations with a depth-aware part association algorithm.
Such a single-shot bottom-up scheme allows the system to better learn and reason about the inter-person depth relationship, improving both 3D and 2D pose estimation.
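A minimal sketch of a depth-aware association: process people near-to-far (closer people are usually less occluded) and let each root joint greedily claim the part candidate with the lowest combined image-distance and depth-gap cost. The names, cost function, and greedy scheme here are illustrative assumptions; SMAP's actual learned association is more involved:

```python
import math

def associate_parts(roots, candidates, w_depth=1.0):
    """Greedy near-to-far assignment of part candidates to root joints.

    roots, candidates: lists of (x, y, z) tuples; z is estimated depth.
    Returns {root_index: candidate_index}.
    """
    claimed, assignment = set(), {}
    for i in sorted(range(len(roots)), key=lambda k: roots[k][2]):
        rx, ry, rz = roots[i]
        best, best_cost = None, float("inf")
        for j, (cx, cy, cz) in enumerate(candidates):
            if j in claimed:
                continue
            # 2D image distance plus a penalty for inconsistent depth.
            cost = math.hypot(cx - rx, cy - ry) + w_depth * abs(cz - rz)
            if cost < best_cost:
                best, best_cost = j, cost
        if best is not None:
            claimed.add(best)
            assignment[i] = best
    return assignment
```

The depth term is what makes the grouping "depth-aware": two people overlapping in the image are still separable when their estimated root depths differ.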
arXiv Detail & Related papers (2020-08-26T09:56:07Z)
- HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation [54.23770284299979]
This paper introduces a novel form of supervision - Hierarchical Multi-person Ordinal Relations (HMOR)
HMOR encodes interaction information as the ordinal relations of depths and angles hierarchically.
An integrated top-down model is designed to leverage these ordinal relations in the learning process.
The proposed method significantly outperforms state-of-the-art methods on publicly available multi-person 3D pose datasets.
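Such ordinal depth supervision could be written as a pairwise ranking-style penalty that fires whenever the predicted depth ordering of two people contradicts the ground truth. This sketch is an illustrative assumption, not HMOR's exact hierarchical loss (which also covers joint angles and multiple hierarchy levels):

```python
import numpy as np

def ordinal_depth_loss(pred_z, gt_z, margin=0.0):
    # Hinge on every pair (i, j): if ground truth says i is behind j,
    # the prediction should agree by at least `margin`.
    loss, pairs = 0.0, 0
    n = len(pred_z)
    for i in range(n):
        for j in range(i + 1, n):
            sign = float(np.sign(gt_z[i] - gt_z[j]))
            loss += max(0.0, margin - sign * (pred_z[i] - pred_z[j]))
            pairs += 1
    return loss / max(pairs, 1)
```

Because only the ordering matters, this kind of supervision remains usable when absolute depth annotations are unreliable.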
arXiv Detail & Related papers (2020-08-01T07:53:27Z)
- Coherent Reconstruction of Multiple Humans from a Single Image [68.3319089392548]
In this work, we address the problem of multi-person 3D pose estimation from a single image.
A typical regression approach in the top-down setting of this problem would first detect all humans and then reconstruct each of them independently, which is prone to incoherent results such as overlapping reconstructions.
Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.
arXiv Detail & Related papers (2020-06-15T17:51:45Z)
- Learning Pose-invariant 3D Object Reconstruction from Single-view Images [61.98279201609436]
In this paper, we explore a more realistic setup of learning 3D shape from only single-view images.
The major difficulty lies in insufficient constraints that can be provided by single view images.
We propose an effective adversarial domain confusion method to learn pose-disentangled compact shape space.
arXiv Detail & Related papers (2020-04-03T02:47:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.