Related papers: Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images

Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images

URL: http://arxiv.org/abs/2106.11536v1
Date: Tue, 22 Jun 2021 04:26:11 GMT
Title: Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images
Authors: Liguo Jiang, Miaopeng Li, Jianjie Zhang, Congyi Wang, Juntao Ye, Xinguo Liu, Jinxiang Chai
Abstract summary: We introduce an approach that accurately reconstructs 3D human poses and detailed 3D full-body geometric models from single images in realtime. Key idea of our approach is a novel end-to-end multi-task deep learning framework that uses single images to predict five outputs simultaneously. We show the system advances the frontier of 3D human body and pose reconstruction from single images by quantitative evaluations and comparisons with state-of-the-art methods.
Score: 5.775625085664381
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce an approach that accurately reconstructs 3D human poses and detailed 3D full-body geometric models from single images in realtime. The key idea of our approach is a novel end-to-end multi-task deep learning framework that uses single images to predict five outputs simultaneously: foreground segmentation mask, 2D joints positions, semantic body partitions, 3D part orientations and uv coordinates (uv map). The multi-task network architecture not only generates more visual cues for reconstruction, but also makes each individual prediction more accurate. The CNN regressor is further combined with an optimization based algorithm for accurate kinematic pose reconstruction and full-body shape modeling. We show that the realtime reconstruction reaches accurate fitting that has not been seen before, especially for wild images. We demonstrate the results of our realtime 3D pose and human body reconstruction system on various challenging in-the-wild videos. We show the system advances the frontier of 3D human body and pose reconstruction from single images by quantitative evaluations and comparisons with state-of-the-art methods.

Related papers

PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images [23.745241278910946]
PF-LHM is a large human reconstruction model that generates high-quality 3D avatars in seconds from one or multiple casually captured pose-free images.<n>Our method unifies single- and multi-image 3D human reconstruction, achieving high-fidelity and animatable 3D human avatars without requiring camera and human pose annotations.
arXiv Detail & Related papers (2025-06-16T17:59:56Z)
Unify3D: An Augmented Holistic End-to-end Monocular 3D Human Reconstruction via Anatomy Shaping and Twins Negotiating [4.708237200844732]
This paper introduces a novel paradigm that treats human reconstruction as a holistic process. We propose a novel reconstruction framework consisting of two core components: the Anatomy Shaping Extraction module and the Twins Negotiating Reconstruction U-Net. We also propose a Comic Data Augmentation strategy and construct 15k+ 3D human scans to bolster model performance in more complex case input.
arXiv Detail & Related papers (2025-04-25T09:49:23Z)
HORT: Monocular Hand-held Objects Reconstruction with Transformers [61.36376511119355]
Reconstructing hand-held objects in 3D from monocular images is a significant challenge in computer vision. We propose a transformer-based model to efficiently reconstruct dense 3D point clouds of hand-held objects. Our method achieves state-of-the-art accuracy with much faster inference speed, while generalizing well to in-the-wild images.
arXiv Detail & Related papers (2025-03-27T09:45:09Z)
EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild [79.71523320368388]
Our work aims to reconstruct hand-object interactions from a single-view image. We first design a novel pipeline to estimate the underlying hand pose and object shape. With the initial reconstruction, we employ a prior-guided optimization scheme.
arXiv Detail & Related papers (2024-11-21T16:33:35Z)
Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction [51.3632308129838]
We present Total-Decom, a novel method for decomposed 3D reconstruction with minimal human interaction. Our approach seamlessly integrates the Segment Anything Model (SAM) with hybrid implicit-explicit neural surface representations and a mesh-based region-growing technique for accurate 3D object decomposition. We extensively evaluate our method on benchmark datasets and demonstrate its potential for downstream applications, such as animation and scene editing.
arXiv Detail & Related papers (2024-03-28T11:12:33Z)
SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion [35.73448283467723]
SiTH is a novel pipeline that integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow. We employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images. For the latter, we leverage skinned body meshes as guidance to recover full-body texture meshes from the input and back-view images.
arXiv Detail & Related papers (2023-11-27T14:22:07Z)
Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video [23.93644678238666]
We propose a Pose and Mesh Co-Evolution network (PMCE) to recover 3D human motion from a video. The proposed PMCE outperforms previous state-of-the-art methods in terms of both per-frame accuracy and temporal consistency.
arXiv Detail & Related papers (2023-08-20T16:03:21Z)
Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble [72.3681707384754]
Hi-LASSIE performs 3D articulated reconstruction from only 20-30 online images in the wild without any user-defined shape or skeleton templates. First, instead of relying on a manually annotated 3D skeleton, we automatically estimate a class-specific skeleton from the selected reference image. Second, we improve the shape reconstructions with novel instance-specific optimization strategies that allow reconstructions to faithful fit on each instance.
arXiv Detail & Related papers (2022-12-21T14:31:33Z)
Learning Temporal 3D Human Pose Estimation with Pseudo-Labels [3.0954251281114513]
We present a simple, yet effective, approach for self-supervised 3D human pose estimation. We rely on triangulating 2D body pose estimates of a multiple-view camera system. Our method achieves state-of-the-art performance in the Human3.6M and MPI-INF-3DHP benchmarks.
arXiv Detail & Related papers (2021-10-14T17:40:45Z)
Fast-GANFIT: Generative Adversarial Network for High Fidelity 3D Face Reconstruction [76.1612334630256]
We harness the power of Generative Adversarial Networks (GANs) and Deep Convolutional Neural Networks (DCNNs) to reconstruct the facial texture and shape from single images. We demonstrate excellent results in photorealistic and identity preserving 3D face reconstructions and achieve for the first time, facial texture reconstruction with high-frequency details.
arXiv Detail & Related papers (2021-05-16T16:35:44Z)
Multi-person Implicit Reconstruction from a Single Image [37.6877421030774]
We present a new end-to-end learning framework to obtain detailed and spatially coherent reconstructions of multiple people from a single image. Existing multi-person methods suffer from two main drawbacks: they are often model-based and cannot capture accurate 3D models of people with loose clothing and hair.
arXiv Detail & Related papers (2021-04-19T13:21:55Z)
A Divide et Impera Approach for 3D Shape Reconstruction from Multiple Views [49.03830902235915]
Estimating the 3D shape of an object from a single or multiple images has gained popularity thanks to the recent breakthroughs powered by deep learning. This paper proposes to rely on viewpoint variant reconstructions by merging the visible information from the given views. To validate the proposed method, we perform a comprehensive evaluation on the ShapeNet reference benchmark in terms of relative pose estimation and 3D shape reconstruction.
arXiv Detail & Related papers (2020-11-17T09:59:32Z)
Coherent Reconstruction of Multiple Humans from a Single Image [68.3319089392548]
In this work, we address the problem of multi-person 3D pose estimation from a single image. A typical regression approach in the top-down setting of this problem would first detect all humans and then reconstruct each one of them independently. Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.
arXiv Detail & Related papers (2020-06-15T17:51:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.