SOAR: Self-Occluded Avatar Recovery from a Single Video In the Wild
- URL: http://arxiv.org/abs/2410.23800v1
- Date: Thu, 31 Oct 2024 10:35:59 GMT
- Title: SOAR: Self-Occluded Avatar Recovery from a Single Video In the Wild
- Authors: Zhuoyang Pan, Angjoo Kanazawa, Hang Gao
- Abstract summary: Self-occlusion is common when capturing people in the wild, where performers do not follow predefined motion scripts.
We introduce Self-Occluded Avatar Recovery (SOAR), a method for complete human reconstruction from partial observations where parts of the body are entirely unobserved.
- Score: 30.728476070389707
- Abstract: Self-occlusion is common when capturing people in the wild, where performers do not follow predefined motion scripts. This challenges existing monocular human reconstruction systems that assume full body visibility. We introduce Self-Occluded Avatar Recovery (SOAR), a method for complete human reconstruction from partial observations where parts of the body are entirely unobserved. SOAR leverages a structural normal prior and a generative diffusion prior to address such an ill-posed reconstruction problem. For the structural normal prior, we model the human with a reposable surfel model with well-defined and easily readable shapes. For the generative diffusion prior, we perform an initial reconstruction and refine it using score distillation. On various benchmarks, we show that SOAR performs favorably against state-of-the-art reconstruction and generation methods, and on par with concurrent works. Additional video results and code are available at https://soar-avatar.github.io/.
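The score-distillation refinement the abstract mentions can be sketched in a few lines. The sketch below is illustrative only, not SOAR's implementation: a hypothetical `dummy_denoiser` stands in for a pretrained diffusion model's noise predictor, a small array stands in for the rendered avatar, and the timestep weighting w(t) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def dummy_denoiser(x_noisy, t):
    # Hypothetical stand-in for a pretrained diffusion model's noise
    # prediction; it treats everything around zero as noise, so the
    # "prior" pulls renders toward zero.
    return x_noisy

def sds_step(render, lr=0.1, t=0.5):
    # Score Distillation Sampling (simplified): noise the render,
    # ask the diffusion prior to predict that noise, and push the
    # render along (predicted noise - true noise).
    eps = rng.standard_normal(render.shape)
    alpha = 1.0 - t
    x_noisy = np.sqrt(alpha) * render + np.sqrt(1.0 - alpha) * eps
    eps_pred = dummy_denoiser(x_noisy, t)
    grad = eps_pred - eps  # SDS gradient (weighting w(t) omitted)
    return render - lr * grad

render = rng.standard_normal((4, 4))  # toy "rendered image" parameters
init_error = np.abs(render).mean()
for _ in range(200):
    render = sds_step(render)
```

In the full method, `render` would be a differentiable rendering of the avatar and the gradient would flow back into its parameters; here the update acts on the pixels directly, which is enough to show the distillation loop shrinking the render toward the prior.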
Related papers
- Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images [57.479339658504685]
"Divide and Fuse" strategy reconstructs human body parts independently before fusing them.
Human Part Parametric Models (HPPM) independently reconstruct the mesh from a few shape and global-location parameters.
A specially designed fusion module seamlessly integrates the reconstructed parts, even when only a few are visible.
arXiv Detail & Related papers (2024-07-12T21:29:11Z)
- Stratified Avatar Generation from Sparse Observations [10.291918304187769]
Estimating 3D full-body avatars from AR/VR devices is essential for creating immersive experiences.
In this paper, we are inspired by the inherent property of the kinematic tree defined in the Skinned Multi-Person Linear (SMPL) model.
We propose a stratified approach to decouple the conventional full-body avatar reconstruction pipeline into two stages.
arXiv Detail & Related papers (2024-05-30T06:25:42Z)
- SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion [35.73448283467723]
SiTH is a novel pipeline that integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow.
We employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images.
For the latter, we leverage skinned body meshes as guidance to recover full-body texture meshes from the input and back-view images.
arXiv Detail & Related papers (2023-11-27T14:22:07Z)
- Humans in 4D: Reconstructing and Tracking Humans with Transformers [72.50856500760352]
We present an approach to reconstruct humans and track them over time.
At the core of our approach, we propose a fully "transformerized" version of a network for human mesh recovery.
This network, HMR 2.0, advances the state of the art and shows the capability to analyze unusual poses that have in the past been difficult to reconstruct from single images.
arXiv Detail & Related papers (2023-05-31T17:59:52Z)
- Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition [40.46674919612935]
We present Vid2Avatar, a method to learn human avatars from monocular in-the-wild videos.
Our method does not require any groundtruth supervision or priors extracted from large datasets of clothed human scans.
It solves the tasks of scene decomposition and surface reconstruction directly in 3D by modeling both the human and the background in the scene jointly.
arXiv Detail & Related papers (2023-02-22T18:59:17Z)
- RealFusion: 360° Reconstruction of Any Object from a Single Image [98.46318529630109]
We consider the problem of reconstructing a full 360° photographic model of an object from a single image.
We take an off-the-shelf conditional image generator based on diffusion and engineer a prompt that encourages it to "dream up" novel views of the object.
arXiv Detail & Related papers (2023-02-21T13:25:35Z)
- Making Reconstruction-based Method Great Again for Video Anomaly Detection [64.19326819088563]
Anomaly detection in videos is a significant yet challenging problem.
Existing reconstruction-based methods rely on old-fashioned convolutional autoencoders.
We propose a new autoencoder model for enhanced consecutive frame reconstruction.
arXiv Detail & Related papers (2023-01-28T01:57:57Z)
- SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video [48.23424267130425]
SelfRecon recovers space-time coherent geometries from a monocular self-rotating human video.
Explicit methods require a predefined template mesh for a given sequence, but such a template is hard to acquire for a specific subject.
Implicit methods support arbitrary topology and have high quality due to continuous geometric representation.
arXiv Detail & Related papers (2022-01-30T11:49:29Z)
- Coherent Reconstruction of Multiple Humans from a Single Image [68.3319089392548]
In this work, we address the problem of multi-person 3D pose estimation from a single image.
A typical regression approach in the top-down setting of this problem would first detect all humans and then reconstruct each one of them independently.
Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.
arXiv Detail & Related papers (2020-06-15T17:51:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.