HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images
- URL: http://arxiv.org/abs/2311.15672v2
- Date: Sun, 31 Mar 2024 09:10:24 GMT
- Title: HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images
- Authors: Xihe Yang, Xingyu Chen, Daiheng Gao, Shaohui Wang, Xiaoguang Han, Baoyuan Wang,
- Abstract summary: We study the reconstruction of human avatars from a few-shot unconstrained photo album.
For handling dynamic data, we integrate a skinning mechanism with deep marching tetrahedra.
Our framework, called HaveFun, can undertake avatar reconstruction, rendering, and animation.
- Score: 33.298962236215964
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: As for human avatar reconstruction, contemporary techniques commonly necessitate the acquisition of costly data and struggle to achieve satisfactory results from a small number of casual images. In this paper, we investigate this task from a few-shot unconstrained photo album. The reconstruction of human avatars from such data sources is challenging because of limited data amount and dynamic articulated poses. For handling dynamic data, we integrate a skinning mechanism with deep marching tetrahedra (DMTet) to form a drivable tetrahedral representation, which drives arbitrary mesh topologies generated by the DMTet for the adaptation of unconstrained images. To effectively mine instructive information from few-shot data, we devise a two-phase optimization method with few-shot reference and few-shot guidance. The former focuses on aligning avatar identity with reference images, while the latter aims to generate plausible appearances for unseen regions. Overall, our framework, called HaveFun, can undertake avatar reconstruction, rendering, and animation. Extensive experiments on our developed benchmarks demonstrate that HaveFun exhibits substantially superior performance in reconstructing the human body and hand. Project website: https://seanchenxy.github.io/HaveFunWeb/.
Related papers
- Human Body Restoration with One-Step Diffusion Model and A New Benchmark [74.66514054623669]
We propose a high-quality dataset automated cropping and filtering (HQ-ACF) pipeline.
This pipeline leverages existing object detection datasets and other unlabeled images to automatically crop and filter high-quality human images.
We also propose emphOSDHuman, a novel one-step diffusion model for human body restoration.
arXiv Detail & Related papers (2025-02-03T14:48:40Z) - WonderHuman: Hallucinating Unseen Parts in Dynamic 3D Human Reconstruction [51.22641018932625]
We present WonderHuman to reconstruct dynamic human avatars from a monocular video for high-fidelity novel view synthesis.
Our method achieves SOTA performance in producing photorealistic renderings from the given monocular video.
arXiv Detail & Related papers (2025-02-03T04:43:41Z) - Generalizable and Animatable Gaussian Head Avatar [50.34788590904843]
We propose Generalizable and Animatable Gaussian head Avatar (GAGAvatar) for one-shot animatable head avatar reconstruction.
We generate the parameters of 3D Gaussians from a single image in a single forward pass.
Our method exhibits superior performance compared to previous methods in terms of reconstruction quality and expression accuracy.
arXiv Detail & Related papers (2024-10-10T14:29:00Z) - MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space [25.24509617548819]
We introduce a novel framework for 3D human avatar generation and personalization, leveraging text prompts.
Key innovations are aimed at overcoming the challenges in photo-realistic avatar synthesis.
arXiv Detail & Related papers (2024-04-01T17:59:11Z) - Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images.
Our avatars learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while inferring full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z) - MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using
Differentiable Shading [3.2586340344073927]
MoSAR is a method for 3D avatar generation from monocular images.
We propose a semi-supervised training scheme that improves generalization by learning from both light stage and in-the-wild datasets.
We also introduce a new dataset, named FFHQ-UV-Intrinsics, the first public dataset providing intrinsic face attributes at scale.
arXiv Detail & Related papers (2023-12-20T15:12:53Z) - NOFA: NeRF-based One-shot Facial Avatar Reconstruction [45.11455702291703]
3D facial avatar reconstruction has been a significant research topic in computer graphics and computer vision.
We propose a one-shot 3D facial avatar reconstruction framework that only requires a single source image to reconstruct a high-fidelity 3D facial avatar.
arXiv Detail & Related papers (2023-07-07T07:58:18Z) - Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z) - AvatarCap: Animatable Avatar Conditioned Monocular Human Volumetric
Capture [36.10436374741757]
AvatarCap is a novel framework that introduces animatable avatars into the capture pipeline for high-fidelity reconstruction in both visible and invisible regions.
Our method integrates information from both the image observation and the avatar prior, and accordingly recon-structs high-fidelity 3D textured models with dynamic details regardless of the visibility.
arXiv Detail & Related papers (2022-07-05T13:21:01Z) - MVP-Human Dataset for 3D Human Avatar Reconstruction from Unconstrained
Frames [59.37430649840777]
We present 3D Avatar Reconstruction in the wild (ARwild), which first reconstructs the implicit skinning fields in a multi-level manner.
We contribute a large-scale dataset, MVP-Human, which contains 400 subjects, each of which has 15 scans in different poses.
Overall, benefits from the specific network architecture and the diverse data, the trained model enables 3D avatar reconstruction from unconstrained frames.
arXiv Detail & Related papers (2022-04-24T03:57:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.