FOF-X: Towards Real-time Detailed Human Reconstruction from a Single Image
- URL: http://arxiv.org/abs/2412.05961v1
- Date: Sun, 08 Dec 2024 14:46:29 GMT
- Title: FOF-X: Towards Real-time Detailed Human Reconstruction from a Single Image
- Authors: Qiao Feng, Yebin Liu, Yu-Kun Lai, Jingyu Yang, Kun Li
- Abstract summary: We introduce FOF-X for real-time reconstruction of detailed human geometry from a single image.
FOF-X avoids the performance degradation caused by texture and lighting.
We enhance the inter-conversion algorithms between FOF and mesh representations with a Laplacian constraint and an automaton-based discontinuity matcher.
- Score: 68.84221452621674
- Abstract: We introduce FOF-X for real-time reconstruction of detailed human geometry from a single image. Balancing real-time speed against high-quality results is a persistent challenge, mainly due to the high computational demands of existing 3D representations. To address this, we propose Fourier Occupancy Field (FOF), an efficient 3D representation by learning the Fourier series. The core of FOF is to factorize a 3D occupancy field into a 2D vector field, retaining topology and spatial relationships within the 3D domain while facilitating compatibility with 2D convolutional neural networks. Such a representation bridges the gap between 3D and 2D domains, enabling the integration of human parametric models as priors and enhancing the reconstruction robustness. Based on FOF, we design a new reconstruction framework, FOF-X, to avoid the performance degradation caused by texture and lighting. This enables our real-time reconstruction system to better handle the domain gap between training images and real images. Additionally, in FOF-X, we enhance the inter-conversion algorithms between FOF and mesh representations with a Laplacian constraint and an automaton-based discontinuity matcher, improving both quality and robustness. We validate the strengths of our approach on different datasets and real-captured data, where FOF-X achieves new state-of-the-art results. The code will be released for research purposes.
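The core idea of FOF can be made concrete with a small sketch. Along each pixel ray, the 1D occupancy function over depth is approximated by a truncated Fourier series, so the full 3D field collapses into a multi-channel 2D coefficient image. The code below is a minimal NumPy illustration of this encode/decode round trip, not the authors' implementation; the number of terms, the basis ordering, and the function names are assumptions made for clarity.

```python
# Minimal sketch of the Fourier Occupancy Field idea (illustrative, not the
# authors' code). Along each pixel ray, the binary occupancy f(z) on z in
# [-1, 1] is approximated by a truncated Fourier series, so the 3D field is
# stored as a (2N+1)-channel 2D image.
import numpy as np

def encode_fof(occupancy, n_terms=15):
    """occupancy: (H, W, D) binary grid sampled on z in [-1, 1].
    Returns an (H, W, 2*n_terms+1) coefficient image."""
    H, W, D = occupancy.shape
    z = np.linspace(-1.0, 1.0, D)
    basis = [np.ones_like(z)]                      # constant term
    for k in range(1, n_terms + 1):                # cos/sin pairs, period 2
        basis.append(np.cos(np.pi * k * z))
        basis.append(np.sin(np.pi * k * z))
    basis = np.stack(basis, axis=0)                # (C, D)
    dz = 2.0 / (D - 1)
    # Project each ray's occupancy onto the basis (Riemann-sum inner product).
    return np.einsum('hwd,cd->hwc', occupancy.astype(np.float64), basis) * dz

def decode_fof(coeffs, depth_res=128):
    """Reconstruct an approximate (H, W, depth_res) occupancy grid."""
    z = np.linspace(-1.0, 1.0, depth_res)
    n_terms = (coeffs.shape[-1] - 1) // 2
    basis = [0.5 * np.ones_like(z)]                # DC term enters with weight 1/2
    for k in range(1, n_terms + 1):
        basis.append(np.cos(np.pi * k * z))
        basis.append(np.sin(np.pi * k * z))
    basis = np.stack(basis, axis=0)                # (C, D)
    recon = np.einsum('hwc,cd->hwd', coeffs, basis)
    return recon > 0.5                             # threshold back to occupancy
```

Because the coefficient image has a fixed channel count regardless of depth resolution, a 2D network can regress it directly from an input image; the Laplacian constraint and the automaton-based discontinuity matcher used in FOF-X for FOF-mesh inter-conversion are not reflected in this toy sketch.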
Related papers
- Consistency Diffusion Models for Single-Image 3D Reconstruction with Priors [24.086775858948755]
We introduce a pioneering training framework based on diffusion models.
We incorporate 3D structural priors derived from the initial 3D point cloud as a bound term.
We extract and incorporate 2D priors from the single input image, projecting them onto the 3D point cloud to enrich the guidance for diffusion training.
arXiv Detail & Related papers (2025-01-28T06:21:57Z)
- Improving Geometry in Sparse-View 3DGS via Reprojection-based DoF Separation [35.17953057142724]
Recent learning-based Multi-View Stereo models have demonstrated state-of-the-art performance in sparse-view 3D reconstruction.
We propose reprojection-based DoF separation, a method distinguishing positional DoFs in terms of uncertainty.
We show that separating the positional DoFs of Gaussians and applying targeted constraints effectively suppresses geometric artifacts.
arXiv Detail & Related papers (2024-12-19T06:39:28Z)
- DiHuR: Diffusion-Guided Generalizable Human Reconstruction [51.31232435994026]
We introduce DiHuR, a Diffusion-guided model for generalizable Human 3D Reconstruction and view synthesis from sparse, minimally overlapping images.
Our method integrates two key priors in a coherent manner: the prior from generalizable feed-forward models and the 2D diffusion prior, and it requires only multi-view image training, without 3D supervision.
arXiv Detail & Related papers (2024-11-16T03:52:23Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [59.13757801286343]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce the FILP-3D framework with two novel components: the Redundant Feature Eliminator (RFE) for feature space misalignment and the Spatial Noise Compensator (SNC) for significant noise.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- 2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images [12.076881343401329]
We present a novel two-stage algorithm, 2S-UDF, for learning a high-quality UDF from multi-view images.
The results indicate superior performance over other UDF learning techniques in both quantitative metrics and visual quality.
arXiv Detail & Related papers (2023-03-27T16:35:28Z)
- Factor Fields: A Unified Framework for Neural Fields and Beyond [50.29013417187368]
We present Factor Fields, a novel framework for modeling and representing signals.
Our framework accommodates several recent signal representations including NeRF, Plenoxels, EG3D, Instant-NGP, and TensoRF.
Our representation achieves better image approximation quality on 2D image regression tasks, higher geometric quality when reconstructing 3D signed distance fields, and higher compactness for radiance field reconstruction tasks.
arXiv Detail & Related papers (2023-02-02T17:06:50Z)
- Fast-SNARF: A Fast Deformer for Articulated Neural Fields [92.68788512596254]
We propose a new articulation module for neural fields, Fast-SNARF, which finds accurate correspondences between canonical space and posed space.
Fast-SNARF is a drop-in replacement for our previous work, SNARF, while significantly improving its computational efficiency.
Because learning of deformation maps is a crucial component in many 3D human avatar methods, we believe that this work represents a significant step towards the practical creation of 3D virtual humans.
arXiv Detail & Related papers (2022-11-28T17:55:34Z)
- FOF: Learning Fourier Occupancy Field for Monocular Real-time Human Reconstruction [73.85709132666626]
Existing representations, such as parametric models, voxel grids, meshes and implicit neural representations, have difficulties achieving high-quality results and real-time speed at the same time.
We propose Fourier Occupancy Field (FOF), a novel, powerful, efficient and flexible 3D representation for monocular real-time and accurate human reconstruction.
A FOF can be stored as a multi-channel image, which is compatible with 2D convolutional neural networks and can bridge the gap between 3D and 2D images (see the sketch after this list).
arXiv Detail & Related papers (2022-06-05T14:45:02Z)
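Since a FOF is just a multi-channel image, the reconstruction network can be an ordinary image-to-image CNN. The toy module below illustrates this compatibility; the layer sizes, channel count, and class name are illustrative assumptions and do not reflect the paper's actual architecture.

```python
# Toy sketch (assumed architecture, not the paper's network): an RGB image is
# mapped to a (2N+1)-channel FOF image by a small fully convolutional head.
import torch
import torch.nn as nn

class ToyFOFRegressor(nn.Module):
    def __init__(self, n_terms=15):
        super().__init__()
        out_channels = 2 * n_terms + 1             # Fourier coefficients per pixel
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, out_channels, 1),        # per-pixel coefficient image
        )

    def forward(self, rgb):                        # rgb: (B, 3, H, W)
        return self.net(rgb)                       # (B, 2N+1, H, W) FOF image

# Example: fof = ToyFOFRegressor()(torch.rand(1, 3, 256, 256))
```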