2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images
- URL: http://arxiv.org/abs/2303.15368v3
- Date: Tue, 16 Apr 2024 14:08:03 GMT
- Title: 2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images
- Authors: Junkai Deng, Fei Hou, Xuhui Chen, Wencheng Wang, Ying He,
- Abstract summary: We present a novel two-stage algorithm, 2S-UDF, for learning a high-quality UDF from multi-view images.
In both quantitative metrics and visual quality, the results indicate our superior performance over other UDF learning techniques.
- Score: 12.076881343401329
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, building on the foundation of neural radiance field, various techniques have emerged to learn unsigned distance fields (UDF) to reconstruct 3D non-watertight models from multi-view images. Yet, a central challenge in UDF-based volume rendering is formulating a proper way to convert unsigned distance values into volume density, ensuring that the resulting weight function remains unbiased and sensitive to occlusions. Falling short on these requirements often results in incorrect topology or large reconstruction errors in resulting models. This paper addresses this challenge by presenting a novel two-stage algorithm, 2S-UDF, for learning a high-quality UDF from multi-view images. Initially, the method applies an easily trainable density function that, while slightly biased and transparent, aids in coarse reconstruction. The subsequent stage then refines the geometry and appearance of the object to achieve a high-quality reconstruction by directly adjusting the weight function used in volume rendering to ensure that it is unbiased and occlusion-aware. Decoupling density and weight in two stages makes our training stable and robust, distinguishing our technique from existing UDF learning approaches. Evaluations on the DeepFashion3D, DTU, and BlendedMVS datasets validate the robustness and effectiveness of our proposed approach. In both quantitative metrics and visual quality, the results indicate our superior performance over other UDF learning techniques in reconstructing 3D non-watertight models from multi-view images. Our code is available at https://bitbucket.org/jkdeng/2sudf/.
Related papers
- DCUDF2: Improving Efficiency and Accuracy in Extracting Zero Level Sets from Unsigned Distance Fields [11.397415082340482]
Unsigned distance fields (UDFs) allow for the representation of models with complex topologies, but accurate zero level sets from these fields poses significant challenges.
We introduceDF2, an enhancement over the current state-of-the-art method--for extracting zero level sets from UDFs.
Our approach utilizes an accuracy-aware loss function, enhanced with self-adaptive weights, to improve geometric quality significantly.
arXiv Detail & Related papers (2024-08-30T13:31:15Z) - Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
arXiv Detail & Related papers (2024-08-26T04:56:41Z) - FNOSeg3D: Resolution-Robust 3D Image Segmentation with Fourier Neural
Operator [4.48473804240016]
We introduce FNOSeg3D, a 3D segmentation model robust to training image resolution based on the Fourier neural operator (FNO)
When tested on the BraTS'19 dataset, it achieved superior robustness to training image resolution than other tested models with less than 1% of their model parameters.
arXiv Detail & Related papers (2023-10-05T19:58:36Z) - Sparse3D: Distilling Multiview-Consistent Diffusion for Object
Reconstruction from Sparse Views [47.215089338101066]
We present Sparse3D, a novel 3D reconstruction method tailored for sparse view inputs.
Our approach distills robust priors from a multiview-consistent diffusion model to refine a neural radiance field.
By tapping into 2D priors from powerful image diffusion models, our integrated model consistently delivers high-quality results.
arXiv Detail & Related papers (2023-08-27T11:52:00Z) - NeuralUDF: Learning Unsigned Distance Fields for Multi-view
Reconstruction of Surfaces with Arbitrary Topologies [87.06532943371575]
We present a novel method, called NeuralUDF, for reconstructing surfaces with arbitrary topologies from 2D images via volume rendering.
In this paper, we propose to represent surfaces as the Unsigned Distance Function (UDF) and develop a new volume rendering scheme to learn the neural UDF representation.
arXiv Detail & Related papers (2022-11-25T15:21:45Z) - NeuralReshaper: Single-image Human-body Retouching with Deep Neural
Networks [50.40798258968408]
We present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.
Our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image.
To deal with the lack-of-data problem that no paired data exist, we introduce a novel self-supervised strategy to train our network.
arXiv Detail & Related papers (2022-03-20T09:02:13Z) - 3D Human Pose, Shape and Texture from Low-Resolution Images and Videos [107.36352212367179]
We propose RSC-Net, which consists of a Resolution-aware network, a Self-supervision loss, and a Contrastive learning scheme.
The proposed method is able to learn 3D body pose and shape across different resolutions with one single model.
We extend the RSC-Net to handle low-resolution videos and apply it to reconstruct textured 3D pedestrians from low-resolution input.
arXiv Detail & Related papers (2021-03-11T06:52:12Z) - Recurrent Multi-view Alignment Network for Unsupervised Surface
Registration [79.72086524370819]
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
arXiv Detail & Related papers (2020-11-24T14:22:42Z) - Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image.
We rely on a recently introduced, expressivefull body statistical 3d human model, GHUM, trained end-to-end.
Central to our methodology, is a learning to learn and optimize approach, referred to as HUmanNeural Descent (HUND), which avoids both second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.