Efficient Construction of Implicit Surface Models From a Single Image for Motion Generation
- URL: http://arxiv.org/abs/2509.20681v1
- Date: Thu, 25 Sep 2025 02:30:05 GMT
- Title: Efficient Construction of Implicit Surface Models From a Single Image for Motion Generation
- Authors: Wei-Teng Chu, Tianyi Zhang, Matthew Johnson-Roberson, Weiming Zhi,
- Abstract summary: We propose Fast Image-to-Neural Surface (FINS) to reconstruct high-fidelity surfaces and SDF fields based on a single or a small set of images.<n>FINS integrates a multi-resolution hash grid with lightweight geometry and color heads, making the training via an approximate second-order encoder highly efficient.<n>Our experiments demonstrate that under the same conditions, our method outperforms state-of-the-art baselines in both convergence speed and accuracy on surface reconstruction and SDF field estimation.
- Score: 10.958664484214127
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Implicit representations have been widely applied in robotics for obstacle avoidance and path planning. In this paper, we explore the problem of constructing an implicit distance representation from a single image. Past methods for implicit surface reconstruction, such as \emph{NeuS} and its variants generally require a large set of multi-view images as input, and require long training times. In this work, we propose Fast Image-to-Neural Surface (FINS), a lightweight framework that can reconstruct high-fidelity surfaces and SDF fields based on a single or a small set of images. FINS integrates a multi-resolution hash grid encoder with lightweight geometry and color heads, making the training via an approximate second-order optimizer highly efficient and capable of converging within a few seconds. Additionally, we achieve the construction of a neural surface requiring only a single RGB image, by leveraging pre-trained foundation models to estimate the geometry inherent in the image. Our experiments demonstrate that under the same conditions, our method outperforms state-of-the-art baselines in both convergence speed and accuracy on surface reconstruction and SDF field estimation. Moreover, we demonstrate the applicability of FINS for robot surface following tasks and show its scalability to a variety of benchmark datasets.
Related papers
- PR-NeuS: A Prior-based Residual Learning Paradigm for Fast Multi-view
Neural Surface Reconstruction [45.34454245176438]
We propose a prior-based residual learning paradigm for fast multi-view neural surface reconstruction.
Our method only takes about 3 minutes to reconstruct the surface of a single scene, while achieving competitive surface quality.
arXiv Detail & Related papers (2023-12-18T09:24:44Z) - Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids [84.90863397388776]
We propose to directly use signed distance function (SDF) in sparse voxel block grids for fast and accurate scene reconstruction without distances.
Our globally sparse and locally dense data structure exploits surfaces' spatial sparsity, enables cache-friendly queries, and allows direct extensions to multi-modal data.
Experiments show that our approach is 10x faster in training and 100x faster in rendering while achieving comparable accuracy to state-of-the-art neural implicit methods.
arXiv Detail & Related papers (2023-05-22T16:50:19Z) - GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from
Multi-view Images [79.39247661907397]
We introduce an effective framework Generalizable Model-based Neural Radiance Fields to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
arXiv Detail & Related papers (2023-03-24T03:32:02Z) - Shape, Pose, and Appearance from a Single Image via Bootstrapped
Radiance Field Inversion [54.151979979158085]
We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
We leverage an unconditional 3D-aware generator, to which we apply a hybrid inversion scheme where a model produces a first guess of the solution.
Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
arXiv Detail & Related papers (2022-11-21T17:42:42Z) - High-Quality RGB-D Reconstruction via Multi-View Uncalibrated
Photometric Stereo and Gradient-SDF [48.29050063823478]
We present a novel multi-view RGB-D based reconstruction method that tackles camera pose, lighting, albedo, and surface normal estimation.
The proposed method formulates the image rendering process using specific physically-based model(s) and optimize the surface's volumetric quantities on the actual surface.
arXiv Detail & Related papers (2022-10-21T19:09:08Z) - SDFEst: Categorical Pose and Shape Estimation of Objects from RGB-D
using Signed Distance Fields [5.71097144710995]
We present a modular pipeline for pose and shape estimation of objects from RGB-D images.
We integrate a generative shape model with a novel network to enable 6D pose and shape estimation from a single or multiple views.
We demonstrate the benefits of our approach over state-of-the-art methods in several experiments on both synthetic and real data.
arXiv Detail & Related papers (2022-07-11T13:53:50Z) - FIRe: Fast Inverse Rendering using Directional and Signed Distance
Functions [97.5540646069663]
We introduce a novel neural scene representation that we call the directional distance function (DDF)
Our DDF is defined on the unit sphere and predicts the distance to the surface along any given direction.
Based on our DDF, we present a novel fast algorithm (FIRe) to reconstruct 3D shapes given a posed depth map.
arXiv Detail & Related papers (2022-03-30T13:24:04Z) - Spatial-Separated Curve Rendering Network for Efficient and
High-Resolution Image Harmonization [59.19214040221055]
We propose a novel spatial-separated curve rendering network (S$2$CRNet) for efficient and high-resolution image harmonization.
The proposed method reduces more than 90% parameters compared with previous methods.
Our method can work smoothly on higher resolution images in real-time which is more than 10$times$ faster than the existing methods.
arXiv Detail & Related papers (2021-09-13T07:20:16Z) - Learning Signed Distance Field for Multi-view Surface Reconstruction [24.090786783370195]
We introduce a novel neural surface reconstruction framework that leverages the knowledge of stereo matching and feature consistency.
We apply a signed distance field (SDF) and a surface light field to represent the scene geometry and appearance respectively.
Our method is able to improve the robustness of geometry estimation and support reconstruction of complex scene topologies.
arXiv Detail & Related papers (2021-08-23T06:23:50Z) - Learning Deformable Image Registration from Optimization: Perspective,
Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z) - Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.