StreetSurf: Extending Multi-view Implicit Surface Reconstruction to
Street Views
- URL: http://arxiv.org/abs/2306.04988v1
- Date: Thu, 8 Jun 2023 07:19:27 GMT
- Title: StreetSurf: Extending Multi-view Implicit Surface Reconstruction to
Street Views
- Authors: Jianfei Guo, Nianchen Deng, Xinyang Li, Yeqi Bai, Botian Shi, Chiyu
Wang, Chenjing Ding, Dongliang Wang, Yikang Li
- Abstract summary: We present a novel multi-view implicit surface reconstruction technique, termed StreetSurf.
It is readily applicable to street view images in widely-used autonomous driving datasets, without necessarily requiring LiDAR data.
We achieve state-of-the-art reconstruction quality in both geometry and appearance within only one to two hours of training time.
- Score: 6.35910814268525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel multi-view implicit surface reconstruction technique,
termed StreetSurf, that is readily applicable to street view images in
widely-used autonomous driving datasets, such as Waymo-perception sequences,
without necessarily requiring LiDAR data. As neural rendering research expands
rapidly, its integration into street views has started to draw interest.
Existing approaches to street views either focus mainly on novel view synthesis
with little exploration of the scene geometry, or rely heavily on dense LiDAR
data when investigating reconstruction. Neither investigates multi-view
implicit surface reconstruction, especially in settings without LiDAR data.
Our method extends prior object-centric neural surface reconstruction
techniques to address the unique challenges posed by the unbounded street views
that are captured with non-object-centric, long and narrow camera trajectories.
We delimit the unbounded space into three parts: close-range, distant-view, and
sky, with aligned cuboid boundaries, and adapt cuboid/hyper-cuboid hash-grids
along with a road-surface initialization scheme for a finer and disentangled
representation. To further address the geometric errors arising from
textureless regions and insufficient viewing angles, we adopt geometric priors
estimated with general-purpose monocular models. Coupled with our
implementation of an efficient and fine-grained multi-stage ray-marching
strategy, we achieve state-of-the-art reconstruction quality in both geometry
and appearance within only one to two hours of training time with a single RTX 3090
GPU for each street view sequence. Furthermore, we demonstrate that the
reconstructed implicit surfaces have rich potential for various downstream
tasks, including ray tracing and LiDAR simulation.
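To make the close-range/distant-view/sky decomposition concrete, the following is a minimal, hypothetical sketch of how camera rays could be partitioned with a slab test against an axis-aligned close-range cuboid. It is not the authors' released code; the function names, cuboid bounds, and the t_distant cutoff are illustrative assumptions.

```python
# Hypothetical sketch of the three-part space partition described in the abstract:
# each ray gets a close-range interval inside an axis-aligned cuboid fitted around
# the camera trajectory, a distant-view interval beyond the cuboid, and a sky term
# handled separately. All names and bounds below are assumptions for illustration.
import numpy as np

def ray_aabb_intersection(origins, dirs, box_min, box_max):
    """Slab test: per-ray entry/exit distances against an axis-aligned cuboid."""
    safe_dirs = np.where(np.abs(dirs) < 1e-9, 1e-9, dirs)
    t0 = (box_min - origins) / safe_dirs
    t1 = (box_max - origins) / safe_dirs
    t_near = np.minimum(t0, t1).max(axis=-1)
    t_far = np.maximum(t0, t1).min(axis=-1)
    hit = t_far > np.maximum(t_near, 0.0)
    return hit, np.maximum(t_near, 0.0), t_far

def partition_rays(origins, dirs, box_min, box_max, t_distant=1000.0):
    """Split each ray into (close-range, distant-view) intervals; the sky is
    treated as a direction-only radiance term beyond t_distant."""
    hit, t_enter, t_exit = ray_aabb_intersection(origins, dirs, box_min, box_max)
    t_exit = np.where(hit, t_exit, t_enter)        # empty interval if the cuboid is missed
    close = np.stack([t_enter, t_exit], axis=-1)   # sampled finely by the close-range model
    distant = np.stack([t_exit, np.full_like(t_exit, t_distant)], axis=-1)  # coarser background
    return close, distant

# Example: a long, narrow cuboid roughly aligned with a street-view trajectory.
origins = np.zeros((3, 3))
dirs = np.array([[0.0, 0.0, 1.0], [0.3, 0.0, 1.0], [0.0, 1.0, 0.0]])
dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
close, distant = partition_rays(origins, dirs,
                                box_min=np.array([-20.0, -5.0, 0.0]),
                                box_max=np.array([20.0, 5.0, 200.0]))
```

A per-region model (for example, a finer close-range representation and a coarser distant-view one) can then draw samples within these intervals, which is consistent with the disentangled representations and multi-stage ray marching the abstract mentions.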
Related papers
- StreetSurfGS: Scalable Urban Street Surface Reconstruction with Planar-based Gaussian Splatting [85.67616000086232]
StreetSurfGS is the first method to employ Gaussian Splatting specifically tailored for scalable urban street-scene surface reconstruction.
StreetSurfGS utilizes a planar-based octree representation and segmented training to reduce memory costs, accommodate unique camera characteristics, and ensure scalability.
To address sparse views and multi-scale challenges, we use a dual-step matching strategy that leverages adjacent and long-term information.
arXiv Detail & Related papers (2024-10-06T04:21:59Z)
- Spurfies: Sparse Surface Reconstruction using Local Geometry Priors [8.260048622127913]
We introduce Spurfies, a novel method for sparse-view surface reconstruction.
It disentangles appearance and geometry information to utilize local geometry priors trained on synthetic data.
We validate our method on the DTU dataset and demonstrate that it outperforms the previous state of the art by 35% in surface quality.
arXiv Detail & Related papers (2024-08-29T14:02:47Z)
- Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces [34.831730064258494]
We propose Tactile-Informed 3DGS, a novel approach that incorporates touch data (local depth maps) with multi-view vision data to achieve surface reconstruction and novel view synthesis.
By creating a framework that decreases the transmittance at touch locations, we achieve a refined surface reconstruction, ensuring a uniformly smooth depth map.
We conduct evaluation on objects with glossy and reflective surfaces and demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-03-29T16:30:17Z)
- SCILLA: SurfaCe Implicit Learning for Large Urban Area, a volumetric hybrid solution [4.216707699421813]
SCILLA is a new hybrid implicit surface learning method to reconstruct large driving scenes from 2D images.
We show that SCILLA can learn an accurate and detailed 3D surface scene representation in various urban scenarios.
arXiv Detail & Related papers (2024-03-15T14:31:17Z)
- Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques.
Specifically, our approach generates texture colors at the point level for a given geometry using a 3D diffusion model first, which is then transformed into a scene representation in a feed-forward manner.
Experiments on two city-scale datasets show that our model generates photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z)
- DiViNeT: 3D Reconstruction from Disparate Views via Neural Template Regularization [7.488962492863031]
We present a volume rendering-based neural surface reconstruction method that takes as few as three disparate RGB images as input.
Our key idea is to regularize the reconstruction, which is severely ill-posed and otherwise leaves significant gaps between the sparse views.
Our approach achieves the best reconstruction quality among existing methods in the presence of such sparse views.
arXiv Detail & Related papers (2023-06-07T18:05:14Z)
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles Single-view 3D Mesh Reconstruction, to study the model generalization on unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction [72.05649682685197]
State-of-the-art neural implicit methods allow for high-quality reconstructions of simple scenes from many input views, yet their performance drops significantly for larger, more complex scenes and sparser viewpoints.
This is caused primarily by the inherent ambiguity in the RGB reconstruction loss, which does not provide enough constraints.
Motivated by recent advances in the area of monocular geometry prediction, we explore the utility these cues provide for improving neural implicit surface reconstruction.
arXiv Detail & Related papers (2022-06-01T17:58:15Z)
- Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance [86.05191217004415]
Multi-view reconstruction of texture-less objects with unknown surface reflectance is a challenging task.
This paper proposes a simple and robust solution to this problem based on a co-light scanner.
arXiv Detail & Related papers (2021-05-25T01:28:54Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)