GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D
  Scene Understanding
- URL: http://arxiv.org/abs/2403.03608v1
- Date: Wed, 6 Mar 2024 10:55:50 GMT
- Title: GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D
  Scene Understanding
- Authors: Zi-Ting Chou, Sheng-Yu Huang, I-Jieh Liu, Yu-Chiang Frank Wang
- Abstract summary: We introduce a Generalizable Semantic Neural Radiance Field (GSNeRF), which takes image semantics into the synthesis process.
Our GSNeRF is composed of two stages: Semantic Geo-Reasoning and Depth-Guided Visual rendering.
- Score: 30.951440204237166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Utilizing multi-view inputs to synthesize novel-view images, Neural Radiance
Fields (NeRF) have emerged as a popular research topic in 3D vision. In this
work, we introduce a Generalizable Semantic Neural Radiance Field (GSNeRF),
which uniquely takes image semantics into the synthesis process so that both
novel view images and the associated semantic maps can be produced for unseen
scenes. Our GSNeRF is composed of two stages: Semantic Geo-Reasoning and
Depth-Guided Visual rendering. The former is able to observe multi-view image
inputs to extract semantic and geometry features from a scene. Guided by the
resulting image geometry information, the latter performs both image and
semantic rendering with improved performance. Our experiments confirm that
GSNeRF performs favorably against prior works on both novel-view image and
semantic segmentation synthesis, and further verify the effectiveness of our
sampling strategy for visual rendering.
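
The abstract does not give implementation details for the depth-guided rendering stage, so the following is only a minimal NumPy sketch of the general idea, not the authors' method: samples are concentrated in a narrow interval around a predicted per-ray depth, then colors and semantic logits are alpha-composited with the standard NeRF volume-rendering weights. The function names, the fixed interval half-width, and the random densities, colors, and logits are all hypothetical placeholders standing in for network outputs.

```python
import numpy as np

def depth_guided_samples(depth, n_samples=8, half_width=0.2, near=0.1, far=10.0):
    """Place n_samples (>= 2) evenly in [depth - half_width, depth + half_width].

    depth: (R,) predicted depth per ray (assumed to come from a geometry stage).
    Returns sample distances t with shape (R, n_samples), sorted ascending.
    """
    depth = np.asarray(depth, dtype=np.float64)
    lo = np.clip(depth - half_width, near, far)
    hi = np.clip(depth + half_width, near, far)
    steps = np.linspace(0.0, 1.0, n_samples)
    return lo[:, None] + (hi - lo)[:, None] * steps[None, :]

def composite(sigma, values, t):
    """Standard NeRF alpha compositing along each ray.

    sigma: (R, S) densities, values: (R, S, C) per-sample features, t: (R, S).
    Returns the composited features (R, C) and the per-sample weights (R, S).
    """
    deltas = np.diff(t, axis=1)
    # Reuse the last spacing for the final sample so shapes stay (R, S).
    deltas = np.concatenate([deltas, deltas[:, -1:]], axis=1)
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: fraction of light surviving all earlier samples.
    trans = np.cumprod(1.0 - alpha, axis=1)
    trans = np.concatenate([np.ones_like(trans[:, :1]), trans[:, :-1]], axis=1)
    weights = alpha * trans
    return (weights[..., None] * values).sum(axis=1), weights

# Hypothetical stand-ins for network predictions on two rays.
rng = np.random.default_rng(0)
depth = np.array([2.0, 3.5])
t = depth_guided_samples(depth)
sigma = rng.uniform(0.0, 5.0, size=t.shape)
rgb = rng.uniform(size=t.shape + (3,))
sem_logits = rng.normal(size=t.shape + (4,))   # 4 semantic classes, assumed

color, weights = composite(sigma, rgb, t)      # rendered RGB per ray
sem, _ = composite(sigma, sem_logits, t)       # rendered semantic logits per ray
labels = sem.argmax(axis=-1)                   # predicted class per ray
```

Sharing one set of weights between the color and semantic outputs keeps both renderings tied to the same geometry, which is the usual design in semantic NeRF variants; whether GSNeRF does exactly this is not stated in the abstract.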
 
      
Related papers
- Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images [36.084665557986156]
 Reconstructing and semantically interpreting 3D scenes from sparse 2D views remains a fundamental challenge in computer vision.
In this paper, we introduce Uni3R, a novel feed-forward framework that jointly reconstructs a unified 3D scene representation enriched with open-vocabulary semantics.
 arXiv (2025-08-05T16:54:55Z)
- OGGSplat: Open Gaussian Growing for Generalizable Reconstruction with Expanded Field-of-View [74.58230239274123]
 We propose OGGSplat, an open Gaussian growing method that expands the field-of-view in generalizable 3D reconstruction.
Our key insight is that the semantic attributes of open Gaussians provide strong priors for image extrapolation.
OGGSplat also demonstrates promising semantic-aware scene reconstruction capabilities when provided with two view images captured directly from a smartphone camera.
 arXiv (2025-06-05T16:17:18Z)
- MOSE: Monocular Semantic Reconstruction Using NeRF-Lifted Noisy Priors [11.118490283303407]
 We propose a neural field semantic reconstruction approach to lift inferred image-level noisy priors to 3D.
Our method produces accurate semantics and geometry in both 3D and 2D space.
 arXiv (2024-09-21T05:12:13Z)
- Semantically-aware Neural Radiance Fields for Visual Scene
  Understanding: A Comprehensive Review [26.436253160392123]
 This review thoroughly examines the role of semantically-aware Neural Radiance Fields (NeRFs) in visual scene understanding.
NeRFs adeptly infer 3D representations for both stationary and dynamic objects in a scene.
 arXiv (2024-02-17T00:15:09Z)
- HG3-NeRF: Hierarchical Geometric, Semantic, and Photometric Guided
  Neural Radiance Fields for Sparse View Inputs [7.715395970689711]
 We introduce Hierarchical Geometric, Semantic, and Photometric Guided NeRF (HG3-NeRF).
HG3-NeRF is a novel methodology that can address the limitation and enhance consistency of geometry, semantic content, and appearance across different views.
Experimental results demonstrate that HG3-NeRF can outperform other state-of-the-art methods on different standard benchmarks.
 arXiv (2024-01-22T06:28:08Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from
  Multi-view Images [79.39247661907397]
 We introduce an effective framework, Generalizable Model-based Neural Radiance Fields, to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
 arXiv (2023-03-24T03:32:02Z)
- Multi-Plane Neural Radiance Fields for Novel View Synthesis [5.478764356647437]
 Novel view synthesis is a long-standing problem that revolves around rendering frames of scenes from novel camera viewpoints.
In this work, we examine the performance, generalization, and efficiency of single-view multi-plane neural radiance fields.
We propose a new multiplane NeRF architecture that accepts multiple views to improve the synthesis results and expand the viewing range.
 arXiv (2023-03-03T06:32:55Z)
- SegNeRF: 3D Part Segmentation with Neural Radiance Fields [63.12841224024818]
 SegNeRF is a neural field representation that integrates a semantic field along with the usual radiance field.
SegNeRF is capable of simultaneously predicting geometry, appearance, and semantic information from posed images, even for unseen objects.
SegNeRF is able to generate an explicit 3D model from a single image of an object taken in the wild, with its corresponding part segmentation.
 arXiv (2022-11-21T07:16:03Z)
- CompNVS: Novel View Synthesis with Scene Completion [83.19663671794596]
 We propose a generative pipeline operating on a sparse grid-based neural scene representation to complete unobserved scene parts.
We process encoded image features in 3D space with a geometry completion network and a subsequent texture inpainting network to extrapolate the missing area.
Photorealistic image sequences can finally be obtained via consistency-relevant differentiable rendering.
 arXiv (2022-07-23T09:03:13Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input
  Image [49.956005709863355]
 We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
 arXiv (2022-07-12T17:52:04Z)
- Semantic View Synthesis [56.47999473206778]
 We tackle a new problem of semantic view synthesis -- generating free-viewpoint rendering of a synthesized scene using a semantic label map as input.
First, we focus on synthesizing the color and depth of the visible surface of the 3D scene.
We then use the synthesized color and depth to impose explicit constraints on the multiple-plane image (MPI) representation prediction process.
 arXiv (2020-08-24T17:59:46Z)
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis [78.5281048849446]
 We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes.
Our algorithm represents a scene using a fully-connected (non-convolutional) deep network.
Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses.
 arXiv (2020-03-19T17:57:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     