Interactive Segment Anything NeRF with Feature Imitation
- URL: http://arxiv.org/abs/2305.16233v1
- Date: Thu, 25 May 2023 16:44:51 GMT
- Title: Interactive Segment Anything NeRF with Feature Imitation
- Authors: Xiaokang Chen, Jiaxiang Tang, Diwen Wan, Jingbo Wang, Gang Zeng
- Abstract summary: We propose to imitate the backbone feature of off-the-shelf perception models to achieve zero-shot semantic segmentation with NeRF.
Our framework reformulates the segmentation process by directly rendering semantic features and only applying the decoder from perception models.
Furthermore, we can project the learned semantics onto extracted mesh surfaces for real-time interaction.
- Score: 20.972098365110426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the potential of enhancing Neural Radiance Fields
(NeRF) with semantics to expand their applications. Although NeRF has been
proven useful in real-world applications like VR and digital creation, the lack
of semantics hinders interaction with objects in complex scenes. We propose to
imitate the backbone feature of off-the-shelf perception models to achieve
zero-shot semantic segmentation with NeRF. Our framework reformulates the
segmentation process by directly rendering semantic features and only applying
the decoder from perception models. This eliminates the need for expensive
backbones and benefits 3D consistency. Furthermore, we can project the learned
semantics onto extracted mesh surfaces for real-time interaction. With the
state-of-the-art Segment Anything Model (SAM), our framework accelerates
segmentation by 16 times with comparable mask quality. The experimental results
demonstrate the efficacy and computational advantages of our approach. Project
page: \url{https://me.kiui.moe/san/}.
Related papers
- DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors [4.697267141773321]
We present DreamHOI, a novel method for zero-shot synthesis of human-object interactions (HOIs)
We leverage text-to-image diffusion models trained on billions of image-caption pairs to generate realistic HOIs.
We validate our approach through extensive experiments, demonstrating its effectiveness in generating realistic HOIs.
arXiv Detail & Related papers (2024-09-12T17:59:49Z) - Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling [70.34875558830241]
We present a way for learning a-temporal (4D) embedding, based on semantic semantic gears to allow for stratified modeling of dynamic regions of rendering the scene.
At the same time, almost for free, our tracking approach enables free-viewpoint of interest - a functionality not yet achieved by existing NeRF-based methods.
arXiv Detail & Related papers (2024-06-06T03:37:39Z) - 3D Visibility-aware Generalizable Neural Radiance Fields for Interacting
Hands [51.305421495638434]
Neural radiance fields (NeRFs) are promising 3D representations for scenes, objects, and humans.
This paper proposes a generalizable visibility-aware NeRF framework for interacting hands.
Experiments on the Interhand2.6M dataset demonstrate that our proposed VA-NeRF outperforms conventional NeRFs significantly.
arXiv Detail & Related papers (2024-01-02T00:42:06Z) - GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding [101.32590239809113]
Generalized Perception NeRF (GP-NeRF) is a novel pipeline that makes the widely used segmentation model and NeRF work compatibly under a unified framework.
We propose two self-distillation mechanisms, i.e., the Semantic Distill Loss and the Depth-Guided Semantic Distill Loss, to enhance the discrimination and quality of the semantic field.
arXiv Detail & Related papers (2023-11-20T15:59:41Z) - Interactive Segmentation of Radiance Fields [7.9020917073764405]
Mixed reality on personal spaces needs understanding and manipulating scenes represented as RFs.
We present the ISRF method to interactively segment objects with fine structure and appearance.
arXiv Detail & Related papers (2022-12-27T16:33:19Z) - SegNeRF: 3D Part Segmentation with Neural Radiance Fields [63.12841224024818]
SegNeRF is a neural field representation that integrates a semantic field along with the usual radiance field.
SegNeRF is capable of simultaneously predicting geometry, appearance, and semantic information from posed images, even for unseen objects.
SegNeRF is able to generate an explicit 3D model from a single image of an object taken in the wild, with its corresponding part segmentation.
arXiv Detail & Related papers (2022-11-21T07:16:03Z) - NeRF-SOS: Any-View Self-supervised Object Segmentation from Complex
Real-World Scenes [80.59831861186227]
This paper carries out the exploration of self-supervised learning for object segmentation using NeRF for complex real-world scenes.
Our framework, called NeRF with Self-supervised Object NeRF-SOS, encourages NeRF models to distill compact geometry-aware segmentation clusters.
It consistently surpasses other 2D-based self-supervised baselines and predicts finer semantics masks than existing supervised counterparts.
arXiv Detail & Related papers (2022-09-19T06:03:17Z) - Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields [32.200557554874784]
This paper provides a new approach to scene understanding, by leveraging the recent progress on implicit 3D representation and neural rendering.
Building upon the great success of Neural Radiance Fields (NeRFs), we introduce Scene-Property Synthesis with NeRF.
We facilitate addressing a variety of scene understanding tasks under a unified framework, including semantic segmentation, surface normal estimation, reshading, keypoint detection, and edge detection.
arXiv Detail & Related papers (2022-06-09T17:59:50Z) - Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation [61.8546794105462]
We propose Semantic-aware Speaking Portrait NeRF (SSP-NeRF), which creates delicate audio-driven portraits using one unified set of NeRF.
We first propose a Semantic-Aware Dynamic Ray Sampling module with an additional parsing branch that facilitates audio-driven volume rendering.
To enable portrait rendering in one unified neural radiance field, a Torso Deformation module is designed to stabilize the large-scale non-rigid torso motions.
arXiv Detail & Related papers (2022-01-19T18:54:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.