ONeRF: Unsupervised 3D Object Segmentation from Multiple Views
- URL: http://arxiv.org/abs/2211.12038v1
- Date: Tue, 22 Nov 2022 06:19:37 GMT
- Title: ONeRF: Unsupervised 3D Object Segmentation from Multiple Views
- Authors: Shengnan Liang, Yichen Liu, Shangzhe Wu, Yu-Wing Tai, Chi-Keung Tang
- Abstract summary: ONeRF is a method that automatically segments and reconstructs object instances in 3D from multi-view RGB images without any additional manual annotations.
The segmented 3D objects are represented using separate Neural Radiance Fields (NeRFs) which allow for various 3D scene editing and novel view rendering.
- Score: 59.445957699136564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present ONeRF, a method that automatically segments and reconstructs
object instances in 3D from multi-view RGB images without any additional manual
annotations. The segmented 3D objects are represented using separate Neural
Radiance Fields (NeRFs) which allow for various 3D scene editing and novel view
rendering. At the core of our method is an unsupervised approach using the
iterative Expectation-Maximization algorithm, which effectively aggregates 2D
visual features and the corresponding 3D cues from multi-views for joint 3D
object segmentation and reconstruction. Unlike existing approaches that can
only handle simple objects, our method produces segmented full 3D NeRFs of
individual objects with complex shapes, topologies and appearance. The
segmented ONeRFs enable a range of 3D scene editing, such as object
transformation, insertion and deletion.
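The abstract names the iterative Expectation-Maximization algorithm as the core of the method but does not spell out its updates. Below is a minimal, hypothetical sketch of an EM-style soft assignment of aggregated per-point features to K object slots; the function name, the Gaussian-like responsibility model, and the feature construction are illustrative assumptions, not ONeRF's actual formulation.

```python
import numpy as np

def em_object_assignment(features, num_objects, iters=20, temp=1.0):
    """Soft-assign N aggregated feature vectors to K object slots.

    features: (N, D) array, e.g. 2D visual features pooled over views
    plus 3D positional cues. Returns (N, K) responsibilities.
    Illustrative only: ONeRF's actual E/M updates are more involved.
    """
    rng = np.random.default_rng(0)
    N, D = features.shape
    # Initialize K centroids from randomly chosen feature vectors.
    means = features[rng.choice(N, size=num_objects, replace=False)]
    for _ in range(iters):
        # E-step: responsibilities from negative squared distances.
        d2 = ((features[:, None, :] - means[None, :, :]) ** 2).sum(-1)
        logits = -d2 / temp
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)       # (N, K)
        # M-step: re-estimate each object's feature centroid.
        means = (resp.T @ features) / resp.sum(axis=0)[:, None]
    return resp

# Hard per-object masks follow from an argmax over responsibilities.
feats = np.random.rand(5000, 64).astype(np.float32)
labels = em_object_assignment(feats, num_objects=3).argmax(axis=1)
```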
Related papers
- Part123: Part-aware 3D Reconstruction from a Single-view Image [54.589723979757515]
Part123 is a novel framework for part-aware 3D reconstruction from a single-view image.
We introduce contrastive learning into a neural rendering framework to learn a part-aware feature space.
A clustering-based algorithm is also developed to automatically derive 3D part segmentation results from the reconstructed models.
arXiv Detail & Related papers (2024-05-27T07:10:21Z)
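Part123's summary above mentions a clustering-based algorithm over a learned part-aware feature space. As a rough illustration, the clustering step alone could look like a plain k-means over per-surface-point features; the input file name is a placeholder and the paper's actual algorithm is more involved:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical input: contrastively learned features sampled at the
# reconstructed surface points (placeholder file, assumed shape (N, D)).
point_features = np.load("surface_point_features.npy")
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0)
part_labels = kmeans.fit_predict(point_features)  # (N,) part ids per point
```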
- Obj-NeRF: Extract Object NeRFs from Multi-view Images [7.669778218573394]
We propose Obj-NeRF, a comprehensive pipeline that recovers the 3D geometry of a specific object from multi-view images using a single prompt.
We also apply Obj-NeRF to various applications, including object removal, rotation, replacement, and recoloring.
arXiv Detail & Related papers (2023-11-26T13:15:37Z)
- DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
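For the DatasetNeRF entry above: once the semantic decoder is trained, generating unlimited annotated data reduces to sampling latents and decoding. A hedged sketch, where `generator` and `semantic_decoder` are hypothetical stand-ins for a 3D-aware generative model and the trained decoder over its internal features:

```python
import torch

@torch.no_grad()
def sample_annotated_batch(generator, semantic_decoder, camera, n=16):
    """Draw latents, render images, and decode per-pixel semantics.

    `generator.render` and `semantic_decoder` are assumed interfaces,
    not APIs from the paper; they stand in for a generative radiance
    field and the decoder trained on its feature maps.
    """
    z = torch.randn(n, generator.z_dim)
    rgb, feats = generator.render(z, camera)   # (n,3,H,W), (n,C,H,W)
    sem = semantic_decoder(feats).argmax(1)    # (n,H,W) label maps
    return rgb, sem
```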
- Geometry Aware Field-to-field Transformations for 3D Semantic Segmentation [48.307734886370014]
We present a novel approach to perform 3D semantic segmentation solely from 2D supervision by leveraging Neural Radiance Fields (NeRFs).
By extracting features along a surface point cloud, we achieve a compact representation of the scene which is sample-efficient and conducive to 3D reasoning.
arXiv Detail & Related papers (2023-10-08T11:48:19Z)
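Read literally, the field-to-field entry above boils down to: query features from a trained radiance field at surface points, then classify each point, with supervision projected from 2D labels. A hypothetical PyTorch sketch of the per-point head (the feature-lookup step is an assumption, not the paper's API):

```python
import torch
import torch.nn as nn

class PointSemanticHead(nn.Module):
    """Tiny per-point classifier over NeRF features (illustrative only)."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, point_feats):   # (N, feat_dim) features at surface points
        return self.mlp(point_feats)  # (N, num_classes) logits

# Training labels would come from 2D annotations splatted onto the
# surface point cloud; the feature lookup itself is assumed here.
```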
- A One Stop 3D Target Reconstruction and multilevel Segmentation Method [0.0]
We propose an open-source one-stop 3D target reconstruction and multilevel segmentation framework (OSTRA).
OSTRA performs segmentation on 2D images, tracks multiple instances with segmentation labels in the image sequence, and then reconstructs labelled 3D objects or multiple parts with Multi-View Stereo (MVS) or RGBD-based 3D reconstruction methods.
Our method opens up a new avenue for reconstructing 3D targets embedded with rich multi-scale segmentation information in complex scenes.
arXiv Detail & Related papers (2023-08-14T07:12:31Z)
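OSTRA's three-stage pipeline above (segment, track, reconstruct) can be summarized in a few lines if each stage is abstracted behind a function; all three helpers below are hypothetical placeholders for the existing tools the framework wires together:

```python
# Hypothetical stage functions standing in for a 2D instance segmenter,
# a video instance tracker, and an MVS/RGBD reconstruction backend.
def reconstruct_labelled_scene(image_seq, segment2d, track, mvs_reconstruct):
    # Stage 1: per-frame 2D instance masks.
    masks_per_frame = [segment2d(img) for img in image_seq]
    # Stage 2: propagate consistent instance ids across the sequence.
    tracks = track(masks_per_frame)
    # Stage 3: MVS reconstruction carries per-view labels into 3D,
    # yielding one labelled mesh/point cloud per tracked instance.
    return mvs_reconstruct(image_seq, tracks)
```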
- Anything-3D: Towards Single-view Anything Reconstruction in the Wild [61.090129285205805]
We introduce Anything-3D, a methodical framework that combines a series of visual-language models with the Segment-Anything object segmentation model.
Our approach employs a BLIP model to generate textual descriptions, utilizes the Segment-Anything model to extract objects of interest, and leverages a text-to-image diffusion model to lift each object into a neural radiance field.
arXiv Detail & Related papers (2023-04-19T16:39:51Z)
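The first two stages of Anything-3D's pipeline above map onto widely available public models. A minimal sketch using Hugging Face's BLIP captioner and the segment-anything package; the checkpoint path and click coordinates are placeholders, and the diffusion-based NeRF lifting stage is only indicated in a comment:

```python
import numpy as np
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from segment_anything import sam_model_registry, SamPredictor

image = Image.open("input.jpg").convert("RGB")

# Stage 1: caption the image with BLIP.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
blip = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")
caption = processor.decode(
    blip.generate(**processor(image, return_tensors="pt"))[0],
    skip_special_tokens=True,
)

# Stage 2: extract the object of interest with SAM (one foreground click).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")  # placeholder path
predictor = SamPredictor(sam)
predictor.set_image(np.array(image))
masks, _, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),  # placeholder click location
    point_labels=np.array([1]),
)
# Stage 3 (omitted here): lift the masked object into a NeRF using
# text-to-image diffusion guidance conditioned on the caption.
```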
- Unsupervised Multi-View Object Segmentation Using Radiance Field Propagation [55.9577535403381]
We present a novel approach to segmenting objects in 3D during reconstruction given only unlabeled multi-view images of a scene.
The core of our method is a novel propagation strategy for individual objects' radiance fields with a bidirectional photometric loss.
To the best of our knowledge, RFP (Radiance Field Propagation) is the first unsupervised approach to tackle 3D scene object segmentation for neural radiance fields (NeRFs).
arXiv Detail & Related papers (2022-10-02T11:14:23Z)
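A bidirectional photometric loss over per-object radiance fields presupposes compositing those fields into a single rendering. Below is a minimal NumPy sketch of joint volume rendering of K object fields along one ray under standard NeRF assumptions; it is not RFP's exact formulation:

```python
import numpy as np

def composite_objects(sigmas, colors, deltas):
    """Volume-render K object fields jointly along one ray.

    sigmas: (K, S) per-object densities at S ray samples
    colors: (K, S, 3) per-object radiance
    deltas: (S,) distances between adjacent samples
    """
    sigma = sigmas.sum(axis=0)                       # (S,) scene density
    # Density-weighted mixture color at each sample.
    w = sigmas / np.clip(sigma, 1e-8, None)          # (K, S)
    color = (w[..., None] * colors).sum(axis=0)      # (S, 3)
    alpha = 1.0 - np.exp(-sigma * deltas)            # (S,) opacities
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans                          # (S,) render weights
    return (weights[:, None] * color).sum(axis=0)    # rendered RGB

# A photometric loss then compares this composite against the captured
# pixel; per-object renderings can be supervised the same way.
```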
- Decomposing 3D Scenes into Objects via Unsupervised Volume Segmentation [26.868351498722884]
We present ObSuRF, a method which turns a single image of a scene into a 3D model represented as a set of Neural Radiance Fields (NeRFs).
We make learning more computationally efficient by deriving a novel loss, which allows training NeRFs on RGB-D inputs without explicit ray marching.
arXiv Detail & Related papers (2021-04-02T16:59:29Z)
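One way to read "training NeRFs on RGB-D inputs without explicit ray marching" is that known depth lets the loss query density directly at and in front of the observed surface instead of integrating along each ray. A hedged PyTorch sketch with an assumed nerf(x, d) -> (sigma, rgb) query interface; this is an illustration, not ObSuRF's derived loss:

```python
import torch

def rgbd_point_loss(nerf, rays_o, rays_d, depth, rgb_gt, n_free=8):
    """Depth-guided supervision at sampled points (illustrative only).

    rays_o, rays_d: (N, 3) ray origins/directions; depth, rgb_gt: observed
    per-ray depth (N,) and color (N, 3). `nerf` is an assumed interface.
    """
    surface = rays_o + depth[:, None] * rays_d       # observed hit points
    sigma_s, rgb_s = nerf(surface, rays_d)
    # Encourage mass at the surface and match the observed color there.
    loss = -torch.log1p(sigma_s).mean() + ((rgb_s - rgb_gt) ** 2).mean()
    # Encourage empty space at random points strictly before the surface.
    t = torch.rand(depth.shape[0], n_free, device=depth.device) * (0.95 * depth[:, None])
    free = rays_o[:, None, :] + t[..., None] * rays_d[:, None, :]
    sigma_f, _ = nerf(free.reshape(-1, 3), rays_d.repeat_interleave(n_free, 0))
    return loss + sigma_f.mean()
```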
- CoReNet: Coherent 3D scene reconstruction from a single RGB image [43.74240268086773]
We build on advances in deep learning to reconstruct the shape of a single object given only one RGB image as input.
We propose three extensions: (1) ray-traced skip connections that propagate local 2D information to the output 3D volume in a physically correct manner; (2) a hybrid 3D volume representation that enables building translation equivariant models; and (3) a reconstruction loss tailored to capture overall object geometry.
We reconstruct all objects jointly in one pass, producing a coherent reconstruction, where all objects live in a single consistent 3D coordinate frame relative to the camera and they do not intersect in 3D space.
arXiv Detail & Related papers (2020-04-27T17:53:07Z)
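CoReNet's "ray-traced skip connections" above propagate local 2D information into the output 3D volume along camera rays. A minimal NumPy sketch of the underlying lift operation with a pinhole camera (nearest-neighbor gathering here; the paper's physically correct version is more careful):

```python
import numpy as np

def lift_features_to_volume(feat2d, voxel_xyz, K):
    """Gather 2D features into a 3D volume along camera rays.

    feat2d: (H, W, C) image features; voxel_xyz: (V, 3) voxel centers
    in camera coordinates (z > 0); K: (3, 3) pinhole intrinsics.
    """
    H, W, C = feat2d.shape
    uvw = voxel_xyz @ K.T                                          # (V, 3)
    u = np.clip((uvw[:, 0] / uvw[:, 2]).round().astype(int), 0, W - 1)
    v = np.clip((uvw[:, 1] / uvw[:, 2]).round().astype(int), 0, H - 1)
    return feat2d[v, u]                                            # (V, C)
```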