OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive
Learning
- URL: http://arxiv.org/abs/2311.11666v1
- Date: Mon, 20 Nov 2023 11:04:59 GMT
- Title: OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive
Learning
- Authors: Haiyang Ying, Yixuan Yin, Jinzhi Zhang, Fan Wang, Tao Yu, Ruqi Huang,
Lu Fang
- Abstract summary: We propose OmniSeg3D, an omniversal segmentation method for segmenting anything in 3D all at once.
In tackling the challenges posed by inconsistent 2D segmentations, this framework yields a globally consistent 3D feature field.
Experiments demonstrate the effectiveness of our method on high-quality 3D segmentation and accurate hierarchical structure understanding.
- Score: 31.234212614311424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Towards holistic understanding of 3D scenes, a general 3D segmentation method
is needed that can segment diverse objects without restrictions on object
quantity or categories, while also reflecting the inherent hierarchical
structure. To achieve this, we propose OmniSeg3D, an omniversal segmentation
method that aims to segment anything in 3D all at once. The key insight is to
lift multi-view inconsistent 2D segmentations into a consistent 3D feature
field through a hierarchical contrastive learning framework, which is
accomplished by two steps. Firstly, we design a novel hierarchical
representation based on category-agnostic 2D segmentations to model the
multi-level relationship among pixels. Secondly, image features rendered from
the 3D feature field are clustered at different levels, which can be further
drawn closer or pushed apart according to the hierarchical relationship between
different levels. In tackling the challenges posed by inconsistent 2D
segmentations, this framework yields a globally consistent 3D feature field,
which further enables hierarchical segmentation, multi-object selection, and
global discretization. Extensive experiments demonstrate the effectiveness of
our method on high-quality 3D segmentation and accurate hierarchical structure
understanding. A graphical user interface further facilitates flexible
interaction for omniversal 3D segmentation.
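The two-step hierarchical contrastive learning described in the abstract can be sketched compactly: features rendered from the 3D field are treated as positives when two pixels share a segment at a given hierarchy level, and the per-level losses are averaged. This is a minimal NumPy illustration under assumed details (InfoNCE-style loss, simple per-level averaging), not the authors' implementation:

```python
import numpy as np

def hierarchical_contrastive_loss(features, level_labels, temperature=0.1):
    """Toy sketch of hierarchical contrastive learning over rendered pixel
    features. `features` is (N, D); `level_labels` is a list of (N,) integer
    arrays, one per hierarchy level (coarse to fine). Pixels sharing a segment
    at a level are pulled together, others pushed apart. The function name,
    loss form, and averaging scheme are illustrative assumptions."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T / temperature
    np.fill_diagonal(sim, -np.inf)                 # exclude self-pairs
    logits = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    total = 0.0
    for labels in level_labels:                    # one pass per hierarchy level
        pos = labels[:, None] == labels[None, :]
        np.fill_diagonal(pos, False)
        # mean log-probability of positive pairs per anchor (InfoNCE-style)
        per_anchor = (np.where(pos, log_prob, 0.0).sum(axis=1)
                      / np.maximum(pos.sum(axis=1), 1))
        total += -per_anchor.mean()
    return total / len(level_labels)
```

Features that agree with the segment hierarchy yield a lower loss than features that contradict it, which is the signal that drives the field toward global consistency.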
Related papers
- CUS3D: CLIP-based Unsupervised 3D Segmentation via Object-level Denoise [9.12768731317489]
We propose a novel distillation learning framework named CUS3D.
An object-level denoising projection module is designed to screen out the noise and ensure more accurate 3D features.
Based on the obtained features, a multimodal distillation learning module is designed to align the 3D feature with CLIP semantic feature space.
arXiv Detail & Related papers (2024-09-21T02:17:35Z) - Few-Shot 3D Volumetric Segmentation with Multi-Surrogate Fusion [31.736235596070937]
We present MSFSeg, a novel few-shot 3D segmentation framework with a lightweight multi-surrogate fusion (MSF)
MSFSeg is able to automatically segment 3D objects/organs unseen during training, given one or a few annotated 2D slices or 3D sequence segments.
Our proposed MSF module mines comprehensive and diversified correlations between the unlabeled and the few labeled slices/sequences through multiple designated surrogates.
arXiv Detail & Related papers (2024-08-26T17:15:37Z) - View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields [52.08335264414515]
We learn a novel feature field within a Neural Radiance Field (NeRF) representing a 3D scene.
Our method takes view-inconsistent multi-granularity 2D segmentations as input and produces a hierarchy of 3D-consistent segmentations as output.
We evaluate our method and several baselines on synthetic datasets with multi-view images and multi-granular segmentation, showcasing improved accuracy and viewpoint-consistency.
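The ultrametric property underlying this related work is the strong triangle inequality, d(x, z) <= max(d(x, y), d(y, z)); when pairwise feature distances satisfy it, thresholding them at any granularity yields nested, hierarchy-consistent segments. A small sketch of the check (a brute-force illustration, not from the paper):

```python
import numpy as np

def is_ultrametric(d, tol=1e-9):
    """Check the strong triangle inequality d(i,k) <= max(d(i,j), d(j,k))
    for a symmetric distance matrix `d`. An ultrametric feature field aims
    for distances with this property, so that cutting at any threshold
    produces a consistent level of the segmentation hierarchy."""
    n = d.shape[0]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if d[i, k] > max(d[i, j], d[j, k]) + tol:
                    return False
    return True
```

Tree-induced distances (e.g. depth of the lowest common ancestor) pass this check; generic Euclidean distances usually do not.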
arXiv Detail & Related papers (2024-05-30T04:14:58Z) - SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z) - DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z) - A One Stop 3D Target Reconstruction and multilevel Segmentation Method [0.0]
We propose an open-source one-stop 3D target reconstruction and multilevel segmentation framework (OSTRA).
OSTRA performs segmentation on 2D images, tracks multiple instances with segmentation labels in the image sequence, and then reconstructs labelled 3D objects or multiple parts with Multi-View Stereo (MVS) or RGBD-based 3D reconstruction methods.
Our method opens up a new avenue for reconstructing 3D targets embedded with rich multi-scale segmentation information in complex scenes.
arXiv Detail & Related papers (2023-08-14T07:12:31Z) - Lowis3D: Language-Driven Open-World Instance-Level 3D Scene
Understanding [57.47315482494805]
Open-world instance-level scene understanding aims to locate and recognize unseen object categories that are not present in the annotated dataset.
This task is challenging because the model needs to both localize novel 3D objects and infer their semantic categories.
We propose to harness pre-trained vision-language (VL) foundation models that encode extensive knowledge from image-text pairs to generate captions for 3D scenes.
arXiv Detail & Related papers (2023-08-01T07:50:14Z) - ONeRF: Unsupervised 3D Object Segmentation from Multiple Views [59.445957699136564]
ONeRF is a method that automatically segments and reconstructs object instances in 3D from multi-view RGB images without any additional manual annotations.
The segmented 3D objects are represented using separate Neural Radiance Fields (NeRFs) which allow for various 3D scene editing and novel view rendering.
arXiv Detail & Related papers (2022-11-22T06:19:37Z) - Learning Hyperbolic Representations for Unsupervised 3D Segmentation [3.516233423854171]
We propose learning effective representations of 3D patches for unsupervised segmentation through a variational autoencoder (VAE) with a hyperbolic latent space and a proposed gyroplane convolutional layer.
We demonstrate the effectiveness of our hyperbolic representations for unsupervised 3D segmentation on a hierarchical toy dataset, BraTS whole tumor dataset, and cryogenic electron microscopy data.
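The hyperbolic latent space this related work relies on is commonly realized as the Poincaré ball, where volume grows exponentially with radius and tree-like hierarchies embed with low distortion. A minimal sketch of the standard Poincaré distance (a known closed form, independent of the paper's code):

```python
import numpy as np

def poincare_distance(u, v):
    """Distance in the Poincaré ball model of hyperbolic space:
    d(u, v) = arccosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    Points near the boundary (||x|| -> 1) are exponentially far apart,
    which is why hierarchies embed naturally: parents sit near the origin,
    leaves near the boundary."""
    nu, nv = np.dot(u, u), np.dot(v, v)
    diff = u - v
    return np.arccosh(1 + 2 * np.dot(diff, diff) / ((1 - nu) * (1 - nv)))
```

For a point v and the origin, this reduces to 2 * artanh(||v||), a convenient sanity check.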
arXiv Detail & Related papers (2020-12-03T02:15:31Z) - Fine-Grained 3D Shape Classification with Hierarchical Part-View
Attentions [70.0171362989609]
We propose a novel fine-grained 3D shape classification method named FG3D-Net to capture the fine-grained local details of 3D shapes from multiple rendered views.
Our results under the fine-grained 3D shape dataset show that our method outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-05-26T06:53:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.