UniC-Lift: Unified 3D Instance Segmentation via Contrastive Learning
- URL: http://arxiv.org/abs/2512.24763v1
- Date: Wed, 31 Dec 2025 10:20:01 GMT
- Title: UniC-Lift: Unified 3D Instance Segmentation via Contrastive Learning
- Authors: Ankit Dhiman, Srinath R, Jaswanth Reddy, Lokesh R Boregowda, Venkatesh Babu Radhakrishnan
- Abstract summary: 3D Gaussian Splatting (3DGS) and Neural Radiance Fields (NeRF) have advanced novel-view synthesis. Recent methods extend multi-view 2D segmentation to 3D, enabling instance/semantic segmentation for better scene understanding. A key challenge is the inconsistency of 2D instance labels across views, leading to poor 3D predictions. We propose a unified framework that merges these steps, reducing training time and improving performance by introducing a learnable feature embedding for segmentation in Gaussian primitives.
- Score: 6.502142457981839
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D Gaussian Splatting (3DGS) and Neural Radiance Fields (NeRF) have advanced novel-view synthesis. Recent methods extend multi-view 2D segmentation to 3D, enabling instance/semantic segmentation for better scene understanding. A key challenge is the inconsistency of 2D instance labels across views, leading to poor 3D predictions. Existing methods use a two-stage approach in which some rely on contrastive learning with hyperparameter-sensitive clustering, while others preprocess labels for consistency. We propose a unified framework that merges these steps, reducing training time and improving performance by introducing a learnable feature embedding for segmentation in Gaussian primitives. This embedding is then efficiently decoded into instance labels through a novel "Embedding-to-Label" process, effectively integrating the optimization. While this unified framework offers substantial benefits, we observed artifacts at the object boundaries. To address the object boundary issues, we propose hard-mining samples along these boundaries. However, directly applying hard mining to the feature embeddings proved unstable. Therefore, we apply a linear layer to the rasterized feature embeddings before calculating the triplet loss, which stabilizes training and significantly improves performance. Our method outperforms baselines qualitatively and quantitatively on the ScanNet, Replica3D, and Messy-Rooms datasets.
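The abstract's stabilization trick (passing rasterized feature embeddings through a linear layer before computing the triplet loss on hard-mined boundary samples) can be illustrated with a minimal NumPy sketch. The function name, array shapes, and margin value below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def projected_triplet_loss(anchors, positives, negatives, W, b, margin=0.2):
    """Sketch of a triplet margin loss applied after a linear projection.

    anchors/positives/negatives: (N, D) rasterized feature embeddings,
    e.g. hard-mined samples along object boundaries.
    W: (D, K) weight matrix and b: (K,) bias of the linear layer.
    """
    def project(x):
        # linear layer applied to the rasterized embeddings
        return x @ W + b

    a, p, n = project(anchors), project(positives), project(negatives)
    d_ap = np.linalg.norm(a - p, axis=1)  # anchor-positive distances
    d_an = np.linalg.norm(a - n, axis=1)  # anchor-negative distances
    # pull positives closer than negatives by at least `margin`
    return np.maximum(d_ap - d_an + margin, 0.0).mean()
```

In practice the projection would be a learned layer (e.g. trained jointly with the Gaussian features); here it is shown as a fixed matrix only to keep the sketch self-contained.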
Related papers
- Binary-Gaussian: Compact and Progressive Representation for 3D Gaussian Segmentation [83.90109373769614]
3D Gaussian Splatting (3D-GS) has emerged as an efficient 3D representation and a promising foundation for semantic tasks like segmentation. We propose a coarse-to-fine binary encoding scheme for per-Gaussian category representation, which compresses each feature into a single integer via a binary-to-decimal mapping. We further design a progressive training strategy that decomposes panoptic segmentation into a series of independent sub-tasks, reducing inter-class conflicts and thereby enhancing fine-grained segmentation capability.
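The binary-to-decimal mapping mentioned in the abstract above can be sketched in a few lines; the function names and bit ordering are illustrative assumptions, not the paper's actual encoding:

```python
def bits_to_int(bits):
    """Pack a per-Gaussian binary category code (most-significant bit
    first) into a single integer via binary-to-decimal mapping."""
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit)
    return value

def int_to_bits(value, n_bits):
    """Inverse mapping: recover the n_bits-long binary code from the
    stored integer, most-significant bit first."""
    return [(value >> i) & 1 for i in range(n_bits - 1, -1, -1)]
```

Storing one integer per Gaussian instead of a full binary vector is what makes the representation compact; the coarse-to-fine aspect of the actual scheme is not modeled here.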
arXiv Detail & Related papers (2025-11-30T15:51:30Z)
- Class-agnostic 3D Segmentation by Granularity-Consistent Automatic 2D Mask Tracking [10.223105883919278]
We introduce a Granularity-Consistent automatic 2D Mask Tracking approach that maintains temporal correspondences across frames. Our method effectively generates consistent and accurate 3D segmentations.
arXiv Detail & Related papers (2025-11-02T03:52:42Z)
- ALISE: Annotation-Free LiDAR Instance Segmentation for Autonomous Driving [9.361724251990154]
We introduce ALISE, a novel framework that performs LiDAR instance segmentation without any annotations. Our approach starts by employing Vision Foundation Models (VFMs), guided by text and images, to produce initial pseudo-labels. We then refine these labels through a dedicated spatio-temporal voting module, which combines 2D and 3D semantics for both offline and online optimization. This comprehensive design yields significant performance gains, establishing a new state of the art for unsupervised 3D instance segmentation.
arXiv Detail & Related papers (2025-10-07T10:15:18Z)
- Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting [86.15347226865826]
We design a new end-to-end object-aware lifting approach, named Unified-Lift. We augment each Gaussian point with an additional Gaussian-level feature learned using a contrastive loss to encode instance information. We conduct experiments on three benchmarks: LERF-Masked, Replica, and Messy Rooms.
arXiv Detail & Related papers (2025-03-18T08:42:23Z)
- Label-Efficient LiDAR Panoptic Segmentation [22.440065488051047]
We present Limited-Label LiDAR Panoptic Segmentation (L3PS). We develop a label-efficient 2D network to generate panoptic pseudo-labels from annotated images. We then introduce a novel 3D refinement module that capitalizes on the geometric properties of point clouds.
arXiv Detail & Related papers (2025-03-04T07:58:15Z)
- Bayesian Self-Training for Semi-Supervised 3D Segmentation [59.544558398992386]
3D segmentation is a core problem in computer vision.
However, densely labeling 3D point clouds for fully-supervised training remains too labor-intensive and expensive.
Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
- Towards Modality-agnostic Label-efficient Segmentation with Entropy-Regularized Distribution Alignment [62.73503467108322]
This topic is widely studied in 3D point cloud segmentation due to the difficulty of annotating point clouds densely.
Until recently, pseudo-labels have been widely employed to facilitate training with limited ground-truth labels.
Existing pseudo-labeling approaches can suffer heavily from the noise and variation in unlabelled data.
We propose a novel learning strategy to regularize the pseudo-labels generated for training, thus effectively narrowing the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2024-08-29T13:31:15Z)
- Weakly Supervised 3D Instance Segmentation without Instance-level Annotations [57.615325809883636]
3D semantic scene understanding tasks have achieved great success with the emergence of deep learning, but often require a huge amount of manually annotated training data.
We propose the first weakly-supervised 3D instance segmentation method that only requires categorical semantic labels as supervision.
By generating pseudo instance labels from categorical semantic labels, our designed approach can also assist existing methods for learning 3D instance segmentation at reduced annotation cost.
arXiv Detail & Related papers (2023-08-03T12:30:52Z)
- Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision [63.429704654271475]
We propose a novel weakly supervised method RWSeg that only requires labeling one object with one point.
With these sparse weak labels, we introduce a unified framework with two branches to propagate semantic and instance information.
Specifically, we propose a Cross-graph Competing Random Walks (CRW) algorithm that encourages competition among different instance graphs.
arXiv Detail & Related papers (2022-08-10T02:14:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.