Related papers: CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation

CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation

URL: http://arxiv.org/abs/2502.17429v2
Date: Wed, 21 May 2025 14:24:42 GMT
Title: CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation
Authors: Vishal Thengane, Jean Lahoud, Hisham Cholakkal, Rao Muhammad Anwer, Lu Yin, Xiatian Zhu, Salman Khan,
Abstract summary: We propose a unified framework for textbfCLass-incremental textbfImbalance-aware textbf3DIS.<n>Our approach achieves state-of-the-art results, surpassing prior work by up to 16.76% mAP for instance segmentation and approximately 30% mIoU for semantic segmentation.
Score: 67.36817440834251
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While 3D instance segmentation (3DIS) has advanced significantly, existing methods typically assume that all object classes are known in advance and are uniformly distributed. However, this assumption is unrealistic in dynamic, real-world environments where new classes emerge gradually and exhibit natural imbalance. Although some approaches have addressed class emergence, they often overlook class imbalance, resulting in suboptimal performance -- particularly on rare categories. To tackle this challenge, we propose CLIMB-3D, a unified framework for \textbf{CL}ass-incremental \textbf{Imb}alance-aware \textbf{3D}IS. Building upon established exemplar replay (ER) strategies, we show that ER alone is insufficient to achieve robust performance under constrained memory conditions. To mitigate this, we introduce a novel pseudo-label generator (PLG) that extends supervision to previously learned categories by leveraging predictions from a frozen prior model. Despite its promise, PLG tends to bias towards frequent classes. Therefore, we propose a class-balanced re-weighting (CBR) scheme, that estimates object frequencies from pseudo-labels and dynamically adjusts training bias -- without requiring access to past data. We design and evaluate three incremental scenarios for 3DIS on the challenging ScanNet200 dataset, and additionally on semantic segmentation on ScanNetV2. Our approach achieves state-of-the-art results, surpassing prior work by up to 16.76\% mAP for instance segmentation and approximately 30\% mIoU for semantic segmentation, demonstrating strong generalization across both frequent and rare classes.

Related papers

ViRN: Variational Inference and Distribution Trilateration for Long-Tailed Continual Representation Learning [6.253882111488726]
ViRN is a novel framework that integrates variational inference with distributional trilateration for robust long-tailed learning.<n> evaluated on six long-tailed classification benchmarks, including speech (e.g., rare acoustic events, accents) and image tasks.<n> achieves a 10.24% average accuracy gain over state-of-the-art methods.
arXiv Detail & Related papers (2025-07-23T10:04:30Z)
ProtoGuard-guided PROPEL: Class-Aware Prototype Enhancement and Progressive Labeling for Incremental 3D Point Cloud Segmentation [9.578322021478426]
offline-trained segmentation models may lead to catastrophic forgetting of previously seen classes.<n>ProtoGuard and PROPEL are proposed to address the problem of catastrophic forgetting.<n>Our approach achieves remarkable results on both the S3DIS and ScanNet datasets.
arXiv Detail & Related papers (2025-04-02T11:58:04Z)
InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception [17.530797215534456]
3D scene understanding has become an essential area of research with applications in autonomous driving, robotics, and augmented reality.<n>We propose InstanceGaussian, a method that jointly learns appearance and semantic features while adaptively aggregating instances.<n>Our approach achieves state-of-the-art performance in category-agnostic, open-vocabulary 3D point-level segmentation.
arXiv Detail & Related papers (2024-11-28T16:08:36Z)
Covariance-based Space Regularization for Few-shot Class Incremental Learning [25.435192867105552]
Few-shot Class Incremental Learning (FSCIL) requires the model to continually learn new classes with limited labeled data. Due to the limited data in incremental sessions, models are prone to overfitting new classes and suffering catastrophic forgetting of base classes. Recent advancements resort to prototype-based approaches to constrain the base class distribution and learn discriminative representations of new classes.
arXiv Detail & Related papers (2024-11-02T08:03:04Z)
TeFF: Tracking-enhanced Forgetting-free Few-shot 3D LiDAR Semantic Segmentation [10.628870775939161]
This paper addresses the limitations of current few-shot semantic segmentation by exploiting the temporal continuity of LiDAR data. We employ a tracking model to generate pseudo-ground-truths from a sequence of LiDAR frames, enhancing the dataset's ability to learn on novel classes. We incorporate LoRA, a technique that reduces the number of trainable parameters, thereby preserving the model's performance on base classes while improving its adaptability to novel classes.
arXiv Detail & Related papers (2024-08-28T09:18:36Z)
Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation [50.51125319374404]
We propose a novel self-training network InsTeacher3D to explore and exploit pure instance knowledge from unlabeled data. Experimental results on multiple large-scale datasets show that the InsTeacher3D significantly outperforms prior state-of-the-art semi-supervised approaches.
arXiv Detail & Related papers (2024-06-24T16:35:58Z)
Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation [56.1776710527814]
Weakly Incremental Learning for Semantic (WILSS) leverages a pre-trained segmentation model to segment new classes using cost-effective and readily available image-level labels. A prevailing way to solve WILSS is the generation of seed areas for each new class, serving as a form of pixel-level supervision. We propose an innovative, tendency-driven relationship of mutual exclusivity, meticulously tailored to govern the behavior of the seed areas.
arXiv Detail & Related papers (2024-04-18T08:23:24Z)
OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning [57.43911113915546]
Few-Shot Class-Incremental Learning (FSCIL) introduces a paradigm in which the problem space expands with limited data. FSCIL methods inherently face the challenge of catastrophic forgetting as data arrives incrementally. We propose the OrCo framework built on two core principles: features' orthogonality in the representation space, and contrastive learning.
arXiv Detail & Related papers (2024-03-27T13:30:48Z)
Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection [111.0991686509715]
We study the class imbalance problem for semi-supervised object detection (SSOD) under more challenging scenarios. We propose a simple yet effective gradient-based sampling framework that tackles the class imbalance problem from the perspective of two types of confirmation biases. Experiments on three proposed sub-tasks, namely MS-COCO, MS-COCO to Object365 and LVIS, suggest that our method outperforms current class imbalanced object detectors by clear margins.
arXiv Detail & Related papers (2024-03-22T11:30:10Z)
Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class. Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants [166.916517335816]
In this paper, we offer a unified solution to the misalignment dilemma in the three tasks. We propose neural collapse terminus that is a fixed structure with the maximal equiangular inter-class separation for the whole label space. Our method holds the neural collapse optimality in an incremental fashion regardless of data imbalance or data scarcity.
arXiv Detail & Related papers (2023-08-03T13:09:59Z)
Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling [38.07637524378327]
Unsupervised domain adaptation (DA) with the aid of pseudo labeling techniques has emerged as a crucial approach for domain-adaptive 3D object detection. Existing DA methods suffer from a substantial drop in performance when applied to a multi-class training setting. We propose a novel ReDB framework tailored for learning to detect all classes at once.
arXiv Detail & Related papers (2023-07-16T04:34:11Z)
S3C: Self-Supervised Stochastic Classifiers for Few-Shot Class-Incremental Learning [22.243176199188238]
Few-shot class-incremental learning (FSCIL) aims to learn progressively about new classes with very few labeled samples, without forgetting the knowledge of already learnt classes. FSCIL suffers from two major challenges: (i) over-fitting on the new classes due to limited amount of data, (ii) catastrophically forgetting about the old classes due to unavailability of data from these classes in the incremental stages.
arXiv Detail & Related papers (2023-07-05T12:41:46Z)
Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes. Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models. Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
Constrained Few-shot Class-incremental Learning [14.646083882851928]
Continually learning new classes from fresh data without forgetting previous knowledge of old classes is a very challenging research problem. We propose C-FSCIL, which is architecturally composed of a frozen meta-learned feature extractor, a trainable fixed-size fully connected layer, and a rewritable dynamically growing memory. C-FSCIL provides three update modes that offer a trade-off between accuracy and compute-memory cost of learning novel classes.
arXiv Detail & Related papers (2022-03-30T18:19:36Z)
Modality-Aware Triplet Hard Mining for Zero-shot Sketch-Based Image Retrieval [51.42470171051007]
This paper tackles the Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) problem from the viewpoint of cross-modality metric learning. By combining two fundamental learning approaches in DML, e.g., classification training and pairwise training, we set up a strong baseline for ZS-SBIR. We show that Modality-Aware Triplet Hard Mining (MATHM) enhances the baseline with three types of pairwise learning.
arXiv Detail & Related papers (2021-12-15T08:36:44Z)
Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios. For dataset bias due to different samplers, we propose shifted batch normalization. Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z)
Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier [68.38233199030908]
Long-tail recognition tackles the natural non-uniformly distributed data in realworld scenarios. While moderns perform well on populated classes, its performance degrades significantly on tail classes. Deep-RTC is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions.
arXiv Detail & Related papers (2020-07-20T05:57:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.