NeurNCD: Novel Class Discovery via Implicit Neural Representation
- URL: http://arxiv.org/abs/2506.06412v1
- Date: Fri, 06 Jun 2025 16:43:34 GMT
- Title: NeurNCD: Novel Class Discovery via Implicit Neural Representation
- Authors: Junming Wang, Yi Shi
- Abstract summary: NeurNCD is a versatile and data-efficient framework for novel class discovery. Our framework achieves superior segmentation performance in both open and closed-world settings. Our method significantly outperforms state-of-the-art approaches on the NYUv2 and Replica datasets.
- Score: 4.498082064000176
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discovering novel classes in open-world settings is crucial for real-world applications. Traditional explicit representations, such as object descriptors or 3D segmentation maps, are constrained by their discrete, hole-prone, and noisy nature, which hinders accurate novel class discovery. To address these challenges, we introduce NeurNCD, the first versatile and data-efficient framework for novel class discovery that employs the meticulously designed Embedding-NeRF model combined with KL divergence as a substitute for traditional explicit 3D segmentation maps to aggregate semantic embedding and entropy in visual embedding space. NeurNCD also integrates several key components, including feature query, feature modulation and clustering, facilitating efficient feature augmentation and information exchange between the pre-trained semantic segmentation network and implicit neural representations. As a result, our framework achieves superior segmentation performance in both open and closed-world settings without relying on densely labelled datasets for supervised training or human interaction to generate sparse label supervision. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art approaches on the NYUv2 and Replica datasets.
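The abstract names two quantities, KL divergence and entropy, without defining them. As a reminder of what they measure, here is a minimal sketch for discrete class distributions. It is not the paper's implementation: the function names and the example distributions (`p_seg`, `p_field`) are hypothetical and chosen only for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution p."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against log(0) when q has zeros."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical per-pixel class distributions (illustrative values only):
# one from a pre-trained segmentation network, one from an implicit field.
p_seg = [0.7, 0.2, 0.1]
p_field = [0.5, 0.3, 0.2]

print("entropy(p_seg):", entropy(p_seg))
print("KL(p_seg || p_field):", kl_divergence(p_seg, p_field))
```

A high entropy marks an uncertain prediction, and a large KL divergence marks disagreement between the two distributions. Either signal could plausibly flag regions that do not fit the known classes.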
Related papers
- Towards Open-World Human Action Segmentation Using Graph Convolutional Networks [6.167678490008973]
Most existing learning-based methods excel in closed-world action segmentation.
We propose a structured framework for detecting and segmenting unseen actions.
We evaluate our framework on two challenging human-object recognition datasets.
arXiv Detail & Related papers (2025-07-01T14:00:39Z)
- Improving Open-Set Semantic Segmentation in 3D Point Clouds by Conditional Channel Capacity Maximization: Preliminary Results [1.1328543389752008]
We propose a plug-and-play framework for Open-Set Semantic Segmentation (O3S).
By modeling the segmentation pipeline as a conditional Markov chain, we derive a novel regularizer term dubbed Conditional Channel Capacity Maximization (3CM).
We show that 3CM encourages the encoder to retain richer, label-dependent features, thereby enhancing the network's ability to distinguish and segment previously unseen categories.
arXiv Detail & Related papers (2025-05-09T04:12:26Z)
- Exclusive Style Removal for Cross Domain Novel Class Discovery [15.868889486516306]
Novel Class Discovery (NCD) is a promising field in open-world learning.
We introduce an exclusive style removal module for extracting style information that is distinctive from the baseline features.
This module is easy to integrate with other NCD methods, acting as a plug-in to improve performance on novel classes with different distributions.
arXiv Detail & Related papers (2024-06-26T07:44:27Z)
- Cross-domain Open-world Discovery [3.9199802599782387]
We present CROW, a prototype-based approach that introduces a cluster-then-match strategy enabled by a well-structured representation space of foundation models.
In this way, CROW discovers novel classes by robustly matching clusters with previously seen classes, followed by fine-tuning the representation space.
CROW outperforms alternative baselines, achieving an 8% average performance improvement across 75 experimental settings.
arXiv Detail & Related papers (2024-06-17T11:20:09Z)
- Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment [55.11291053011696]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited.
To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy.
In the limited reconstruction case, our proposed approach, termed WS3D++, ranks 1st on the large-scale ScanNet benchmark.
arXiv Detail & Related papers (2023-12-01T15:47:04Z)
- Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge by employing local-based representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)
- GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
A promising solution is to take advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z)
- Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval [51.53892300802014]
We show that supervised neural information retrieval models are prone to learning sparse attention patterns over passage tokens.
Using a novel targeted synthetic data generation method, we teach neural IR to attend more uniformly and robustly to all entities in a given passage.
arXiv Detail & Related papers (2022-04-24T22:36:48Z)
- Novel Class Discovery in Semantic Segmentation [104.30729847367104]
We introduce a new setting of Novel Class Discovery in Semantic Segmentation (NCDSS).
It aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes.
In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image.
We propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels.
arXiv Detail & Related papers (2021-12-03T13:31:59Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.