A Practical Approach to Novel Class Discovery in Tabular Data
- URL: http://arxiv.org/abs/2311.05440v3
- Date: Mon, 3 Jun 2024 08:49:54 GMT
- Title: A Practical Approach to Novel Class Discovery in Tabular Data
- Authors: Colin Troisemaine, Alexandre Reiffers-Masson, Stéphane Gosselin, Vincent Lemaire, Sandrine Vaton
- Abstract summary: Novel Class Discovery (NCD) is the problem of extracting knowledge from a labeled set of known classes to accurately partition an unlabeled set of novel classes.
In this work, we propose to tune the hyperparameters of NCD methods by adapting the $k$-fold cross-validation process and hiding some of the known classes in each fold.
We find that the latent space of this method can be used to reliably estimate the number of novel classes.
- Score: 38.41548083078336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The problem of Novel Class Discovery (NCD) consists in extracting knowledge from a labeled set of known classes to accurately partition an unlabeled set of novel classes. While NCD has recently received a lot of attention from the community, it is often solved on computer vision problems and under unrealistic conditions. In particular, the number of novel classes is usually assumed to be known in advance, and their labels are sometimes used to tune hyperparameters. Methods that rely on these assumptions are not applicable in real-world scenarios. In this work, we focus on solving NCD in tabular data when no prior knowledge of the novel classes is available. To this end, we propose to tune the hyperparameters of NCD methods by adapting the $k$-fold cross-validation process and hiding some of the known classes in each fold. Since we have found that methods with too many hyperparameters are likely to overfit these hidden classes, we define a simple deep NCD model. This method is composed of only the essential elements necessary for the NCD problem and performs impressively well under realistic conditions. Furthermore, we find that the latent space of this method can be used to reliably estimate the number of novel classes. Additionally, we adapt two unsupervised clustering algorithms ($k$-means and Spectral Clustering) to leverage the knowledge of the known classes. Extensive experiments are conducted on 7 tabular datasets and demonstrate the effectiveness of the proposed method and hyperparameter tuning process, and show that the NCD problem can be solved without relying on knowledge from the novel classes.
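The class-hiding cross-validation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the hidden classes are chosen, so the code below assumes the known classes are partitioned into $k$ disjoint groups and each group is hidden once. The names `hidden_class_folds` and `split_fold` are hypothetical helpers, not taken from the paper.

```python
import random
from typing import Hashable, List, Sequence, Tuple


def hidden_class_folds(
    labels: Sequence[Hashable], k: int, seed: int = 0
) -> List[Tuple[list, list]]:
    """Return k (visible_classes, hidden_classes) pairs.

    Assumption (not from the paper): the known classes are shuffled and
    partitioned into k disjoint groups, and each group is hidden in one fold.
    """
    classes = sorted(set(labels))
    if not 2 <= k <= len(classes):
        raise ValueError("k must be between 2 and the number of known classes")
    rng = random.Random(seed)
    rng.shuffle(classes)
    # Round-robin assignment keeps the group sizes within one class of each other.
    groups = [classes[i::k] for i in range(k)]
    folds = []
    for hidden in groups:
        visible = [c for c in classes if c not in hidden]
        folds.append((visible, hidden))
    return folds


def split_fold(
    labels: Sequence[Hashable], hidden: Sequence[Hashable]
) -> Tuple[List[int], List[int]]:
    """Split sample indices into a labeled part and a pseudo-novel part.

    Samples of hidden classes play the role of the unlabeled novel set;
    their labels are used only to score the resulting partition.
    """
    hidden_set = set(hidden)
    labeled_idx = [i for i, y in enumerate(labels) if y not in hidden_set]
    pseudo_novel_idx = [i for i, y in enumerate(labels) if y in hidden_set]
    return labeled_idx, pseudo_novel_idx


if __name__ == "__main__":
    toy_labels = [0, 0, 1, 1, 2, 2, 3, 3]
    for visible, hidden in hidden_class_folds(toy_labels, k=2):
        labeled_idx, pseudo_novel_idx = split_fold(toy_labels, hidden)
        print("visible:", visible, "hidden:", hidden,
              "labeled:", labeled_idx, "pseudo-novel:", pseudo_novel_idx)
```

In a tuning loop, each hyperparameter candidate would be trained on the labeled part of every fold, used to partition the pseudo-novel part, and scored against the hidden labels, with the scores averaged over folds. Because only known classes are involved, no labels from the real novel classes are needed.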
Related papers
- Happy: A Debiased Learning Framework for Continual Generalized Category Discovery [54.54153155039062]
This paper explores the underexplored task of Continual Generalized Category Discovery (C-GCD).
C-GCD aims to incrementally discover new classes from unlabeled data while maintaining the ability to recognize previously learned classes.
We introduce a debiased learning framework, namely Happy, characterized by Hardness-aware prototype sampling and soft entropy regularization.
arXiv Detail & Related papers (2024-10-09T04:18:51Z) - When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis [35.57142091571271]
Novel Class Discovery (NCD) aims at inferring novel classes in an unlabeled set by leveraging prior knowledge from a labeled set with known classes.
This paper bridges the gap by providing an analytical framework to formalize and investigate when and how known classes can help discover novel classes.
arXiv Detail & Related papers (2023-08-09T15:27:21Z) - An Interactive Interface for Novel Class Discovery in Tabular Data [54.11148718494725]
Novel Class Discovery (NCD) is the problem of trying to discover novel classes in an unlabeled set, given a labeled set of different but related classes.
The majority of NCD methods proposed so far only deal with image data.
This interface allows a domain expert to easily run state-of-the-art algorithms for NCD in tabular data.
arXiv Detail & Related papers (2023-06-22T14:32:53Z) - NEV-NCD: Negative Learning, Entropy, and Variance regularization based novel action categories discovery [23.17093125627668]
Novel Categories Discovery (NCD) facilitates learning from a partially annotated label space.
We propose a novel single-stage joint optimization-based NCD method, Negative learning, Entropy, and Variance regularization NCD.
We demonstrate the efficacy of NEV-NCD in previously unexplored NCD applications of video action recognition.
arXiv Detail & Related papers (2023-04-14T19:20:26Z) - Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery [76.63807209414789]
We challenge the status quo in class-iNCD and propose a learning paradigm where class discovery occurs continuously and in a truly unsupervised manner.
We propose simple baselines, composed of a frozen PTM backbone and a learnable linear classifier, that are not only simple to implement but also resilient under longer learning scenarios.
arXiv Detail & Related papers (2023-03-28T13:47:16Z) - Découvrir de nouvelles classes dans des données tabulaires [54.11148718494725]
In Novel Class Discovery (NCD), the goal is to find new classes in an unlabeled set given a labeled set of known but different classes.
We show a way to extract knowledge from already known classes to guide the discovery process of novel classes in heterogeneous data.
arXiv Detail & Related papers (2022-11-28T09:48:55Z) - A Method for Discovering Novel Classes in Tabular Data [54.11148718494725]
In Novel Class Discovery (NCD), the goal is to find new classes in an unlabeled set given a labeled set of known but different classes.
We show a way to extract knowledge from already known classes to guide the discovery process of novel classes in heterogeneous data.
arXiv Detail & Related papers (2022-09-02T11:45:24Z)