A Method for Discovering Novel Classes in Tabular Data
- URL: http://arxiv.org/abs/2209.01217v1
- Date: Fri, 2 Sep 2022 11:45:24 GMT
- Title: A Method for Discovering Novel Classes in Tabular Data
- Authors: Colin Troisemaine, Joachim Flocon-Cholet, Stéphane Gosselin, Sandrine Vaton, Alexandre Reiffers-Masson, and Vincent Lemaire
- Abstract summary: In Novel Class Discovery (NCD), the goal is to find new classes in an unlabeled set given a labeled set of known but different classes.
We show a way to extract knowledge from already known classes to guide the discovery process of novel classes in heterogeneous data.
- Score: 54.11148718494725
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Novel Class Discovery (NCD), the goal is to find new classes in an
unlabeled set given a labeled set of known but different classes. While NCD has
recently gained attention from the community, no framework has yet been
proposed for heterogeneous tabular data, even though it is a very common
data representation. In this paper, we propose TabularNCD, a new method for
discovering novel classes in tabular data. We show a way to extract knowledge
from already known classes to guide the discovery process of novel classes in
the context of tabular data which contains heterogeneous variables. Part of
this process relies on a new method for defining pseudo labels, and we follow
recent findings in Multi-Task Learning to optimize a joint objective function.
Our method demonstrates that NCD is not only applicable to images but also to
heterogeneous tabular data. Extensive experiments are conducted to evaluate our
method and demonstrate its effectiveness against 3 competitors on 7 diverse
public classification datasets.
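The NCD setting described in the abstract can be illustrated with a minimal sketch. It is not the TabularNCD method itself; it only shows one way to derive a labeled set of known classes and an unlabeled set of novel classes from an ordinary classification dataset. The function name `make_ncd_split`, the toy rows, and the choice of class 2 as the novel class are all assumptions made for illustration.

```python
def make_ncd_split(features, labels, novel_classes):
    """Split a labeled dataset into an NCD problem.

    Rows whose class is in `novel_classes` go to the unlabeled set; their
    labels are withheld from training and kept only for evaluation.
    All other rows form the labeled set of known classes.
    """
    novel = set(novel_classes)
    labeled_x, labeled_y = [], []
    unlabeled_x, hidden_y = [], []
    for x, y in zip(features, labels):
        if y in novel:
            unlabeled_x.append(x)
            hidden_y.append(y)
        else:
            labeled_x.append(x)
            labeled_y.append(y)
    return (labeled_x, labeled_y), (unlabeled_x, hidden_y)


# Toy heterogeneous rows: one numeric and one categorical variable (hypothetical data).
features = [(0.1, "a"), (0.2, "b"), (5.0, "a"), (5.1, "c"), (9.7, "b"), (9.9, "c")]
labels = [0, 0, 1, 1, 2, 2]

(labeled_x, labeled_y), (unlabeled_x, hidden_y) = make_ncd_split(
    features, labels, novel_classes=[2]
)
```

In this sketch the known and novel classes are disjoint, matching the "known but different classes" wording of the abstract; an NCD method would then be trained on `labeled_x`, `labeled_y`, and `unlabeled_x` only.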
Related papers
- Active Generalized Category Discovery [60.69060965936214]
Generalized Category Discovery (GCD) endeavors to cluster unlabeled samples from both novel and old classes.
We take the spirit of active learning and propose a new setting called Active Generalized Category Discovery (AGCD).
Our method achieves state-of-the-art performance on both generic and fine-grained datasets.
arXiv Detail & Related papers (2024-03-07T07:12:24Z)
- A Practical Approach to Novel Class Discovery in Tabular Data [38.41548083078336]
Novel Class Discovery (NCD) is a problem of extracting knowledge from a labeled set of known classes to accurately partition an unlabeled set of novel classes.
In this work, we propose to tune the hyperparameters of NCD methods by adapting the $k$-fold cross-validation process and hiding some of the known classes in each fold.
We find that the latent space of this method can be used to reliably estimate the number of novel classes.
arXiv Detail & Related papers (2023-11-09T15:24:44Z)
- MetaGCD: Learning to Continually Learn in Generalized Category Discovery [26.732455383707798]
We consider a real-world scenario where a model that is trained on pre-defined classes continually encounters unlabeled data.
The goal is to continually discover novel classes while maintaining performance on known classes.
We propose an approach, called MetaGCD, to learn how to incrementally discover with less forgetting.
arXiv Detail & Related papers (2023-08-21T22:16:49Z)
- An Interactive Interface for Novel Class Discovery in Tabular Data [54.11148718494725]
Novel Class Discovery (NCD) is the problem of trying to discover novel classes in an unlabeled set, given a labeled set of different but related classes.
The majority of NCD methods proposed so far only deal with image data.
This interface allows a domain expert to easily run state-of-the-art algorithms for NCD in tabular data.
arXiv Detail & Related papers (2023-06-22T14:32:53Z)
- Dynamic Conceptional Contrastive Learning for Generalized Category Discovery [76.82327473338734]
Generalized category discovery (GCD) aims to automatically cluster partially labeled data.
Unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories.
One effective approach to GCD is to apply self-supervised learning to learn discriminative representations for unlabeled data.
We propose a Dynamic Conceptional Contrastive Learning framework, which can effectively improve clustering accuracy.
arXiv Detail & Related papers (2023-03-30T14:04:39Z)
- Découvrir de nouvelles classes dans des données tabulaires [54.11148718494725]
In Novel Class Discovery (NCD), the goal is to find new classes in an unlabeled set given a labeled set of known but different classes.
We show a way to extract knowledge from already known classes to guide the discovery process of novel classes in heterogeneous data.
arXiv Detail & Related papers (2022-11-28T09:48:55Z)
- Class-incremental Novel Class Discovery [76.35226130521758]
We study the new task of class-incremental Novel Class Discovery (class-iNCD).
We propose a novel approach for class-iNCD which prevents forgetting of past information about the base classes.
Our experiments, conducted on three common benchmarks, demonstrate that our method significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-18T13:49:27Z)
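The entry "A Practical Approach to Novel Class Discovery in Tabular Data" in the list above mentions adapting $k$-fold cross-validation by hiding some of the known classes in each fold. The summary gives no further detail, so the following is only one plausible reading: the known classes are partitioned into k groups, and each fold treats one group as if it were novel. The function name `hidden_class_folds`, the shuffling, and the seed are assumptions, not the authors' procedure.

```python
import random


def hidden_class_folds(known_classes, k, seed=0):
    """Yield (kept_classes, hidden_classes) pairs, one per fold.

    The known classes are shuffled and dealt into k groups; in each fold
    one group is hidden and can be treated as a stand-in for novel classes.
    """
    classes = sorted(known_classes)
    if not 1 < k <= len(classes):
        raise ValueError("k must be between 2 and the number of known classes")
    rng = random.Random(seed)
    rng.shuffle(classes)
    groups = [classes[i::k] for i in range(k)]
    for hidden in groups:
        kept = [c for c in classes if c not in hidden]
        yield sorted(kept), sorted(hidden)


folds = list(hidden_class_folds(known_classes=range(6), k=3))
for kept, hidden in folds:
    print("known:", kept, "hidden (treated as novel):", hidden)
```

Under this reading, a hyperparameter setting could be scored in each fold by how well the hidden classes are recovered, since their true labels are available, and the scores averaged over folds.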