Combining Datasets with Different Label Sets for Improved Nucleus
Segmentation and Classification
- URL: http://arxiv.org/abs/2310.03346v1
- Date: Thu, 5 Oct 2023 06:56:54 GMT
- Authors: Amruta Parulekar, Utkarsh Kanwat, Ravi Kant Gupta, Medha Chippa,
Thomas Jacob, Tripti Bameta, Swapnil Rane, Amit Sethi
- Abstract summary: We propose a method to train deep neural networks (DNNs) on multiple datasets where the sets of classes are related but not the same.
Specifically, our method is designed to utilize a coarse-to-fine class hierarchy, where the set of classes labeled and annotated in a dataset can be at any level of the hierarchy.
Our results demonstrate that segmentation and classification metrics for the class set used by the test split of a dataset can improve by pre-training on another dataset.
- Score: 2.5016806673359393
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Segmentation and classification of cell nuclei in histopathology images using
deep neural networks (DNNs) can save pathologists' time for diagnosing various
diseases, including cancers, by automating cell counting and morphometric
assessments. It is now well-known that the accuracy of DNNs increases with the
sizes of annotated datasets available for training. Although multiple datasets
of histopathology images with nuclear annotations and class labels have been
made publicly available, the sets of class labels differ across these datasets.
We propose a method to train DNNs for instance segmentation and classification
on multiple datasets where the sets of classes across the datasets are related
but not the same. Specifically, our method is designed to utilize a
coarse-to-fine class hierarchy, where the set of classes labeled and annotated
in a dataset can be at any level of the hierarchy, as long as the classes are
mutually exclusive. Within a dataset, the classes need not even be at
the same level of the class hierarchy tree. Our results demonstrate that
segmentation and classification metrics for the class set used by the test
split of a dataset can improve by pre-training on another dataset, even one
with a different set of classes, because our method expands the usable
training set. Furthermore, generalization to previously unseen
datasets also improves by combining multiple other datasets with different sets
of classes for training. The improvement is both qualitative and quantitative.
The proposed method can be adapted for various loss functions, DNN
architectures, and application domains.
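The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the network outputs a softmax over the finest-level (leaf) classes and scores a label at any level of the hierarchy by summing the probabilities of the leaves beneath it. The hierarchy, the class names, and the functions `leaves_under` and `hierarchical_nll` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coarse-to-fine hierarchy (parent -> children).
# Labels without an entry are leaves. All names are illustrative assumptions.
HIERARCHY = {
    "nucleus": ["inflammatory", "epithelial", "stromal"],
    "inflammatory": ["lymphocyte", "macrophage", "neutrophil"],
    "epithelial": ["tumor_epithelial", "normal_epithelial"],
}


def leaves_under(label):
    """Return the leaf classes that descend from `label` (or `label` itself)."""
    children = HIERARCHY.get(label)
    if not children:
        return [label]
    leaves = []
    for child in children:
        leaves.extend(leaves_under(child))
    return leaves


# The model is assumed to predict one logit per leaf class, in this order.
LEAVES = leaves_under("nucleus")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_nll(logits, label):
    """Negative log-likelihood of a label given at any level of the hierarchy.

    The probability of a coarse label is the sum of the probabilities of the
    leaves beneath it, so coarse and fine annotations share one output layer.
    """
    probs = dict(zip(LEAVES, softmax(logits)))
    p_label = sum(probs[leaf] for leaf in leaves_under(label))
    return -math.log(p_label)


if __name__ == "__main__":
    uniform = [0.0] * len(LEAVES)
    for label in ("lymphocyte", "inflammatory", "nucleus"):
        print(label, hierarchical_nll(uniform, label))
```

With uniform logits over the six leaves, a coarse label such as `inflammatory` covers three leaves and is penalized less than a fine label such as `lymphocyte`, so a dataset annotated only at the coarse level can still contribute a training signal alongside a finely annotated one.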
Related papers
- A data-centric approach for assessing progress of Graph Neural Networks [7.2249434861826325]
Graph Neural Networks (GNNs) have achieved state-of-the-art results in node classification tasks.
Most improvements are in multi-class classification, with less focus on the cases where each node could have multiple labels.
The first challenge in studying multi-label node classification is the scarcity of publicly available datasets.
arXiv Detail & Related papers (2024-06-18T09:41:40Z)
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting to various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Generating Hierarchical Structures for Improved Time Series Classification Using Stochastic Splitting Functions [0.0]
This study introduces a novel hierarchical divisive clustering approach with stochastic splitting functions (SSFs) to enhance classification performance in multi-class datasets through hierarchical classification (HC).
The method has the unique capability of generating hierarchy without requiring explicit information, making it suitable for datasets lacking prior knowledge of hierarchy.
arXiv Detail & Related papers (2023-09-21T10:34:50Z)
- Domain Adaptive Nuclei Instance Segmentation and Classification via Category-aware Feature Alignment and Pseudo-labelling [65.40672505658213]
We propose a novel deep neural network, namely Category-Aware feature alignment and Pseudo-Labelling Network (CAPL-Net) for UDA nuclei instance segmentation and classification.
Our approach outperforms state-of-the-art UDA methods by a remarkable margin.
arXiv Detail & Related papers (2022-07-04T07:05:06Z)
- Learning Semantic Segmentation from Multiple Datasets with Label Shifts [101.24334184653355]
This paper proposes UniSeg, an effective approach to automatically train models across multiple datasets with differing label spaces.
Specifically, we propose two losses that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains.
arXiv Detail & Related papers (2022-02-28T18:55:19Z)
- Dominant Set-based Active Learning for Text Classification and its Application to Online Social Media [0.0]
We present a novel pool-based active learning method for training on a large unlabeled corpus with minimum annotation cost.
Our proposed method does not have any parameters to be tuned, making it dataset-independent.
Our method achieves a higher performance in comparison to the state-of-the-art active learning strategies.
arXiv Detail & Related papers (2022-01-28T19:19:03Z)
- A Topological Data Analysis Based Classifier [1.6668132748773563]
This paper proposes an algorithm that applies Topological Data Analysis directly to multi-class classification problems.
The proposed algorithm builds a filtered simplicial complex on the dataset.
On average, the proposed TDABC method was better than KNN and weighted-KNN.
arXiv Detail & Related papers (2021-11-09T15:54:16Z)
- CvS: Classification via Segmentation For Small Datasets [52.821178654631254]
This paper presents CvS, a cost-effective classifier for small datasets that derives the classification labels from predicting the segmentation maps.
We evaluate the effectiveness of our framework on diverse problems showing that CvS is able to achieve much higher classification results compared to previous methods when given only a handful of examples.
arXiv Detail & Related papers (2021-10-29T18:41:15Z)
- A Systematic Evaluation: Fine-Grained CNN vs. Traditional CNN Classifiers [54.996358399108566]
We investigate the performance of the landmark general CNN classifiers, which presented top-notch results on large scale classification datasets.
We compare them against state-of-the-art fine-grained classifiers.
We show an extensive evaluation on six datasets to determine whether the fine-grained classifier is able to elevate the baseline in their experiments.
arXiv Detail & Related papers (2020-03-24T23:49:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.