NucFuseRank: Dataset Fusion and Performance Ranking for Nuclei Instance Segmentation
- URL: http://arxiv.org/abs/2601.20104v1
- Date: Tue, 27 Jan 2026 22:45:48 GMT
- Title: NucFuseRank: Dataset Fusion and Performance Ranking for Nuclei Instance Segmentation
- Authors: Nima Torbati, Anastasia Meshcheryakova, Ramona Woitek, Sepideh Hatamikia, Diana Mechtcheriakova, Amirreza Mahbod
- Abstract summary: Nuclei instance segmentation in hematoxylin and eosin (H&E)-stained images plays an important role in automated histological image analysis. Most research in this field focuses on developing new segmentation algorithms and benchmarking them on a limited number of arbitrarily selected public datasets. We identified manually annotated, publicly available datasets of H&E-stained images for nuclei instance segmentation and standardized them into a unified input and annotation format.
- Score: 0.9453555561427657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Nuclei instance segmentation in hematoxylin and eosin (H&E)-stained images plays an important role in automated histological image analysis, with various applications in downstream tasks. While several machine learning and deep learning approaches have been proposed for nuclei instance segmentation, most research in this field focuses on developing new segmentation algorithms and benchmarking them on a limited number of arbitrarily selected public datasets. In this work, rather than focusing on model development, we focused on the datasets used for this task. Based on an extensive literature review, we identified manually annotated, publicly available datasets of H&E-stained images for nuclei instance segmentation and standardized them into a unified input and annotation format. Using two state-of-the-art segmentation models, one based on convolutional neural networks (CNNs) and one based on a hybrid CNN and vision transformer architecture, we systematically evaluated and ranked these datasets based on their nuclei instance segmentation performance. Furthermore, we proposed a unified test set (NucFuse-test) for fair cross-dataset evaluation and a unified training set (NucFuse-train) for improved segmentation performance by merging images from multiple datasets. By evaluating and ranking the datasets, performing comprehensive analyses, generating fused datasets, conducting external validation, and making our implementation publicly available, we provided a new benchmark for training, testing, and evaluating nuclei instance segmentation models on H&E-stained histological images.
Related papers
- Towards Ground-truth-free Evaluation of Any Segmentation in Medical Images [22.36128130052757]
We build a ground-truth-free evaluation model to assess the quality of segmentations generated by the Segment Anything Model (SAM) and its variants in medical imaging.
This evaluation model estimates segmentation quality scores by analyzing the coherence and consistency between the input images and their corresponding segmentation predictions.
arXiv Detail & Related papers (2024-09-23T10:12:08Z)
- FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models [28.44922164328789]
Evaluation of text-to-image generative models is an essential step in the development process.
We propose FlashEval, an iterative search algorithm tailored to evaluation data selection.
The 50-item subset found by our search achieves evaluation quality comparable to that of a randomly sampled 500-item subset for COCO annotations.
arXiv Detail & Related papers (2024-03-25T02:53:32Z)
- Few-Shot Learning for Annotation-Efficient Nucleus Instance Segmentation [50.407071700154674]
We propose to formulate annotation-efficient nucleus instance segmentation from the perspective of few-shot learning (FSL).
Our work is motivated by the fact that, with the growth of computational pathology, an increasing number of fully annotated datasets are publicly accessible.
Extensive experiments on a couple of publicly accessible datasets demonstrate that SGFSIS can outperform other annotation-efficient learning baselines.
arXiv Detail & Related papers (2024-02-26T03:49:18Z)
- Diffusion-based Data Augmentation for Nuclei Image Segmentation [68.28350341833526]
We introduce the first diffusion-based augmentation method for nuclei segmentation.
The idea is to synthesize a large number of labeled images to facilitate training the segmentation model.
The experimental results show that by augmenting a 10%-labeled real dataset with synthetic samples, one can achieve comparable segmentation results.
arXiv Detail & Related papers (2023-10-22T06:16:16Z)
- Improving Generalization Capability of Deep Learning-Based Nuclei Instance Segmentation by Non-deterministic Train Time and Deterministic Test Time Stain Normalization [0.674572634849505]
Nuclei instance segmentation plays a fundamental role in a wide range of clinical and research applications.
Deep learning (DL)-based approaches have been shown to deliver the best performances.
We propose a novel method to improve the generalization capability of a DL-based automatic segmentation approach.
arXiv Detail & Related papers (2023-09-12T11:29:35Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- Which Pixel to Annotate: a Label-Efficient Nuclei Segmentation Framework [70.18084425770091]
Deep neural networks have been widely applied in nuclei instance segmentation of H&E stained pathology images.
It is inefficient and unnecessary to label all pixels for a dataset of nuclei images which usually contain similar and redundant patterns.
We propose a novel full nuclei segmentation framework that chooses only a few image patches to be annotated, augments the training set from the selected samples, and achieves nuclei segmentation in a semi-supervised manner.
arXiv Detail & Related papers (2022-12-20T14:53:26Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Domain Adaptive Nuclei Instance Segmentation and Classification via Category-aware Feature Alignment and Pseudo-labelling [65.40672505658213]
We propose a novel deep neural network, namely Category-Aware feature alignment and Pseudo-Labelling Network (CAPL-Net) for UDA nuclei instance segmentation and classification.
Our approach outperforms state-of-the-art UDA methods by a remarkable margin.
arXiv Detail & Related papers (2022-07-04T07:05:06Z)
- Instance Segmentation of Unlabeled Modalities via Cyclic Segmentation GAN [27.936725483892076]
We propose a novel Cyclic Segmentation Generative Adversarial Network (CySGAN) that conducts image translation and instance segmentation jointly.
We benchmark our approach on the task of 3D neuronal nuclei segmentation with annotated electron microscopy (EM) images and unlabeled expansion microscopy (ExM) data.
arXiv Detail & Related papers (2022-04-06T20:46:39Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)