CellFMCount: A Fluorescence Microscopy Dataset, Benchmark, and Methods for Cell Counting
- URL: http://arxiv.org/abs/2511.19351v1
- Date: Mon, 24 Nov 2025 17:53:59 GMT
- Title: CellFMCount: A Fluorescence Microscopy Dataset, Benchmark, and Methods for Cell Counting
- Authors: Abdurahman Ali Mohammed, Catherine Fonder, Ying Wei, Wallapak Tavanapong, Donald S Sakaguchi, Qi Li, Surya K. Mallapragada
- Abstract summary: We introduce a large-scale annotated dataset comprising $3{,}023$ images from immunocytochemistry experiments related to cellular differentiation. The dataset presents significant challenges: high cell density, overlapping and morphologically diverse cells, a long-tailed distribution of cell count per image, and variation in staining protocols. We implement a density-map-based adaptation of the Segment Anything Model (SAM) and report a mean absolute error (MAE) of $22.12$, which outperforms existing approaches.
- Score: 10.191134773938566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate cell counting is essential in various biomedical research and clinical applications, including cancer diagnosis, stem cell research, and immunology. Manual counting is labor-intensive and error-prone, motivating automation through deep learning techniques. However, training reliable deep learning models requires large amounts of high-quality annotated data, which is difficult and time-consuming to produce manually. Consequently, existing cell-counting datasets are often limited, frequently containing fewer than $500$ images. In this work, we introduce a large-scale annotated dataset comprising $3{,}023$ images from immunocytochemistry experiments related to cellular differentiation, containing over $430{,}000$ manually annotated cell locations. The dataset presents significant challenges: high cell density, overlapping and morphologically diverse cells, a long-tailed distribution of cell count per image, and variation in staining protocols. We benchmark three categories of existing methods: regression-based, crowd-counting, and cell-counting techniques on a test set with cell counts ranging from $10$ to $2{,}126$ cells per image. We also evaluate how the Segment Anything Model (SAM) can be adapted for microscopy cell counting using only dot-annotated datasets. As a case study, we implement a density-map-based adaptation of SAM (SAM-Counter) and report a mean absolute error (MAE) of $22.12$, which outperforms existing approaches (second-best MAE of $27.46$). Our results underscore the value of the dataset and the benchmarking framework for driving progress in automated cell counting and provide a robust foundation for future research and development.
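The abstract refers to density-map-based counting from dot annotations and to MAE as the evaluation metric. The sketch below illustrates both ideas in a generic way; it is not the authors' SAM-Counter implementation. The function names, the Gaussian width `sigma`, and the toy coordinates are assumptions made for illustration only.

```python
import numpy as np


def density_map(points, shape, sigma=2.0):
    """Turn dot annotations into a density map whose sum equals the cell count.

    points: iterable of (row, col) cell locations inside the image (hypothetical data).
    shape:  (height, width) of the image.
    sigma:  Gaussian width in pixels (an assumed value, not taken from the paper).
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    dmap = np.zeros(shape, dtype=np.float64)
    for py, px in points:
        g = np.exp(-((ys - py) ** 2 + (xs - px) ** 2) / (2.0 * sigma ** 2))
        # Normalise each blob so that every annotated cell contributes exactly 1,
        # even when the Gaussian is clipped by the image border.
        dmap += g / g.sum()
    return dmap


def mean_absolute_error(pred_counts, true_counts):
    """MAE between predicted and ground-truth per-image counts."""
    pred = np.asarray(pred_counts, dtype=np.float64)
    true = np.asarray(true_counts, dtype=np.float64)
    return float(np.mean(np.abs(pred - true)))


if __name__ == "__main__":
    dots = [(5, 5), (10, 20), (0, 0)]       # three toy cell locations
    dmap = density_map(dots, shape=(32, 32))
    print(round(dmap.sum(), 6))             # integrating the map recovers the count
    print(mean_absolute_error([10, 20], [12, 17]))
```

In a density-map counter, a network is trained to regress a map like `dmap` from the image, and the predicted count is the sum over the predicted map; MAE then averages the absolute count error over the test images.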
Related papers
- CellVerse: Do Large Language Models Really Understand Cell Biology? [74.34984441715517]
We introduce CellVerse, a unified language-centric question-answering benchmark that integrates four types of single-cell multi-omics data. We systematically evaluate the performance across 14 open-source and closed-source LLMs ranging from 160M to 671B on CellVerse.
arXiv Detail & Related papers (2025-05-09T06:47:23Z) - Cell as Point: One-Stage Framework for Efficient Cell Tracking [56.98130311648714]
We propose CAP, a novel end-to-end one-stage framework that reimagines cell tracking by treating Cell as Point. Unlike traditional methods, CAP eliminates the need for explicit detection or segmentation, instead jointly tracking cells for sequences in one stage by leveraging the inherent correlations among their trajectories. CAP demonstrates promising cell tracking performance and is 8 to 32 times more efficient than existing methods.
arXiv Detail & Related papers (2024-11-22T10:16:35Z) - IDCIA: Immunocytochemistry Dataset for Cellular Image Analysis [0.5057850174013127]
We present a new annotated microscopic cellular image dataset to improve the effectiveness of machine learning methods for cellular image analysis.
Our dataset includes microscopic images of cells, and for each image, the cell count and the location of individual cells.
arXiv Detail & Related papers (2024-11-13T19:33:08Z) - MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation-maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z) - UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z) - Single-cell Multi-view Clustering via Community Detection with Unknown Number of Clusters [64.31109141089598]
We introduce scUNC, an innovative multi-view clustering approach tailored for single-cell data.
scUNC seamlessly integrates information from different views without the need for a predefined number of clusters.
We conducted a comprehensive evaluation of scUNC using three distinct single-cell datasets.
arXiv Detail & Related papers (2023-11-28T08:34:58Z) - The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cells in Microstructures [29.29348484938194]
The trapped yeast cell (TYC) dataset is a novel dataset for understanding instance-level semantics and motions of cells in microstructures.
TYC offers ten times more instance annotations than the previously largest dataset, including cells and microstructures.
arXiv Detail & Related papers (2023-08-23T13:10:33Z) - VOLTA: an Environment-Aware Contrastive Cell Representation Learning for Histopathology [0.3436781233454516]
We propose a self-supervised framework (VOLTA) for cell representation learning in histopathology images.
We subjected our model to extensive experiments on the data collected from multiple institutions around the world.
To showcase the potential power of our proposed framework, we applied VOLTA to ovarian and endometrial cancers with very small sample sizes.
arXiv Detail & Related papers (2023-03-08T16:35:47Z) - Deep neural networks approach to microbial colony detection -- a comparative analysis [52.77024349608834]
This study investigates the performance of three deep learning approaches for object detection on the AGAR dataset.
The achieved results may serve as a benchmark for future experiments.
arXiv Detail & Related papers (2021-08-23T12:06:00Z) - Classification Beats Regression: Counting of Cells from Greyscale Microscopic Images based on Annotation-free Training Samples [20.91256120719461]
This work proposes a supervised learning framework to count cells from greyscale microscopic images without using annotated training images.
We formulate the cell counting task as an image classification problem, where the cell counts are taken as class labels.
To deal with these limitations, we propose a simple but effective data augmentation (DA) method to synthesize images for the unseen cell counts.
arXiv Detail & Related papers (2020-10-28T06:19:30Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture that allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z) - Learning to segment clustered amoeboid cells from brightfield microscopy via multi-task learning with adaptive weight selection [6.836162272841265]
We introduce a novel supervised technique for cell segmentation in a multi-task learning paradigm.
A combination of a multi-task loss, based on the region and cell boundary detection, is employed for an improved prediction efficiency of the network.
We observe an overall Dice score of 0.93 on the validation set, which is an improvement of over 15.9% on a recent unsupervised method, and outperforms the popular supervised U-net algorithm by at least 5.8% on average.
arXiv Detail & Related papers (2020-05-19T11:31:53Z)