Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines
- URL: http://arxiv.org/abs/2507.10737v1
- Date: Mon, 14 Jul 2025 19:01:06 GMT
- Title: Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines
- Authors: Jiayuan Chen, Thai-Hoang Pham, Yuanlong Wang, Ping Zhang
- Abstract summary: We propose a framework that integrates external biological knowledge into existing pretraining strategies to enhance microscopy image profiling models. Our approach explicitly disentangles perturbation-specific and cell line-specific representations using external biological information. Experimental results demonstrate that our method enhances microscopy image profiling for de novo cell lines.
- Score: 6.8917447861745735
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: High-throughput screening techniques, such as microscopy imaging of cellular responses to genetic and chemical perturbations, play a crucial role in drug discovery and biomedical research. However, robust perturbation screening for de novo cell lines remains challenging due to the significant morphological and biological heterogeneity across cell lines. To address this, we propose a novel framework that integrates external biological knowledge into existing pretraining strategies to enhance microscopy image profiling models. Our approach explicitly disentangles perturbation-specific and cell line-specific representations using external biological information. Specifically, we construct a knowledge graph leveraging protein interaction data from STRING and Hetionet databases to guide models toward perturbation-specific features during pretraining. Additionally, we incorporate transcriptomic features from single-cell foundation models to capture cell line-specific representations. By learning these disentangled features, our method improves the generalization of imaging models to de novo cell lines. We evaluate our framework on the RxRx database through one-shot fine-tuning on an RxRx1 cell line and few-shot fine-tuning on cell lines from the RxRx19a dataset. Experimental results demonstrate that our method enhances microscopy image profiling for de novo cell lines, highlighting its effectiveness in real-world phenotype-based drug discovery applications.
Related papers
- PixCell: A generative foundation model for digital histopathology images [49.00921097924924]
We introduce PixCell, the first diffusion-based generative foundation model for histopathology. We train PixCell on PanCan-30M, a vast, diverse dataset derived from 69,184 H&E-stained whole slide images covering various cancer types.
arXiv Detail & Related papers (2025-06-05T15:14:32Z)
- Reconstructing Cell Lineage Trees from Phenotypic Features with Metric Learning [0.0]
A key approach to studying developmental processes is to infer the tree graph of cell lineage division and differentiation histories. Here, we introduce CellTreeQM, a novel deep learning method that learns an embedding space with geometric properties optimized for tree-graph inference.
arXiv Detail & Related papers (2025-03-18T05:41:03Z)
- HistoSmith: Single-Stage Histology Image-Label Generation via Conditional Latent Diffusion for Enhanced Cell Segmentation and Classification [0.19791587637442667]
This study introduces a novel single-stage approach for generating image-label pairs to augment histology datasets. Unlike state-of-the-art methods that utilize diffusion models with separate components for label and image generation, our approach employs a latent diffusion model. This model enables tailored data generation by conditioning on user-defined parameters such as cell types, quantities, and tissue types.
arXiv Detail & Related papers (2025-02-12T19:51:41Z)
- DiffKillR: Killing and Recreating Diffeomorphisms for Cell Annotation in Dense Microscopy Images [105.46086313858062]
We introduce DiffKillR, a novel framework that reframes cell annotation as the combination of archetype matching and image registration tasks. DiffKillR efficiently propagates annotations across large microscopy images, reducing the need for extensive manual labeling. We will discuss the theoretical properties of DiffKillR and validate it on three microscopy tasks, demonstrating its advantages over existing supervised, semi-supervised, and unsupervised methods.
arXiv Detail & Related papers (2024-10-04T00:38:29Z)
- Cell-ontology guided transcriptome foundation model [18.51941953027685]
We pre-trained scCello on 22 million cells from CellxGene database leveraging their cell-type labels mapped to the cell ontology graph from Open Biological and Biomedical Ontology Foundry. Our TFM demonstrates competitive generalization and transferability performance over the existing TFMs on biologically important tasks.
arXiv Detail & Related papers (2024-08-22T13:15:49Z)
- Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen [76.02070962797794]
This work introduces CellFlow for Generation (CFGen), a flow-based conditional generative model that preserves the inherent discreteness of single-cell data. CFGen generates whole-genome multi-modal single-cell data reliably, improving the recovery of crucial biological data characteristics.
arXiv Detail & Related papers (2024-07-16T14:05:03Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- VOLTA: an Environment-Aware Contrastive Cell Representation Learning for Histopathology [0.3436781233454516]
We propose a self-supervised framework (VOLTA) for cell representation learning in histopathology images.
We subjected our model to extensive experiments on the data collected from multiple institutions around the world.
To showcase the potential power of our proposed framework, we applied VOLTA to ovarian and endometrial cancers with very small sample sizes.
arXiv Detail & Related papers (2023-03-08T16:35:47Z)
- Learning multi-scale functional representations of proteins from single-cell microscopy data [77.34726150561087]
We show that simple convolutional networks trained on localization classification can learn protein representations that encapsulate diverse functional information.
We also propose a robust evaluation strategy to assess quality of protein representations across different scales of biological function.
arXiv Detail & Related papers (2022-05-24T00:00:07Z)
- Machine learning based lens-free imaging technique for field-portable cytometry [0.0]
The performance of our proposed method shows an increase in accuracy >98% along with the signal enhancement of >5 dB for most of the cell types.
The model adapts to new types of samples within a few learning iterations and is able to successfully classify the newly introduced sample.
arXiv Detail & Related papers (2022-03-02T07:09:29Z)
- From augmented microscopy to the topological transformer: a new approach in cell image analysis for Alzheimer's research [0.0]
Cell image analysis is crucial in Alzheimer's research to detect the presence of Aβ protein inhibiting cell function.
We first found that Unet is the most suitable model for augmented microscopy by comparing performance in multi-class semantic segmentation.
We develop an augmented microscopy method to capture nuclei in a brightfield image, and a transformer that uses the Unet model to convert an input image into a sequence of topological information.
arXiv Detail & Related papers (2021-08-03T16:59:33Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture, which allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.