CellCLIP -- Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning
- URL: http://arxiv.org/abs/2506.06290v2
- Date: Tue, 17 Jun 2025 04:58:45 GMT
- Title: CellCLIP -- Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning
- Authors: Mingyu Lu, Ethan Weinberger, Chanwoo Kim, Su-In Lee
- Abstract summary: We introduce CellCLIP, a cross-modal contrastive learning framework for HCS data. Our framework outperforms current open-source models, demonstrating the best performance in both cross-modal retrieval and biologically meaningful downstream tasks.
- Score: 13.948734618526151
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-content screening (HCS) assays based on high-throughput microscopy techniques such as Cell Painting have enabled the interrogation of cells' morphological responses to perturbations at an unprecedented scale. The collection of such data promises to facilitate a better understanding of the relationships between different perturbations and their effects on cellular state. Towards achieving this goal, recent advances in cross-modal contrastive learning could, in theory, be leveraged to learn a unified latent space that aligns perturbations with their corresponding morphological effects. However, the application of such methods to HCS data is not straightforward due to substantial differences in the semantics of Cell Painting images compared to natural images, and the difficulty of representing different classes of perturbations (e.g., small molecule vs CRISPR gene knockout) in a single latent space. In response to these challenges, here we introduce CellCLIP, a cross-modal contrastive learning framework for HCS data. CellCLIP leverages pre-trained image encoders coupled with a novel channel encoding scheme to better capture relationships between different microscopy channels in image embeddings, along with natural language encoders for representing perturbations. Our framework outperforms current open-source models, demonstrating the best performance in both cross-modal retrieval and biologically meaningful downstream tasks while also achieving significant reductions in computation time.
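To make the training objective concrete, the sketch below shows a generic CLIP-style symmetric contrastive (InfoNCE) loss of the kind such cross-modal frameworks optimize. It is a minimal illustration under stated assumptions, not CellCLIP's actual implementation: the encoder architectures, the channel encoding scheme, the embedding dimension (512), and the temperature value (0.07) are all placeholders.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb: torch.Tensor,
                    text_emb: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired embeddings."""
    # L2-normalize so the dot product is cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # Pairwise similarity logits; matched pairs lie on the diagonal.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    # Contrast in both directions: image -> text and text -> image.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_i2t + loss_t2i)

# Toy usage: a batch of 8 hypothetical embedding pairs, e.g. pooled
# per-channel Cell Painting features vs. encoded perturbation text.
if __name__ == "__main__":
    img = torch.randn(8, 512)
    txt = torch.randn(8, 512)
    print(clip_style_loss(img, txt).item())
```

In a CellCLIP-like setting, `image_emb` would come from pre-trained image encoders applied per microscopy channel and aggregated, and `text_emb` from a natural language encoder over perturbation descriptions; the loss above only illustrates the alignment step shared by CLIP-style methods.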
Related papers
- PixCell: A generative foundation model for digital histopathology images [49.00921097924924]
We introduce PixCell, the first diffusion-based generative foundation model for histopathology. We train PixCell on PanCan-30M, a vast, diverse dataset derived from 69,184 H&E-stained whole slide images covering various cancer types.
arXiv Detail & Related papers (2025-06-05T15:14:32Z) - Interpretable deep learning illuminates multiple structures fluorescence imaging: a path toward trustworthy artificial intelligence in microscopy [10.395551533758358]
We present the Adaptive Explainable Multi-Structure Network (AEMS-Net), a deep-learning framework that enables simultaneous prediction of two subcellular structures from a single image. We demonstrate that AEMS-Net allows real-time recording of interactions between mitochondria and microtubules, requiring only half the conventional sequential-channel imaging procedures.
arXiv Detail & Related papers (2025-01-09T07:36:28Z) - Mitigating Hallucination for Large Vision Language Model by Inter-Modality Correlation Calibration Decoding [66.06337890279839]
Large vision-language models (LVLMs) have shown remarkable capabilities in visual-language understanding for downstream multi-modal tasks. LVLMs still suffer from generating hallucinations in complex generation tasks, leading to inconsistencies between visual inputs and generated content. We propose an Inter-Modality Correlation Calibration Decoding (IMCCD) method to mitigate hallucinations in LVLMs in a training-free manner.
arXiv Detail & Related papers (2025-01-03T17:56:28Z) - Multi-modal Spatial Clustering for Spatial Transcriptomics Utilizing High-resolution Histology Images [1.3124513975412255]
Spatial transcriptomics (ST) enables transcriptome-wide gene expression profiling while preserving spatial context.
Current spatial clustering methods fail to fully integrate high-resolution histology image features with gene expression data.
We propose a novel contrastive learning-based deep learning approach that integrates gene expression data with histology image features.
arXiv Detail & Related papers (2024-10-31T00:32:24Z) - Weakly Supervised Set-Consistency Learning Improves Morphological Profiling of Single-Cell Images [0.6491172192043603]
We propose a set-level consistency learning algorithm, Set-DINO, to improve learned representations of perturbation effects in single-cell images.
We conduct experiments on a large-scale Optical Pooled Screening dataset with more than 5000 genetic perturbations.
arXiv Detail & Related papers (2024-06-08T00:53:30Z) - Practical Guidelines for Cell Segmentation Models Under Optical Aberrations in Microscopy [14.042884268397058]
This study evaluates cell image segmentation models under optical aberrations from fluorescence and bright field microscopy.
We train and test several segmentation models, including the Otsu threshold method and Mask R-CNN with different network heads.
Among the models tested, Cellpose 2.0 proves effective for complex cell images under these conditions.
arXiv Detail & Related papers (2024-04-12T15:45:26Z) - Optimizations of Autoencoders for Analysis and Classification of Microscopic In Situ Hybridization Images [68.8204255655161]
We propose a deep-learning framework to detect and classify areas of microscopic images with similar levels of gene expression.
The data we analyze requires an unsupervised learning model, for which we employ deep-learning autoencoders (see the sketch after this list).
arXiv Detail & Related papers (2023-04-19T13:45:28Z) - AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z) - OADAT: Experimental and Synthetic Clinical Optoacoustic Data for Standardized Image Processing [62.993663757843464]
Optoacoustic (OA) imaging is based on excitation of biological tissues with nanosecond-duration laser pulses followed by detection of ultrasound waves generated via light-absorption-mediated thermoelastic expansion.
OA imaging features a powerful combination between rich optical contrast and high resolution in deep tissues.
However, no standardized datasets spanning different experimental setups and associated processing methods are available to facilitate advances in broader clinical applications of OA.
arXiv Detail & Related papers (2022-06-17T08:11:26Z) - CLAWS: Contrastive Learning with hard Attention and Weak Supervision [1.1619569706231647]
We present CLAWS, an annotation-efficient learning framework, addressing the problem of manually labeling large-scale agricultural datasets.
CLAWS uses a network backbone inspired by SimCLR and weak supervision to investigate the effect of contrastive learning within class clusters.
We compare results between a supervised SimCLR and CLAWS using an agricultural dataset with 227,060 samples consisting of 11 different crop classes.
arXiv Detail & Related papers (2021-12-01T21:45:58Z) - Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
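As referenced in the autoencoder entry above, a minimal convolutional autoencoder of the general kind described there might look as follows. This is a generic sketch, not that paper's architecture: the patch size (64x64, single channel), channel counts, and latent dimension are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Compress image patches to a latent code and reconstruct them."""
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 16 * 16),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Toy training step on random 64x64 single-channel patches.
model = ConvAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(4, 1, 64, 64)
opt.zero_grad()
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)
loss.backward()
opt.step()
```

After training, the latent codes `z` could be clustered (e.g., with k-means) to group image regions with similar expression levels, which is the unsupervised detect-and-classify use the entry describes.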