Causal integration of chemical structures improves representations of microscopy images for morphological profiling
- URL: http://arxiv.org/abs/2504.09544v2
- Date: Wed, 16 Apr 2025 19:03:34 GMT
- Authors: Yemin Yu, Neil Tenenholtz, Lester Mackey, Ying Wei, David Alvarez-Melis, Ava P. Amini, Alex X. Lu
- Abstract summary: We introduce a representation learning framework, MICON, that models chemical compounds as treatments that induce counterfactual transformations of cell phenotypes. We demonstrate that incorporating chemical compound information into the learning process provides consistent improvements in our evaluation setting. Our findings point to a new direction for representation learning in morphological profiling, suggesting that methods should explicitly account for the multimodal nature of microscopy screening data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in self-supervised deep learning have improved our ability to quantify cellular morphological changes in high-throughput microscopy screens, a process known as morphological profiling. However, most current methods only learn from images, despite many screens being inherently multimodal, as they involve both a chemical or genetic perturbation as well as an image-based readout. We hypothesized that incorporating chemical compound structure during self-supervised pre-training could improve learned representations of images in high-throughput microscopy screens. We introduce a representation learning framework, MICON (Molecular-Image Contrastive Learning), that models chemical compounds as treatments that induce counterfactual transformations of cell phenotypes. MICON significantly outperforms classical hand-crafted features such as CellProfiler and existing deep-learning-based representation learning methods in challenging evaluation settings where models must identify reproducible effects of drugs across independent replicates and data-generating centers. We demonstrate that incorporating chemical compound information into the learning process provides consistent improvements in our evaluation setting and that modeling compounds specifically as treatments in a causal framework outperforms approaches that directly align images and compounds in a single representation space. Our findings point to a new direction for representation learning in morphological profiling, suggesting that methods should explicitly account for the multimodal nature of microscopy screening data.
Related papers
- PhenoProfiler: Advancing Phenotypic Learning for Image-based Drug Discovery [22.153859584729133]
PhenoProfiler is an end-to-end tool that processes whole-slide multi-channel images directly into low-dimensional quantitative representations.
It is rigorously evaluated on large-scale publicly available datasets.
PhenoProfiler consistently outperforms state-of-the-art methods by up to 20%.
arXiv Detail & Related papers (2025-02-26T21:20:43Z) - ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy [3.432992120614117]
We present the largest foundation model for cell microscopy data to date.
Compared to a previously published ViT-L/8 MAE, our new model achieves a 60% improvement in linear separability of genetic perturbations.
arXiv Detail & Related papers (2024-11-04T20:09:51Z) - Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model, which randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z) - Bayesian Unsupervised Disentanglement of Anatomy and Geometry for Deep Groupwise Image Registration [50.62725807357586]
This article presents a general Bayesian learning framework for multi-modal groupwise image registration.
We propose a novel hierarchical variational auto-encoding architecture to realise the inference procedure of the latent variables.
Experiments were conducted to validate the proposed framework, including four different datasets from cardiac, brain, and abdominal medical images.
arXiv Detail & Related papers (2024-01-04T08:46:39Z) - Morphological Profiling for Drug Discovery in the Era of Deep Learning [13.307277432389496]
We provide a comprehensive overview of the recent advances in the field of morphological profiling.
We place a particular emphasis on the application of deep learning in this pipeline.
arXiv Detail & Related papers (2023-12-13T05:08:32Z) - Bridging Synthetic and Real Images: a Transferable and Multiple Consistency aided Fundus Image Enhancement Framework [61.74188977009786]
We propose an end-to-end optimized teacher-student framework to simultaneously conduct image enhancement and domain adaptation.
We also propose a novel multi-stage multi-attention guided enhancement network (MAGE-Net) as the backbone of our teacher and student networks.
arXiv Detail & Related papers (2023-02-23T06:16:15Z) - Semantic Image Segmentation with Deep Learning for Vine Leaf Phenotyping [59.0626764544669]
In this study, we use deep learning methods to semantically segment grapevine leaf images in order to develop an automated object detection system for leaf phenotyping.
Our work contributes to plant lifecycle monitoring through which dynamic traits such as growth and development can be captured and quantified.
arXiv Detail & Related papers (2022-10-24T14:37:09Z) - Learning multi-scale functional representations of proteins from single-cell microscopy data [77.34726150561087]
We show that simple convolutional networks trained on localization classification can learn protein representations that encapsulate diverse functional information.
We also propose a robust evaluation strategy to assess quality of protein representations across different scales of biological function.
arXiv Detail & Related papers (2022-05-24T00:00:07Z) - Defocus Deblur Microscopy via Head-to-Tail Cross-scale Fusion [0.0]
We develop a multi-scale U-Net structure without cascade residual learning.
In contrast to the conventional coarse-to-fine model, our model strengthens the cross-scale interaction.
Our method yields better performance when compared with other existing models.
arXiv Detail & Related papers (2022-01-08T18:53:54Z) - Deep Low-Shot Learning for Biological Image Classification and Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared.
Labeling training data with precise stages is very time-consuming, even for biologists.
We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
arXiv Detail & Related papers (2020-10-20T06:06:06Z) - Improved Conditional Flow Models for Molecule to Image Synthesis [37.886816307827196]
Mol2Image is a flow-based generative model for molecule to cell image synthesis.
To generate cell features at different resolutions and scale to high-resolution images, we develop a novel multi-scale flow architecture.
To maximize the mutual information between the generated images and the molecular interventions, we devise a training strategy based on contrastive learning.
arXiv Detail & Related papers (2020-06-15T16:39:50Z)