Weakly Supervised Set-Consistency Learning Improves Morphological Profiling of Single-Cell Images
- URL: http://arxiv.org/abs/2406.05308v1
- Date: Sat, 8 Jun 2024 00:53:30 GMT
- Title: Weakly Supervised Set-Consistency Learning Improves Morphological Profiling of Single-Cell Images
- Authors: Heming Yao, Phil Hanslovsky, Jan-Christian Huetter, Burkhard Hoeckendorf, David Richmond,
- Abstract summary: We propose a set-level consistency learning algorithm, Set-DINO, to improve learned representations of perturbation effects in single-cell images.
We conduct experiments on a large-scale Optical Pooled Screening dataset with more than 5000 genetic perturbations.
- Score: 0.6491172192043603
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optical Pooled Screening (OPS) is a powerful tool combining high-content microscopy with genetic engineering to investigate gene function in disease. The characterization of high-content images remains an active area of research and is currently undergoing rapid innovation through the application of self-supervised learning and vision transformers. In this study, we propose a set-level consistency learning algorithm, Set-DINO, that combines self-supervised learning with weak supervision to improve learned representations of perturbation effects in single-cell images. Our method leverages the replicate structure of OPS experiments (i.e., cells undergoing the same genetic perturbation, both within and across batches) as a form of weak supervision. We conduct extensive experiments on a large-scale OPS dataset with more than 5000 genetic perturbations, and demonstrate that Set-DINO helps mitigate the impact of confounders and encodes more biologically meaningful information. In particular, Set-DINO recalls known biological relationships with higher accuracy compared to commonly used methods for morphological profiling, suggesting that it can generate more reliable insights from drug target discovery campaigns leveraging OPS.
Related papers
- RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency [11.813883157319381]
We propose a novel framework that aligns gene and image features using a ranking-based alignment loss.
To further enhance the alignment's stability, we employ self-supervised knowledge distillation with a teacher-student network architecture.
arXiv Detail & Related papers (2024-11-22T17:08:28Z) - Gene-Level Representation Learning via Interventional Style Transfer in Optical Pooled Screening [3.7038542578642715]
We employ a style-transfer approach to learn gene-level feature representations from images of genetically perturbed cells obtained via Optical pooled screening (OPS)
Our method outperforms widely used engineered features in clustering gene representations according to gene function, demonstrating its utility for uncovering latent biological relationships.
arXiv Detail & Related papers (2024-06-11T22:56:50Z) - Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology [2.7280901660033643]
This work explores the scaling properties of weakly supervised classifiers and self-supervised masked autoencoders (MAEs)
Our results show that ViT-based MAEs outperform weakly supervised classifiers on a variety of tasks, achieving as much as a 11.5% relative improvement when recalling known biological relationships curated from public databases.
We develop a new channel-agnostic MAE architecture (CA-MAE) that allows for inputting images of different numbers and orders of channels at inference time.
arXiv Detail & Related papers (2024-04-16T02:42:06Z) - Single-Cell Deep Clustering Method Assisted by Exogenous Gene
Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z) - Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges.
We first present the model that underlies most of current causal approaches to single-cell biology.
We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z) - Genetic InfoMax: Exploring Mutual Information Maximization in
High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits.
Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS.
We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
arXiv Detail & Related papers (2023-09-26T03:59:21Z) - Unsupervised ensemble-based phenotyping helps enhance the
discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Un Phenotype Ensembles.
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z) - Anchoring to Exemplars for Training Mixture-of-Expert Cell Embeddings [15.964908790433855]
We propose an embedding learning approach that learns a set of experts that are specialized in capturing technical variations in our training set and then aggregates specialist's predictions at test time.
We evaluate our approach on three datasets for tasks like drug discovery, boosting performance on identifying the true mechanism of action of cell treatments by 5.5-11% over the state-of-the-art.
arXiv Detail & Related papers (2021-12-06T18:05:49Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in
Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.