Masked adversarial neural network for cell type deconvolution in spatial transcriptomics
- URL: http://arxiv.org/abs/2408.05065v1
- Date: Fri, 9 Aug 2024 13:46:28 GMT
- Title: Masked adversarial neural network for cell type deconvolution in spatial transcriptomics
- Authors: Lin Huang, Xiaofei Liu, Shunfang Wang, Wenwen Min
- Abstract summary: We propose a Masked Adversarial Neural Network (MACD) to align real ST data with simulated ST data generated from scRNA-seq data.
We demonstrate its accuracy in performing cell type deconvolution on 32 simulated datasets and 2 real datasets.
- Score: 5.1141169336435945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurately determining cell type composition in disease-relevant tissues is crucial for identifying disease targets. Most existing spatial transcriptomics (ST) technologies cannot achieve single-cell resolution, making it challenging to accurately determine cell types. To address this issue, various deconvolution methods have been developed. Most of these methods use single-cell RNA sequencing (scRNA-seq) data from the same tissue as a reference to infer cell types in ST data spots. However, they often overlook the differences between scRNA-seq and ST data. To overcome this limitation, we propose a Masked Adversarial Neural Network (MACD). MACD employs adversarial learning to align real ST data with simulated ST data generated from scRNA-seq data. By mapping them into a unified latent space, it can minimize the differences between the two types of data. Additionally, MACD uses masking techniques to effectively learn the features of real ST data and mitigate noise. We evaluated MACD on 32 simulated datasets and 2 real datasets, demonstrating its accuracy in performing cell type deconvolution. All code and public datasets used in this paper are available at https://github.com/wenwenmin/MACD and https://zenodo.org/records/12804822.
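The abstract mentions simulated ST data generated from scRNA-seq data but does not describe the procedure. One common way to build such pseudo-spots is to sum the expression of a few randomly drawn reference cells and record their cell type proportions as the ground truth. The sketch below illustrates that idea only; it is an assumption and not necessarily how MACD does it. The function name `simulate_spot` and the toy values are hypothetical.

```python
import random
from collections import Counter


def simulate_spot(cells, labels, n_cells, rng):
    """Build one pseudo-spot by summing the counts of randomly drawn cells.

    cells  : list of per-cell gene-count lists (all the same length)
    labels : cell type label for each cell, aligned with `cells`
    Returns (spot_expression, cell_type_proportions).
    """
    # Draw cell indices with replacement (an assumption of this sketch).
    picked = [rng.randrange(len(cells)) for _ in range(n_cells)]
    n_genes = len(cells[0])
    # A spot's expression is the gene-wise sum over the drawn cells.
    expression = [sum(cells[i][g] for i in picked) for g in range(n_genes)]
    # The known mixture gives the ground-truth proportions for training.
    counts = Counter(labels[i] for i in picked)
    proportions = {t: counts[t] / n_cells for t in sorted(set(labels))}
    return expression, proportions


# Toy reference: 4 cells x 3 genes (hypothetical values, not from the paper).
toy_cells = [[5, 0, 1], [4, 1, 0], [0, 6, 2], [1, 7, 3]]
toy_labels = ["A", "A", "B", "B"]

rng = random.Random(0)
spot_expr, spot_props = simulate_spot(toy_cells, toy_labels, n_cells=3, rng=rng)
print(spot_expr, spot_props)
```

Because the proportions of each pseudo-spot are known by construction, such spots can serve as labeled training data, while real ST spots remain unlabeled.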
Related papers
- Masked Graph Autoencoders with Contrastive Augmentation for Spatially Resolved Transcriptomics Data [1.3124513975412255]
We propose a Contrastively Augmented Masked Graph Autoencoder (STMGAC) to learn low-dimensional latent representations for domain identification.
In the latent space, persistent signals for representations are obtained through self-distillation to guide self-supervised matching.
We evaluate the performance of STMGAC on five datasets, achieving results superior to those of existing baseline methods.
arXiv Detail & Related papers (2024-08-09T02:49:23Z)
- Multi-Source and Test-Time Domain Adaptation on Multivariate Signals using Spatio-Temporal Monge Alignment [59.75420353684495]
Machine learning applications on signals such as computer vision or biomedical data often face challenges due to the variability that exists across hardware devices or session recordings.
In this work, we propose Spatio-Temporal Monge Alignment (STMA) to mitigate these variabilities.
We show that STMA leads to significant and consistent performance gains between datasets acquired with very different settings.
arXiv Detail & Related papers (2024-07-19T13:33:38Z)
- Binary Quantification and Dataset Shift: An Experimental Investigation [54.14283123210872]
Quantification is the supervised learning task that consists of training predictors of the class prevalence values of sets of unlabelled data.
The relationship between quantification and other types of dataset shift remains, by and large, unexplored.
We propose a fine-grained taxonomy of types of dataset shift, by establishing protocols for the generation of datasets affected by these types of shift.
arXiv Detail & Related papers (2023-10-06T20:11:27Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN).
CMMN consists in filtering the signals in order to adapt their power spectrum density (PSD) to a Wasserstein barycenter estimated on training data.
Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture.
arXiv Detail & Related papers (2023-05-30T08:24:01Z)
- Optirank: classification for RNA-Seq data with optimal ranking reference genes [0.0]
We propose a logistic regression model, optirank, which learns simultaneously the parameters of the model and the genes to use as a reference set in the ranking.
We also consider real classification tasks, which present different kinds of distribution shifts between train and test data.
arXiv Detail & Related papers (2023-01-11T10:49:06Z)
- Evaluating COVID-19 Sequence Data Using Nearest-Neighbors Based Network Model [0.0]
SARS-CoV-2 coronavirus is the cause of the COVID-19 disease in humans.
It can adapt to different hosts and evolve into different lineages.
It is well-known that the major SARS-CoV-2 lineages are characterized by mutations that happen predominantly in the spike protein.
arXiv Detail & Related papers (2022-11-19T00:34:02Z)
- MURAL: An Unsupervised Random Forest-Based Embedding for Electronic Health Record Data [59.26381272149325]
We present an unsupervised random forest for representing data with disparate variable types.
MURAL forests consist of a set of decision trees where node-splitting variables are chosen at random.
We show that using our approach, we can visualize and classify data more accurately than competing approaches.
arXiv Detail & Related papers (2021-11-19T22:02:21Z)
- Target-Independent Domain Adaptation for WBC Classification using Generative Latent Search [20.199195698983715]
Unsupervised Domain Adaptation (UDA) techniques presuppose the existence of a sufficient amount of unlabelled target data.
We propose a method for UDA that is devoid of the need for target data.
We prove the existence of such a clone given that an infinite number of data points can be sampled from the source distribution.
arXiv Detail & Related papers (2020-05-11T20:58:23Z)
- Cell Type Identification from Single-Cell Transcriptomic Data via Semi-supervised Learning [2.4271601178529063]
Cell type identification from single-cell transcriptomic data is a common goal of single-cell RNA sequencing (scRNAseq) data analysis.
We propose a semi-supervised learning model that uses unlabeled scRNAseq cells and a limited number of labeled scRNAseq cells to perform cell identification.
It is observed that the proposed model is able to achieve encouraging performance by learning on a very limited number of labeled scRNAseq cells.
arXiv Detail & Related papers (2020-05-06T19:15:43Z)
- MSC: A Dataset for Macro-Management in StarCraft II [52.52008929278214]
We release a new macro-management dataset based on the platform SC2LE.
MSC consists of well-designed feature vectors, pre-defined high-level actions and the final result of each match.
Besides the dataset, we propose a baseline model and present initial baseline results for global state evaluation and build order prediction.
arXiv Detail & Related papers (2017-10-09T14:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.