CODE-AE: A Coherent De-confounding Autoencoder for Predicting
Patient-Specific Drug Response From Cell Line Transcriptomics
- URL: http://arxiv.org/abs/2102.00538v1
- Date: Sun, 31 Jan 2021 21:17:44 GMT
- Authors: Di He, Lei Xie
- Abstract summary: We develop a Coherent Deconfounding Autoencoder (CODE-AE) that can extract both common biological signals shared by incoherent samples and private representations unique to each data set.
CODE-AE significantly improves the accuracy and robustness over state-of-the-art methods in both predicting patient drug response and de-confounding biological signals.
- Score: 35.67979269269178
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate and robust prediction of patients' responses to drug treatments is
critical for developing precision medicine. However, it is often difficult to
obtain a sufficient amount of coherent drug response data from patients
directly for training a generalized machine learning model. Although the
utilization of rich cell line data provides an alternative solution, it is
challenging to transfer the knowledge obtained from cell lines to patients due
to various confounding factors. Few existing transfer learning methods can
reliably disentangle common intrinsic biological signals from confounding
factors in the cell line and patient data. In this paper, we develop a Coherent
Deconfounding Autoencoder (CODE-AE) that can extract both common biological
signals shared by incoherent samples and private representations unique to each
data set, transfer knowledge learned from cell line data to tissue data, and
separate confounding factors from them. Extensive studies on multiple data sets
demonstrate that CODE-AE significantly improves the accuracy and robustness
over state-of-the-art methods in both predicting patient drug response and
de-confounding biological signals. Thus, CODE-AE provides a useful framework to
take advantage of in vitro omics data for developing generalized patient
predictive models. The source code is available at
https://github.com/XieResearchGroup/CODE-AE.
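
The split between a shared representation and a data-set-specific private representation, as described in the abstract, can be illustrated with a small sketch. This is not the authors' implementation (see the repository above for that); it is a minimal linear toy model in plain Python. The class name `SharedPrivateAE`, the domain labels, the layer sizes, and the orthogonality penalty are all assumptions made for illustration, since the abstract does not specify the architecture or the loss.

```python
import random


def init_weights(d_in, d_out, rng):
    """Return a d_in x d_out matrix of small random weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(d_out)] for _ in range(d_in)]


def linear(x, W):
    """Multiply vector x (length d_in) by a d_in x d_out matrix W."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) for j in range(len(W[0]))]


class SharedPrivateAE:
    """Toy autoencoder: one shared encoder plus one private encoder per data set.

    Hypothetical illustration only; names and loss terms are assumptions.
    """

    def __init__(self, n_genes, d_latent, domains, seed=0):
        rng = random.Random(seed)
        # One encoder used by every data set (the "common" signal).
        self.shared_enc = init_weights(n_genes, d_latent, rng)
        # One encoder per data set (the "private" signal, e.g. confounders).
        self.private_enc = {d: init_weights(n_genes, d_latent, rng) for d in domains}
        # The decoder reconstructs the input from both codes concatenated.
        self.dec = init_weights(2 * d_latent, n_genes, rng)

    def encode(self, x, domain):
        shared = linear(x, self.shared_enc)
        private = linear(x, self.private_enc[domain])
        return shared, private

    def loss(self, x, domain, ortho_weight=1.0):
        shared, private = self.encode(x, domain)
        recon = linear(shared + private, self.dec)  # list concatenation
        # Mean squared reconstruction error.
        recon_err = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
        # Assumed penalty: squared dot product pushes the two codes apart.
        overlap = sum(s * p for s, p in zip(shared, private)) ** 2
        return recon_err + ortho_weight * overlap


model = SharedPrivateAE(n_genes=5, d_latent=2, domains=["cell_line", "patient"])
x = [0.5, -1.0, 0.2, 0.0, 1.5]  # made-up expression vector
shared_code, private_code = model.encode(x, "patient")
```

In a setup like this, a downstream drug-response predictor would be trained on the shared code only, so that what it learns from cell lines is not tied to cell-line-specific confounders.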
Related papers
- Synthetic Data from Diffusion Models Improve Drug Discovery Prediction [1.3686993145787065]
Data sparsity makes data curation difficult for researchers looking to answer key research questions.
We propose a novel diffusion GNN model, Syngand, capable of generating ligand and pharmacokinetic data end-to-end.
We show initial promising results on the efficacy of the Syngand-generated synthetic target property data on downstream regression tasks with AqSolDB, LD50, and hERG central.
arXiv Detail & Related papers (2024-05-06T19:09:37Z)
- Unlocking the Power of Multi-institutional Data: Integrating and Harmonizing Genomic Data Across Institutions [3.5489676012585236]
We introduce the Bridge model to derive integrated features to preserve information beyond common genes.
The model consistently excels in predicting patient survival across six cancer types in GENIE BPC data.
arXiv Detail & Related papers (2024-01-30T23:25:05Z)
- Building Flexible, Scalable, and Machine Learning-ready Multimodal Oncology Datasets [17.774341783844026]
This work proposes the Multimodal Integration of Oncology Data System (MINDS).
MINDS is a flexible, scalable, and cost-effective metadata framework for efficiently fusing disparate data from public sources.
By harmonizing multimodal data, MINDS aims to potentially empower researchers with greater analytical ability.
arXiv Detail & Related papers (2023-09-30T15:44:39Z)
- AI Framework for Early Diagnosis of Coronary Artery Disease: An Integration of Borderline SMOTE, Autoencoders and Convolutional Neural Networks Approach [0.44998333629984877]
We develop a methodology for balancing and augmenting data for more accurate prediction when the data is imbalanced and the sample size is small.
The experimental results revealed that the average accuracy of our proposed method for CAD prediction was 95.36, higher than that of random forest (RF), decision tree (DT), support vector machine (SVM), logistic regression (LR), and artificial neural network (ANN).
arXiv Detail & Related papers (2023-08-29T14:33:38Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
- Fault Diagnosis using eXplainable AI: a Transfer Learning-based Approach for Rotating Machinery exploiting Augmented Synthetic Data [0.0]
FaultD-XAI is a generic and interpretable approach for classifying faults in rotating machinery based on transfer learning.
To provide scalability using transfer learning, synthetic vibration signals are created mimicking the characteristic behavior of failures in operation.
The proposed approach not only obtained promising diagnostic performance, but was also able to learn characteristics used by experts to identify conditions.
arXiv Detail & Related papers (2022-10-06T15:02:35Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model that can extract common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
- DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction [67.91606509226132]
Clinical trials are essential for drug development but often suffer from expensive, inaccurate and insufficient patient recruitment.
DeepEnroll is a cross-modal inference learning model that jointly encodes enrollment criteria (text) and patient records (tabular data) into a shared latent space for matching inference.
arXiv Detail & Related papers (2020-01-22T17:51:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.