TransCDR: a deep learning model for enhancing the generalizability of
cancer drug response prediction through transfer learning and multimodal data
fusion for drug representation
- URL: http://arxiv.org/abs/2311.12040v1
- Date: Fri, 17 Nov 2023 14:55:12 GMT
- Title: TransCDR: a deep learning model for enhancing the generalizability of
cancer drug response prediction through transfer learning and multimodal data
fusion for drug representation
- Authors: Xiaoqiong Xia, Chaoyu Zhu, Yuqi Shan, Fan Zhong, and Lei Liu
- Abstract summary: We introduce TransCDR, which uses transfer learning to learn drug representations and fuses multi-modality features of drugs and cell lines by a self-attention mechanism.
We are the first to systematically evaluate the generalization of the CDR prediction model to novel (i.e., never-before-seen) compound scaffolds and cell line clusters.
TransCDR shows better generalizability than 8 state-of-the-art models.
- Score: 4.740134482580255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and robust drug response prediction is of utmost importance in
precision medicine. Although many models have been developed to utilize the
representations of drugs and cancer cell lines for predicting cancer drug
responses (CDR), their performances can be improved by addressing issues such
as insufficient data modality, suboptimal fusion algorithms, and poor
generalizability for novel drugs or cell lines. We introduce TransCDR, which
uses transfer learning to learn drug representations and fuses multi-modality
features of drugs and cell lines by a self-attention mechanism, to predict the
IC50 values or sensitive states of drugs on cell lines. We are the first to
systematically evaluate the generalization of the CDR prediction model to novel
(i.e., never-before-seen) compound scaffolds and cell line clusters. TransCDR
shows better generalizability than 8 state-of-the-art models. TransCDR
outperforms its 5 variants that train drug encoders (i.e., RNN and AttentiveFP)
from scratch under various scenarios. The most critical contributors among
multiple drug notations and omics profiles are Extended Connectivity
Fingerprint and genetic mutation. Additionally, the attention-based fusion
module further enhances the predictive performance of TransCDR. TransCDR,
trained on the GDSC dataset, demonstrates strong predictive performance on the
external testing set CCLE. It is also utilized to predict missing CDRs on GDSC.
Moreover, we investigate the biological mechanisms underlying drug response by
classifying 7,675 patients from TCGA into drug-sensitive or drug-resistant
groups, followed by a Gene Set Enrichment Analysis. TransCDR emerges as a
potent tool with significant potential in drug response prediction. The source
code and data can be accessed at https://github.com/XiaoqiongXia/TransCDR.
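The abstract describes fusing multi-modality drug and cell-line features with a self-attention mechanism. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the modality names, the embedding size `d_model`, the random projection matrices, and the helper `self_attention_fuse` are all hypothetical and chosen only for illustration. Each modality is assumed to have already been encoded into a vector of the same size.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention_fuse(tokens, w_q, w_k, w_v):
    """Fuse per-modality embeddings with single-head scaled dot-product self-attention.

    tokens: array of shape (n_modalities, d_model), one row per modality.
    Returns the flattened attended embeddings and the attention weights.
    """
    q = tokens @ w_q
    k = tokens @ w_k
    v = tokens @ w_v
    # Every modality attends to every other modality (and to itself).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    attended = weights @ v
    # Concatenate the attended modality vectors into one fused feature vector.
    return attended.reshape(-1), weights


# Hypothetical modality list; the real model's inputs may differ.
modalities = [
    "drug_smiles", "drug_graph", "drug_ecfp",
    "cell_expression", "cell_mutation", "cell_methylation",
]
d_model = 8  # assumed embedding size, for illustration only

rng = np.random.default_rng(0)
tokens = rng.normal(size=(len(modalities), d_model))  # stand-in encoder outputs
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))

fused, weights = self_attention_fuse(tokens, w_q, w_k, w_v)
print(fused.shape)    # one fused vector of length n_modalities * d_model
print(weights.shape)  # (n_modalities, n_modalities) attention matrix
```

In a trained model the projection matrices would be learned, and the fused vector would feed a regression head (for IC50) or a classification head (for sensitive states). The attention matrix can be inspected to see which modalities influence each other most.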
Related papers
- Dumpling GNN: Hybrid GNN Enables Better ADC Payload Activity Prediction Based on Chemical Structure [53.76752789814785]
DumplingGNN is a hybrid Graph Neural Network architecture specifically designed for predicting ADC payload activity based on chemical structure.
We evaluate it on a comprehensive ADC payload dataset focusing on DNA Topoisomerase I inhibitors.
It demonstrates exceptional accuracy (91.48%), sensitivity (95.08%), and specificity (97.54%) on our specialized ADC payload dataset.
arXiv Detail & Related papers (2024-09-23T17:11:04Z)
- Regressor-free Molecule Generation to Support Drug Response Prediction [83.25894107956735]
Conditional generation based on the target IC50 score can obtain a more effective sampling space.
Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on number labels.
arXiv Detail & Related papers (2024-05-23T13:22:17Z)
- drGAT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network [9.637695046701493]
drGAT is a graph deep learning model that can predict sensitivity to drugs.
drGAT has superior performance over existing models, achieving 78% accuracy and 76% F1 score for 269 DNA-damaging compounds.
Our method can be used to accurately predict sensitivity to drugs and may be useful in the identification of biomarkers relating to the treatment of cancer patients.
arXiv Detail & Related papers (2024-05-14T22:16:52Z)
- Emerging Drug Interaction Prediction Enabled by Flow-based Graph Neural Network with Biomedical Network [69.16939798838159]
We propose EmerGNN, a graph neural network (GNN) that can effectively predict interactions for emerging drugs.
EmerGNN learns pairwise representations of drugs by extracting the paths between drug pairs, propagating information from one drug to the other, and incorporating the relevant biomedical concepts on the paths.
Overall, EmerGNN has higher accuracy than existing approaches in predicting interactions for emerging drugs and can identify the most relevant information on the biomedical network.
arXiv Detail & Related papers (2023-11-15T06:34:00Z)
- Precision Anti-Cancer Drug Selection via Neural Ranking [0.342658286826597]
We propose two neural listwise ranking methods that learn latent representations of drugs and cell lines, and then use those representations to score drugs in each cell line via a learnable scoring function.
Our results demonstrate that List-All outperforms the best baseline, with significant improvements of as much as 8.6% in hit@20 across 50% of test cell lines.
arXiv Detail & Related papers (2023-06-30T16:23:25Z)
- T Cell Receptor Protein Sequences and Sparse Coding: A Novel Approach to Cancer Classification [4.824821328103934]
T cell receptors (TCRs) are essential proteins for the adaptive immune system.
Recent advancements in sequencing technologies have enabled the comprehensive profiling of TCR repertoires.
This has led to the discovery of TCRs with potent anti-cancer activity and the development of TCR-based immunotherapies.
arXiv Detail & Related papers (2023-04-25T20:43:41Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- TCR: A Transformer Based Deep Network for Predicting Cancer Drugs Response [12.86640026993276]
We propose a Transformer-based network for Cancer drug Response (TCR) to predict anti-cancer drug response.
By utilizing an attention mechanism, TCR is able to learn the interactions between drug atom/sub-structure and molecular signatures efficiently.
Our study highlights the predictive power of TCR and its potential value for cancer drug repurposing and precision oncology treatment.
arXiv Detail & Related papers (2022-07-10T13:01:54Z)
- CODE-AE: A Coherent De-confounding Autoencoder for Predicting Patient-Specific Drug Response From Cell Line Transcriptomics [35.67979269269178]
We develop a Coherent Deconfounding Autoencoder (CODE-AE) that can extract both common biological signals shared by incoherent samples and private representations unique to each data set.
CODE-AE significantly improves the accuracy and robustness over state-of-the-art methods in both predicting patient drug response and de-confounding biological signals.
arXiv Detail & Related papers (2021-01-31T21:17:44Z)
- Ensemble Transfer Learning for the Prediction of Anti-Cancer Drug Response [49.86828302591469]
In this paper, we apply transfer learning to the prediction of anti-cancer drug response.
We apply the classic transfer learning framework that trains a prediction model on the source dataset and refines it on the target dataset.
The ensemble transfer learning pipeline is implemented using LightGBM and two deep neural network (DNN) models with different architectures.
arXiv Detail & Related papers (2020-05-13T20:29:48Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.