YZS-model: A Predictive Model for Organic Drug Solubility Based on Graph Convolutional Networks and Transformer-Attention
- URL: http://arxiv.org/abs/2406.19136v4
- Date: Wed, 17 Jul 2024 13:35:31 GMT
- Title: YZS-model: A Predictive Model for Organic Drug Solubility Based on Graph Convolutional Networks and Transformer-Attention
- Authors: Chenxu Wang, Haowei Ming, Jian He, Yao Lu, Junhong Chen,
- Abstract summary: Traditional solubility prediction techniques often fail to capture the complex nature of molecular tructures, leading to notable deviations between predictions and actual results.
This research introduces a novel deep learning framework combining attention-based transformers, Long Short-Term Memory (LSTM) networks, and Graph Convolutional Networks (GCN)
Our method achieved a correlation coefficient ($R2$) of 0.59 and a Root Mean Square Error (RMSE) of 0.57, which outperforms the benchmark models' scores of 0.52 ($R2$) and 0.61 (RMSE)
- Score: 9.018408514318631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The accurate prediction of drug molecule solubility is essential for determining their therapeutic effectiveness and safety, influencing the drug's ADME processes. Traditional solubility prediction techniques often fail to capture the complex nature of molecular tructures, leading to notable deviations between predictions and actual results. For example, the Discussion on Advanced Drug-Like Compound Structures. Lusci highlighted issues in capturing crucial cyclic structural information in molecules with ring structures. To overcome this issue, our research introduces a novel deep learning framework combining attention-based transformers, Long Short-Term Memory (LSTM) networks, and Graph Convolutional Networks (GCN), aimed at enhancing the precision of solubility predictions. Utilizing a training set of 9,943 compounds and testing on an anticancer compound dataset, our method achieved a correlation coefficient ($R^2$) of 0.59 and a Root Mean Square Error (RMSE) of 0.57, which outperforms the benchmark models' scores of 0.52 ($R^2$) and 0.61 (RMSE). Importantly, in an additional independent test, our model significantly outperformed the baseline with an RMSE of 1.05 compared to 1.28, a relative accuracy improvement of 45.9%. This research not only demonstrates the vast potential of deep learning for improving solubility prediction accuracy but also offers novel insights for drug design and selection in the future. Continued efforts will be directed towards optimizing the model architecture and extending its application to better support the drug development process, underscoring the pivotal role of deep learning in drug discovery.
Related papers
- Regressor-free Molecule Generation to Support Drug Response Prediction [83.25894107956735]
Conditional generation based on the target IC50 score can obtain a more effective sampling space.
Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on number labels.
arXiv Detail & Related papers (2024-05-23T13:22:17Z) - Impact of Domain Knowledge and Multi-Modality on Intelligent Molecular Property Prediction: A Systematic Survey [22.73437302209673]
We review and quantitatively analyze recent deep learning methods based on various benchmarks.
We find that integrating molecular information significantly improves molecular property prediction (MPP) for both regression and classification tasks.
We also discover that enriching 2D graphs with 1D SMILES boosts multi-modal learning performance for regression tasks by up to 9.1%, and augmenting 2D graphs with 3D information increases performance for classification tasks by up to 13.2%.
arXiv Detail & Related papers (2024-02-11T17:29:58Z) - Drug Synergistic Combinations Predictions via Large-Scale Pre-Training
and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z) - Accurate, reliable and interpretable solubility prediction of druglike
molecules with attention pooling and Bayesian learning [1.8275108630751844]
In silico prediction of solubility has been studied for its utility in virtual screening and lead optimization.
Recently, machine learning (ML) methods using experimental data has been popular because physics-based methods are not suitable for high- throughput tasks.
In this paper, we develop graph neural networks (GNNs) with the self-attention readout layer to improve prediction performance.
arXiv Detail & Related papers (2022-09-29T07:48:10Z) - Improved Drug-target Interaction Prediction with Intermolecular Graph
Transformer [98.8319016075089]
We propose a novel approach to model intermolecular information with a three-way Transformer-based architecture.
Intermolecular Graph Transformer (IGT) outperforms state-of-the-art approaches by 9.1% and 20.5% over the second best for binding activity and binding pose prediction respectively.
IGT exhibits promising drug screening ability against SARS-CoV-2 by identifying 83.1% active drugs that have been validated by wet-lab experiments with near-native predicted binding poses.
arXiv Detail & Related papers (2021-10-14T13:28:02Z) - Extracting Chemical-Protein Interactions via Calibrated Deep Neural
Network and Self-training [0.8376091455761261]
"calibration" techniques have been applied to deep learning models to estimate the data uncertainty and improve the reliability.
In this study, to extract chemical--protein interactions, we propose a DNN-based approach incorporating uncertainty information and calibration techniques.
Our approach has achieved state-of-the-art performance with regard to the Biocreative VI ChemProt task, while preserving higher calibration abilities than those of previous approaches.
arXiv Detail & Related papers (2020-11-04T10:14:31Z) - Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost
Functions [80.12620331438052]
deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features.
Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets.
We argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance.
arXiv Detail & Related papers (2020-06-25T08:46:37Z) - A Systematic Approach to Featurization for Cancer Drug Sensitivity
Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z) - Assessing Graph-based Deep Learning Models for Predicting Flash Point [52.931492216239995]
Graph-based deep learning (GBDL) models were implemented in predicting flash point for the first time.
Average R2 and Mean Absolute Error (MAE) scores of MPNN are, respectively, 2.3% lower and 2.0 K higher than previous comparable studies.
arXiv Detail & Related papers (2020-02-26T06:10:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.