SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity
Prediction
- URL: http://arxiv.org/abs/2206.09818v3
- Date: Tue, 17 Oct 2023 14:06:07 GMT
- Title: SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity
Prediction
- Authors: Qizhi Pei, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin,
Haiguang Liu, Tie-Yan Liu, Rui Yan
- Abstract summary: Accurate prediction of Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
- Score: 127.43571146741984
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate prediction of Drug-Target Affinity (DTA) is of vital importance in
early-stage drug discovery, facilitating the identification of drugs that can
effectively interact with specific targets and regulate their activities. While
wet experiments remain the most reliable method, they are time-consuming and
resource-intensive, resulting in limited data availability that poses
challenges for deep learning approaches. Existing methods have primarily
focused on developing techniques based on the available DTA data, without
adequately addressing the data scarcity issue. To overcome this challenge, we
present the SSM-DTA framework, which incorporates three simple yet highly
effective strategies: (1) A multi-task training approach that combines DTA
prediction with masked language modeling (MLM) using paired drug-target data.
(2) A semi-supervised training method that leverages large-scale unpaired
molecules and proteins to enhance drug and target representations. This
approach differs from previous methods that only employed molecules or proteins
in pre-training. (3) The integration of a lightweight cross-attention module to
improve the interaction between drugs and targets, further enhancing prediction
accuracy. Through extensive experiments on benchmark datasets such as
BindingDB, DAVIS, and KIBA, we demonstrate the superior performance of our
framework. Additionally, we conduct case studies on specific drug-target
binding activities, virtual screening experiments, drug feature visualizations,
and real-world applications, all of which showcase the significant potential of
our work. In conclusion, our proposed SSM-DTA framework addresses the data
limitation challenge in DTA prediction and yields promising results, paving the
way for more efficient and accurate drug discovery processes. Our code is
available on GitHub at https://github.com/QizhiPei/SSM-DTA.
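The first and third strategies in the abstract (a DTA objective trained jointly with masked language modeling, and cross-attention between drug and target tokens) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the weighting factor `alpha`, all function names, and the single-head attention form are assumptions made for illustration, and the abstract does not say how the losses are weighted or how the cross-attention module is built.

```python
import math


def mse_loss(pred, target):
    """Mean squared error for the affinity regression output."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def masked_cross_entropy(logits, labels, mask):
    """Average cross-entropy over masked positions only (MLM-style).

    logits: one list of vocabulary scores per position
    labels: true token id per position
    mask:   True where the token was masked and must be predicted
    """
    total, count = 0.0, 0
    for scores, label, is_masked in zip(logits, labels, mask):
        if not is_masked:
            continue
        m = max(scores)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[label]
        count += 1
    return total / count if count else 0.0


def multitask_loss(dta_loss, drug_mlm_loss, target_mlm_loss, alpha=1.0):
    """Hypothetical combined objective: DTA loss plus weighted MLM losses."""
    return dta_loss + alpha * (drug_mlm_loss + target_mlm_loss)


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention.

    For example, drug token vectors as queries attending over
    target (protein) token vectors as keys and values.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

In this sketch, setting `alpha=0` reduces the objective to plain supervised DTA regression, which makes the contribution of the auxiliary MLM terms easy to ablate.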
Related papers
- GramSeq-DTA: A grammar-based drug-target affinity prediction approach fusing gene expression information [1.2289361708127877]
We propose GramSeq-DTA, which integrates chemical perturbation information with the structural information of drugs and targets.
Our approach outperforms the current state-of-the-art DTA prediction models when validated on widely used DTA datasets.
(2024-11-03T03:17:09Z)
- SMILES-Mamba: Chemical Mamba Foundation Models for Drug ADMET Prediction [16.189335444981353]
Predicting the absorption, distribution, metabolism, excretion, and toxicity of small-molecule drugs is critical for ensuring safety and efficacy.
We propose a two-stage model that leverages both unlabeled and labeled data through a combination of self-supervised pretraining and fine-tuning strategies.
Our results demonstrate that SMILES-Mamba exhibits competitive performance across 22 ADMET datasets, achieving the highest score in 14 tasks.
(2024-08-11T04:53:12Z)
- Extracting Training Data from Unconditional Diffusion Models [76.85077961718875]
Diffusion probabilistic models (DPMs) are being employed as mainstream models for generative artificial intelligence (AI).
We aim to establish a theoretical understanding of memorization in DPMs with 1) a memorization metric for theoretical analysis, 2) an analysis of conditional memorization with informative and random labels, and 3) two better evaluation metrics for measuring memorization.
Based on the theoretical analysis, we propose a novel data extraction method called Surrogate condItional Data Extraction (SIDE) that leverages a classifier trained on generated data as a surrogate condition to extract training data directly from unconditional diffusion models.
(2024-06-18T16:20:12Z)
- A Cross-Field Fusion Strategy for Drug-Target Interaction Prediction [85.2792480737546]
Existing methods fail to utilize global protein information during DTI prediction.
A cross-field information fusion strategy is employed to acquire local and global protein information.
The Siamese drug-target interaction (SiamDTI) prediction method achieves higher accuracy than other state-of-the-art (SOTA) methods on novel drugs and targets.
(2024-05-23T13:25:20Z)
- DDIPrompt: Drug-Drug Interaction Event Prediction based on Graph Prompt Learning [15.69547371747469]
DDIPrompt is an innovative solution inspired by the recent advancements in graph prompt learning.
Our framework aims to address these issues by leveraging the intrinsic knowledge from pre-trained models.
Extensive experiments on two benchmark datasets demonstrate DDIPrompt's SOTA performance.
(2024-02-18T06:22:01Z)
- PGraphDTA: Improving Drug Target Interaction Prediction using Protein Language Models and Contact Maps [4.590060921188914]
A key aspect of drug discovery involves identifying novel drug-target (DT) interactions.
Protein-ligand interactions exhibit a continuum of binding strengths, known as binding affinity.
We propose novel enhancements to enhance their performance.
(2023-10-06T05:00:25Z)
- Zero-shot Learning of Drug Response Prediction for Preclinical Drug Screening [38.94493676651818]
We propose a zero-shot learning solution for the drug response prediction task in preclinical drug screening.
Specifically, we propose a Multi-branch Multi-Source Domain Adaptation Test Enhancement Plug-in, called MSDA.
(2023-10-05T05:55:41Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
(2023-01-14T15:07:43Z)
- Multi-View Substructure Learning for Drug-Drug Interaction Prediction [69.34322811160912]
We propose a novel multi-view drug substructure network for DDI prediction (MSN-DDI).
MSN-DDI learns chemical substructures from both the representations of the single drug (intra-view) and the drug pair (inter-view) simultaneously and utilizes the substructures to update the drug representation iteratively.
Comprehensive evaluations demonstrate that MSN-DDI has almost solved DDI prediction for existing drugs by achieving a relatively improved accuracy of 19.32% and an over 99% accuracy under the transductive setting.
(2022-03-28T05:44:29Z)
- DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations [90.27736364704108]
We present DrugOOD, a systematic OOD dataset curator and benchmark for AI-aided drug discovery.
DrugOOD comes with an open-source Python package that fully automates benchmarking processes.
We focus on one of the most crucial problems in AIDD: drug target binding affinity prediction.
(2022-01-24T12:32:48Z)