Scaling Structure Aware Virtual Screening to Billions of Molecules with SPRINT
- URL: http://arxiv.org/abs/2411.15418v2
- Date: Mon, 20 Jan 2025 22:10:33 GMT
- Title: Scaling Structure Aware Virtual Screening to Billions of Molecules with SPRINT
- Authors: Andrew T. McNutt, Abhinav K. Adduri, Caleb N. Ellington, Monica T. Dayao, Eric P. Xing, Hosein Mohimani, David R. Koes
- Abstract summary: SPRINT is a vector-based approach for screening entire chemical libraries against whole proteomes for drug-target interactions (DTIs) and novel mechanisms of action.
In addition to being both accurate and interpretable, SPRINT is ultra-fast: querying the whole human proteome against the ENAMINE Real Database for the 100 most likely binders per protein takes 16 minutes.
- Score: 34.0426830266348
- Abstract: Virtual screening of small molecules against protein targets can accelerate drug discovery and development by predicting drug-target interactions (DTIs). However, structure-based methods like molecular docking are too slow to allow for broad proteome-scale screens, limiting their application in screening for off-target effects or new molecular mechanisms. Recently, vector-based methods using protein language models (PLMs) have emerged as a complementary approach that bypasses explicit 3D structure modeling. Here, we develop SPRINT, a vector-based approach for screening entire chemical libraries against whole proteomes for DTIs and novel mechanisms of action. SPRINT improves on prior work by using a self-attention based architecture and structure-aware PLMs to learn drug-target co-embeddings for binder prediction, search, and retrieval. SPRINT achieves SOTA enrichment factors in virtual screening on LIT-PCBA, DTI classification benchmarks, and binding affinity prediction benchmarks, while providing interpretability in the form of residue-level attention maps. In addition to being both accurate and interpretable, SPRINT is ultra-fast: querying the whole human proteome against the ENAMINE Real Database (6.7B drugs) for the 100 most likely binders per protein takes 16 minutes. SPRINT promises to enable virtual screening at an unprecedented scale, opening up new opportunities for in silico drug repurposing and development. SPRINT is available on the web as ColabScreen: https://bit.ly/colab-screen
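The retrieval step described in the abstract (finding the most likely binders per protein in a shared embedding space) can be illustrated with a minimal sketch. This is not SPRINT's implementation: the random vectors stand in for learned drug and protein co-embeddings, cosine similarity is an assumed scoring function, and the names `normalize`, `top_k_binders`, `molecules`, and `proteins` are hypothetical. At the scale quoted in the abstract, an exhaustive loop like this would presumably be replaced by a batched or indexed vector search.

```python
import heapq
import math
import random


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def top_k_binders(protein_vec, molecule_vecs, k):
    """Return the k (score, molecule_id) pairs with the highest similarity.

    Assumes protein_vec and every molecule vector are already unit length.
    """
    scored = (
        (sum(a * b for a, b in zip(protein_vec, vec)), mol_id)
        for mol_id, vec in molecule_vecs.items()
    )
    # nlargest keeps only k candidates in memory and returns them sorted descending.
    return heapq.nlargest(k, scored)


random.seed(0)
DIM = 8  # toy embedding size, chosen only for illustration

# Placeholder embeddings: random unit vectors instead of model outputs.
molecules = {
    f"mol_{i}": normalize([random.gauss(0, 1) for _ in range(DIM)])
    for i in range(1000)
}
proteins = {
    f"prot_{j}": normalize([random.gauss(0, 1) for _ in range(DIM)])
    for j in range(3)
}

# One ranked hit list per protein, analogous to "top binders per protein".
hits = {pid: top_k_binders(pvec, molecules, k=5) for pid, pvec in proteins.items()}

for pid, ranked in hits.items():
    print(pid, [(mol_id, round(score, 3)) for score, mol_id in ranked])
```

Because molecule embeddings do not depend on the query protein, they can be computed once and reused for every target, which is the property that makes this style of screening cheap compared with per-pair docking.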
Related papers
- MIN: Multi-channel Interaction Network for Drug-Target Interaction with Protein Distillation [64.4838301776267]
Multi-channel Interaction Network (MIN) is a novel framework designed to predict drug-target interaction (DTI).
MIN incorporates a representation learning module and a multi-channel interaction module.
MIN is not only a potent tool for DTI prediction but also offers fresh insights into the prediction of protein binding sites.
arXiv Detail & Related papers (2024-11-23T05:38:36Z)
- FragXsiteDTI: Revealing Responsible Segments in Drug-Target Interaction with Transformer-Driven Interpretation [0.09236074230806578]
Drug-Target Interaction (DTI) prediction is vital for drug discovery, yet challenges persist in achieving model interpretability and optimizing performance.
We propose a novel transformer-based model, FragXsiteDTI, that aims to address these challenges in DTI prediction.
FragXsiteDTI is the first DTI model to simultaneously leverage drug molecule fragments and protein pockets.
arXiv Detail & Related papers (2023-11-04T04:57:13Z)
- PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling [0.0]
We describe for the first time a deep-learning framework for structure-based pharmacophore modeling to address this challenge.
PharmacoNet is significantly faster than state-of-the-art structure-based approaches, yet reasonably accurate with a simple scoring function.
arXiv Detail & Related papers (2023-10-01T14:13:09Z)
- HydraScreen: A Generalizable Structure-Based Deep Learning Approach to Drug Discovery [0.0]
HydraScreen aims to provide a framework for more robust machine-learning-accelerated drug discovery.
We use a state-of-the-art 3D convolutional neural network to represent molecular structures and interactions in protein-ligand binding.
HydraScreen provides a user-friendly GUI and a public API, facilitating easy assessment of individual protein-ligand complexes.
arXiv Detail & Related papers (2023-09-22T18:48:34Z)
- A Methodology for the Prediction of Drug Target Interaction using CDK Descriptors [0.0]
We propose a DTI prediction model built on molecular structure of drugs and sequence of target proteins.
In the proposed model, we use CDK descriptors, Molecular ACCess System (MACCS) fingerprints, Electrotopological state (Estate) fingerprints and amino acid sequences of targets to get Pseudo Amino Acid Composition (PseAAC).
arXiv Detail & Related papers (2022-10-20T09:25:14Z)
- Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structured-based Drug Design [133.1268990638971]
De novo drug design based on the structure of a target protein can provide novel drug candidates.
We present a generative solution named TamGent that can directly generate candidate drugs from scratch for a given target.
arXiv Detail & Related papers (2022-08-30T09:32:39Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
- Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)
- DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction [69.7424023336611]
DeepPurpose is a comprehensive and easy-to-use deep learning library for DTI prediction.
It supports training of customized DTI prediction models by implementing 15 compound and protein encoders and over 50 neural architectures.
We demonstrate state-of-the-art performance of DeepPurpose on several benchmark datasets.
arXiv Detail & Related papers (2020-04-19T17:31:55Z)
- DeepGS: Deep Representation Learning of Graphs and Sequences for Drug-Target Binding Affinity Prediction [8.292330541203647]
We propose a novel end-to-end learning framework, called DeepGS, which uses deep neural networks to extract the local chemical context from amino acids and SMILES sequences.
We have conducted extensive experiments to compare our proposed method with state-of-the-art models including KronRLS, Sim, DeepDTA and DeepCPI.
arXiv Detail & Related papers (2020-03-31T01:35:39Z)