Scaling Structure Aware Virtual Screening to Billions of Molecules with SPRINT
- URL: http://arxiv.org/abs/2411.15418v2
- Date: Mon, 20 Jan 2025 22:10:33 GMT
- Authors: Andrew T. McNutt, Abhinav K. Adduri, Caleb N. Ellington, Monica T. Dayao, Eric P. Xing, Hosein Mohimani, David R. Koes
- Abstract summary: SPRINT is a vector-based approach for screening entire chemical libraries against whole proteomes for DTIs and novel mechanisms of action. In addition to being both accurate and interpretable, SPRINT is ultra-fast: querying the whole human proteome against the ENAMINE Real Database for the 100 most likely binders per protein takes 16 minutes.
- Score: 34.0426830266348
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Virtual screening of small molecules against protein targets can accelerate drug discovery and development by predicting drug-target interactions (DTIs). However, structure-based methods like molecular docking are too slow to allow for broad proteome-scale screens, limiting their application in screening for off-target effects or new molecular mechanisms. Recently, vector-based methods using protein language models (PLMs) have emerged as a complementary approach that bypasses explicit 3D structure modeling. Here, we develop SPRINT, a vector-based approach for screening entire chemical libraries against whole proteomes for DTIs and novel mechanisms of action. SPRINT improves on prior work by using a self-attention based architecture and structure-aware PLMs to learn drug-target co-embeddings for binder prediction, search, and retrieval. SPRINT achieves SOTA enrichment factors in virtual screening on LIT-PCBA, DTI classification benchmarks, and binding affinity prediction benchmarks, while providing interpretability in the form of residue-level attention maps. In addition to being both accurate and interpretable, SPRINT is ultra-fast: querying the whole human proteome against the ENAMINE Real Database (6.7B drugs) for the 100 most likely binders per protein takes 16 minutes. SPRINT promises to enable virtual screening at an unprecedented scale, opening up new opportunities for in silico drug repurposing and development. SPRINT is available on the web as ColabScreen: https://bit.ly/colab-screen
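The retrieval step the abstract describes (finding the most likely binders for a protein in a shared drug-target embedding space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names `normalize` and `top_k_binders` are hypothetical, the toy 3-dimensional vectors are made up, and cosine similarity stands in for whatever scoring function SPRINT actually uses, which the abstract does not specify.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k_binders(protein_vec, molecule_vecs, k=100):
    """Return the k (molecule_id, score) pairs with the highest cosine similarity.

    Hypothetical helper: a linear scan over a dict of molecule embeddings.
    """
    query = normalize(protein_vec)
    scored = (
        (mol_id, sum(q * m for q, m in zip(query, normalize(vec))))
        for mol_id, vec in molecule_vecs.items()
    )
    # nlargest keeps only k candidates in memory while streaming the scores.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy embeddings, made up for illustration only.
library = {
    "mol_a": [1.0, 0.0, 0.0],
    "mol_b": [0.0, 1.0, 0.0],
    "mol_c": [0.9, 0.1, 0.0],
}
hits = top_k_binders([1.0, 0.0, 0.0], library, k=2)
print([mol_id for mol_id, _ in hits])  # ['mol_a', 'mol_c']
```

This linear scan only conveys the idea. A screen over billions of molecules would presumably rely on precomputed embeddings and a dedicated vector index rather than a Python loop.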
Related papers
- SE(3)-Equivariant Ternary Complex Prediction Towards Target Protein Degradation [28.648225112411637]
Targeted protein degradation (TPD) induced by small molecules has emerged as a rapidly evolving modality in drug discovery.
DeepTernary is a novel deep learning-based approach that directly predicts ternary structures in an end-to-end manner.
arXiv Detail & Related papers (2025-02-26T06:33:24Z) - MIN: Multi-channel Interaction Network for Drug-Target Interaction with Protein Distillation [64.4838301776267]
Multi-channel Interaction Network (MIN) is a novel framework designed to predict drug-target interaction (DTI).
MIN incorporates a representation learning module and a multi-channel interaction module.
MIN is not only a potent tool for DTI prediction but also offers fresh insights into the prediction of protein binding sites.
arXiv Detail & Related papers (2024-11-23T05:38:36Z) - MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training [48.398329286769304]
Multiple Sequence Alignment (MSA) plays a pivotal role in unveiling the evolutionary trajectories of protein families.
MSAGPT is a novel approach to prompt protein structure predictions via MSA generative pretraining in the low MSA regime.
arXiv Detail & Related papers (2024-06-08T04:23:57Z) - FragXsiteDTI: Revealing Responsible Segments in Drug-Target Interaction with Transformer-Driven Interpretation [0.09236074230806578]
Drug-Target Interaction (DTI) prediction is vital for drug discovery, yet challenges persist in achieving model interpretability and optimizing performance.
We propose a novel transformer-based model, FragXsiteDTI, that aims to address these challenges in DTI prediction.
FragXsiteDTI is the first DTI model to simultaneously leverage drug molecule fragments and protein pockets.
arXiv Detail & Related papers (2023-11-04T04:57:13Z) - ProFSA: Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment [20.012210194899605]
We propose a novel pocket pretraining approach that leverages knowledge from high-resolution atomic protein structures.
Our method, named ProFSA, achieves state-of-the-art performance across various tasks, including pocket druggability prediction.
Our work opens up a new avenue for mitigating the scarcity of protein-ligand complex data through the utilization of high-quality and diverse protein structure databases.
arXiv Detail & Related papers (2023-10-11T06:36:23Z) - PGraphDTA: Improving Drug Target Interaction Prediction using Protein Language Models and Contact Maps [4.590060921188914]
A key aspect of drug discovery involves identifying novel drug-target (DT) interactions.
Protein-ligand interactions exhibit a continuum of binding strengths, known as binding affinity.
We propose novel enhancements to improve their performance.
arXiv Detail & Related papers (2023-10-06T05:00:25Z) - PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling [0.0]
We describe for the first time a deep-learning framework for structure-based pharmacophore modeling to address this challenge.
PharmacoNet is significantly faster than state-of-the-art structure-based approaches, yet reasonably accurate with a simple scoring function.
arXiv Detail & Related papers (2023-10-01T14:13:09Z) - HydraScreen: A Generalizable Structure-Based Deep Learning Approach to Drug Discovery [0.0]
HydraScreen aims to provide a framework for more robust machine-learning-accelerated drug discovery.
We use a state-of-the-art 3D convolutional neural network to represent molecular structures and interactions in protein-ligand binding.
HydraScreen provides a user-friendly GUI and a public API, facilitating easy assessment of individual protein-ligand complexes.
arXiv Detail & Related papers (2023-09-22T18:48:34Z) - Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structured-based Drug Design [133.1268990638971]
De novo drug design based on the structure of a target protein can provide novel drug candidates.
We present a generative solution named TamGent that can directly generate candidate drugs from scratch for a given target.
arXiv Detail & Related papers (2022-08-30T09:32:39Z) - SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z) - Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z) - DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction [69.7424023336611]
DeepPurpose is a comprehensive and easy-to-use deep learning library for DTI prediction.
It supports training of customized DTI prediction models by implementing 15 compound and protein encoders and over 50 neural architectures.
We demonstrate state-of-the-art performance of DeepPurpose on several benchmark datasets.
arXiv Detail & Related papers (2020-04-19T17:31:55Z) - DeepGS: Deep Representation Learning of Graphs and Sequences for Drug-Target Binding Affinity Prediction [8.292330541203647]
We propose a novel end-to-end learning framework, called DeepGS, which uses deep neural networks to extract the local chemical context from amino acids and SMILES sequences.
We have conducted extensive experiments to compare our proposed method with state-of-the-art models including KronRLS, Sim, DeepDTA and DeepCPI.
arXiv Detail & Related papers (2020-03-31T01:35:39Z)