Dockformer: A transformer-based molecular docking paradigm for large-scale virtual screening
- URL: http://arxiv.org/abs/2411.06740v4
- Date: Thu, 05 Dec 2024 14:56:30 GMT
- Title: Dockformer: A transformer-based molecular docking paradigm for large-scale virtual screening
- Authors: Zhangfan Yang, Junkai Ji, Shan He, Jianqiang Li, Tiantian He, Ruibin Bai, Zexuan Zhu, Yew Soon Ong
- Abstract summary: Deep learning algorithms can provide data-driven research and development models to increase the speed of the docking process.
A novel deep learning-based docking approach named Dockformer is introduced in this study.
The experimental results show that Dockformer achieves success rates of 90.53% and 82.71% on the PDBbind core set and PoseBusters benchmarks, respectively.
- Score: 29.947687129449278
- Abstract: Molecular docking is a crucial step in drug development, which enables the virtual screening of compound libraries to identify potential ligands that target proteins of interest. However, the computational complexity of traditional docking models increases as the size of the compound library increases. Recent deep learning algorithms can provide data-driven research and development models to increase the speed of the docking process. Unfortunately, few models can achieve superior screening performance compared to that of traditional models. Therefore, a novel deep learning-based docking approach named Dockformer is introduced in this study. Dockformer leverages multimodal information to capture the geometric topology and structural knowledge of molecules and can directly generate binding conformations with the corresponding confidence measures in an end-to-end manner. The experimental results show that Dockformer achieves success rates of 90.53% and 82.71% on the PDBbind core set and PoseBusters benchmarks, respectively, and more than a 100-fold increase in inference speed, outperforming almost all state-of-the-art docking methods. In addition, the ability of Dockformer to identify the main protease inhibitors of coronaviruses is demonstrated in a real-world virtual screening scenario. Considering its high docking accuracy and screening efficiency, Dockformer can be regarded as a powerful and robust tool in the field of drug design.
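The abstract reports docking success rates without defining the metric. As a rough illustration only, the sketch below assumes the convention common in the docking literature: a predicted pose counts as a success when its ligand RMSD to the reference pose is below 2 Å. The threshold, the function names `rmsd` and `success_rate`, and the lack of symmetry correction or alignment are all assumptions made here, not details taken from the paper.

```python
import math


def rmsd(pred, ref):
    """Plain RMSD between two equally ordered lists of (x, y, z) atom coordinates.

    Assumption: atoms are already matched one-to-one; no symmetry correction
    or superposition is applied.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


def success_rate(rmsds, threshold=2.0):
    """Percentage of poses with RMSD strictly below the threshold (angstroms).

    Assumption: 2.0 is the commonly used cutoff, not a value stated in the abstract.
    """
    if not rmsds:
        raise ValueError("need at least one RMSD value")
    hits = sum(1 for value in rmsds if value < threshold)
    return 100.0 * hits / len(rmsds)


# Hypothetical per-complex RMSD values, for illustration only.
example_rmsds = [0.5, 1.9, 2.0, 3.1]
print(success_rate(example_rmsds))
```

A production evaluation would typically use a symmetry-aware RMSD from a cheminformatics toolkit rather than this plain coordinate comparison.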
Related papers
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
- Smiles2Dock: an open large-scale multi-task dataset for ML-based molecular docking [0.0]
We introduce Smiles2Dock, an open large-scale multi-task dataset for molecular docking.
We dock 1.7 million ligands from the ChEMBL database against 15 AlphaFold proteins, giving us more than 25 million protein-ligand binding scores.
Our dataset and code are publicly available to support the development of novel ML-based methods for molecular docking.
arXiv Detail & Related papers (2024-06-09T11:13:03Z)
- RGFN: Synthesizable Molecular Generation Using GFlowNets [51.33672611338754]
We propose Reaction-GFlowNet, an extension of the GFlowNet framework that operates directly in the space of chemical reactions.
RGFN allows out-of-the-box synthesizability while maintaining comparable quality of generated candidates.
We demonstrate the effectiveness of the proposed approach across a range of oracle models, including pretrained proxy models and GPU-accelerated docking.
arXiv Detail & Related papers (2024-06-01T13:11:11Z)
- Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge [69.80471117520719]
Re-Dock is a novel diffusion bridge generative model extended to geometric manifolds.
We propose energy-to-geometry mapping inspired by the Newton-Euler equation to co-model the binding energy and conformations.
Experiments on designed benchmark datasets including apo-dock and cross-dock demonstrate our model's superior effectiveness and efficiency over current methods.
arXiv Detail & Related papers (2024-02-18T05:04:50Z)
- ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic Polyp Detection [88.4359020192429]
Existing methods either involve computationally expensive context aggregation or lack prior modeling of polyps, resulting in poor performance in challenging cases.
In this paper, we propose the Enhanced CenterNet with Contrastive Learning (ECC-PolypDet), a two-stage training & end-to-end inference framework.
We incorporate Box-assisted Contrastive Learning (BCL) during training to minimize the intra-class difference and maximize the inter-class difference between foreground polyps and backgrounds, enabling our model to capture concealed polyps.
In the fine-tuning stage, we introduce the IoU-guided Sample Re-weighting
arXiv Detail & Related papers (2024-01-10T07:03:41Z)
- Multi-scale Iterative Refinement towards Robust and Versatile Molecular Docking [17.28573902701018]
Molecular docking is a key computational tool utilized to predict the binding conformations of small molecules to protein targets.
We introduce DeltaDock, a robust and versatile framework designed for efficient molecular docking.
arXiv Detail & Related papers (2023-11-30T14:09:20Z)
- Pre-Training on Large-Scale Generated Docking Conformations with HelixDock to Unlock the Potential of Protein-ligand Structure Prediction Models [42.16524616409125]
In this work, we show that by pre-training on a large-scale docking conformation, we can obtain a protein-ligand structure prediction model with outstanding performance.
The proposed model, HelixDock, aims to acquire the physical knowledge encapsulated by the physics-based docking tools during the pre-training phase.
arXiv Detail & Related papers (2023-10-21T05:54:26Z)
- Deep Surrogate Docking: Accelerating Automated Drug Discovery with Graph Neural Networks [0.9785311158871759]
We introduce Deep Surrogate Docking (DSD), a framework that applies deep learning-based surrogate modeling to accelerate the docking process substantially.
We show that the DSD workflow combined with the FiLMv2 architecture provides a 9.496x speedup in molecule screening with a 3% recall error rate.
arXiv Detail & Related papers (2022-11-04T19:36:02Z)
- DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking [28.225704750892795]
Predicting the binding structure of a small molecule ligand to a protein is critical to drug design.
Recent deep learning methods that treat docking as a regression problem have decreased runtime compared to traditional search-based methods.
We frame molecular docking as a generative modeling problem and develop DiffDock, a diffusion generative model over the non-Euclidean manifold of ligand poses.
arXiv Detail & Related papers (2022-10-04T17:38:14Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation.
We propose to leverage the Transformer to model this global context with an effective attention mechanism.
Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z)