Related papers: Dockformer: A transformer-based molecular docking paradigm for large-scale virtual screening

URL: http://arxiv.org/abs/2411.06740v4
Date: Thu, 05 Dec 2024 14:56:30 GMT
Title: Dockformer: A transformer-based molecular docking paradigm for large-scale virtual screening
Authors: Zhangfan Yang, Junkai Ji, Shan He, Jianqiang Li, Tiantian He, Ruibin Bai, Zexuan Zhu, Yew Soon Ong
Abstract summary: Deep learning algorithms can provide data-driven research and development models to increase the speed of the docking process. A novel deep learning-based docking approach named Dockformer is introduced in this study. The experimental results show that Dockformer achieves success rates of 90.53% and 82.71% on the PDBbind core set and PoseBusters benchmarks, respectively.
Score: 29.947687129449278
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Molecular docking is a crucial step in drug development, which enables the virtual screening of compound libraries to identify potential ligands that target proteins of interest. However, the computational complexity of traditional docking models increases as the size of the compound library increases. Recently, deep learning algorithms can provide data-driven research and development models to increase the speed of the docking process. Unfortunately, few models can achieve superior screening performance compared to that of traditional models. Therefore, a novel deep learning-based docking approach named Dockformer is introduced in this study. Dockformer leverages multimodal information to capture the geometric topology and structural knowledge of molecules and can directly generate binding conformations with the corresponding confidence measures in an end-to-end manner. The experimental results show that Dockformer achieves success rates of 90.53% and 82.71% on the PDBbind core set and PoseBusters benchmarks, respectively, and more than a 100-fold increase in the inference process speed, outperforming almost all state-of-the-art docking methods. In addition, the ability of Dockformer to identify the main protease inhibitors of coronaviruses is demonstrated in a real-world virtual screening scenario. Considering its high docking accuracy and screening efficiency, Dockformer can be regarded as a powerful and robust tool in the field of drug design.

Related papers

PoseX: AI Defeats Physics Approaches on Protein-Ligand Cross Docking [74.76447568426276]
PoseX is an open-source benchmark to evaluate both self-docking and cross-docking. We incorporated 23 docking methods in three methodological categories. We developed a relaxation method for post-processing to minimize conformational energy and refine binding poses.
arXiv Detail & Related papers (2025-05-03T05:35:37Z)
Fast and Accurate Blind Flexible Docking [79.88520988144442]
Molecular docking that predicts the bound structures of small molecules (ligands) to their protein targets plays a vital role in drug discovery. We propose FABFlex, a fast and accurate regression-based multi-task learning model designed for realistic blind flexible docking scenarios.
arXiv Detail & Related papers (2025-02-20T07:31:13Z)
Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection. We design a forgery-style mixture formulation that augments the diversity of forgery source domains. We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
Scoreformer: A Surrogate Model For Large-Scale Prediction of Docking Scores [0.0]
We present ScoreFormer, a novel graph transformer model designed to accurately predict molecular docking scores. ScoreFormer achieves competitive performance in docking score prediction and offers a substantial 1.65-fold reduction in inference time compared to existing models.
arXiv Detail & Related papers (2024-06-13T17:31:02Z)
Smiles2Dock: an open large-scale multi-task dataset for ML-based molecular docking [0.0]
We introduce Smiles2Dock, an open large-scale multi-task dataset for molecular docking. We dock 1.7 million ligands from the ChEMBL database against 15 AlphaFold proteins, giving us more than 25 million protein-ligand binding scores. Our dataset and code are publicly available to support the development of novel ML-based methods for molecular docking.
arXiv Detail & Related papers (2024-06-09T11:13:03Z)
RGFN: Synthesizable Molecular Generation Using GFlowNets [51.33672611338754]
We propose Reaction-GFlowNet, an extension of the GFlowNet framework that operates directly in the space of chemical reactions. RGFN allows out-of-the-box synthesizability while maintaining comparable quality of generated candidates. We demonstrate the effectiveness of the proposed approach across a range of oracle models, including pretrained proxy models and GPU-accelerated docking.
arXiv Detail & Related papers (2024-06-01T13:11:11Z)
Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge [69.80471117520719]
Re-Dock is a novel diffusion bridge generative model extended to geometric manifold. We propose energy-to-geometry mapping inspired by the Newton-Euler equation to co-model the binding energy and conformations. Experiments on designed benchmark datasets including apo-dock and cross-dock demonstrate our model's superior effectiveness and efficiency over current methods.
arXiv Detail & Related papers (2024-02-18T05:04:50Z)
ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic Polyp Detection [88.4359020192429]
Existing methods either involve computationally expensive context aggregation or lack prior modeling of polyps, resulting in poor performance in challenging cases. In this paper, we propose the Enhanced CenterNet with Contrastive Learning (ECC-PolypDet), a two-stage training & end-to-end inference framework. Box-assisted Contrastive Learning (BCL) is applied during training to minimize the intra-class difference and maximize the inter-class difference between foreground polyps and backgrounds, enabling our model to capture concealed polyps. In the fine-tuning stage, we introduce the IoU-guided Sample Re-weighting
arXiv Detail & Related papers (2024-01-10T07:03:41Z)
Multi-scale Iterative Refinement towards Robust and Versatile Molecular Docking [17.28573902701018]
Molecular docking is a key computational tool utilized to predict the binding conformations of small molecules to protein targets. We introduce DeltaDock, a robust and versatile framework designed for efficient molecular docking.
arXiv Detail & Related papers (2023-11-30T14:09:20Z)
Pre-Training on Large-Scale Generated Docking Conformations with HelixDock to Unlock the Potential of Protein-ligand Structure Prediction Models [42.16524616409125]
In this work, we show that by pre-training on a large-scale docking conformation, we can obtain a protein-ligand structure prediction model with outstanding performance. The proposed model, HelixDock, aims to acquire the physical knowledge encapsulated by the physics-based docking tools during the pre-training phase.
arXiv Detail & Related papers (2023-10-21T05:54:26Z)
Deep Surrogate Docking: Accelerating Automated Drug Discovery with Graph Neural Networks [0.9785311158871759]
We introduce Deep Surrogate Docking (DSD), a framework that applies deep learning-based surrogate modeling to accelerate the docking process substantially. We show that the DSD workflow combined with the FiLMv2 architecture provides a 9.496x speedup in molecule screening with a 3% recall error rate.
arXiv Detail & Related papers (2022-11-04T19:36:02Z)
DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking [28.225704750892795]
Predicting the binding structure of a small molecule ligand to a protein is critical to drug design. Recent deep learning methods that treat docking as a regression problem have decreased runtime compared to traditional search-based methods. We frame molecular docking as a generative modeling problem and develop DiffDock, a diffusion generative model over the non-Euclidean manifold of ligand poses.
arXiv Detail & Related papers (2022-10-04T17:38:14Z)
Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation. GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation. We propose to leverage the Transformer to model this global context with an effective attention mechanism. Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.