AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation
- URL: http://arxiv.org/abs/2506.05768v1
- Date: Fri, 06 Jun 2025 05:52:19 GMT
- Title: AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation
- Authors: Wenyu Zhu, Jianhui Wang, Bowen Gao, Yinjun Jia, Haichuan Tan, Ya-Qin Zhang, Wei-Ying Ma, Yanyan Lan
- Abstract summary: We introduce an alignment-and-aggregation framework to enable accurate virtual screening under structural uncertainty. We evaluate our method on a newly curated benchmark of apo structures, where it significantly outperforms state-of-the-art methods in the blind apo setting.
- Score: 18.8920680373474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Virtual screening (VS) is a critical component of modern drug discovery, yet most existing methods, whether physics-based or deep learning-based, are developed around holo protein structures with known ligand-bound pockets. Consequently, their performance degrades significantly on apo or predicted structures such as those from AlphaFold2, which are more representative of real-world early-stage drug discovery, where pocket information is often missing. In this paper, we introduce an alignment-and-aggregation framework to enable accurate virtual screening under structural uncertainty. Our method comprises two core components: (1) a tri-modal contrastive learning module that aligns representations of the ligand, the holo pocket, and cavities detected from structures, thereby enhancing robustness to pocket localization error; and (2) a cross-attention based adapter for dynamically aggregating candidate binding sites, enabling the model to learn from activity data even without precise pocket annotations. We evaluated our method on a newly curated benchmark of apo structures, where it significantly outperforms state-of-the-art methods in the blind apo setting, improving the early enrichment factor (EF1%) from 11.75 to 37.19. Notably, it also maintains strong performance on holo structures. These results demonstrate the promise of our approach in advancing first-in-class drug discovery, particularly in scenarios lacking experimentally resolved protein-ligand complexes.
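The abstract reports its headline result as the early enrichment factor at 1% (EF1%). As a rough illustration of how that metric is commonly computed, the sketch below ranks compounds by predicted score and compares the hit rate in the top 1% with the hit rate over the whole library. The function name `enrichment_factor`, the rounding of the top-fraction size, and the tie handling are assumptions made for this example; the abstract does not specify the paper's exact evaluation protocol.

```python
import math
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float],
    labels: Sequence[int],
    fraction: float = 0.01,
) -> float:
    """Hit rate among the top-ranked `fraction` of compounds divided by the overall hit rate.

    `labels` holds 1 for an active compound and 0 for a decoy.
    Hypothetical helper for illustration, not the authors' code.
    """
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and of equal length")

    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")

    # Size of the top-ranked subset (assumption: round up, keep at least one compound).
    n_top = max(1, math.ceil(fraction * n_total))

    # Rank compounds from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])

    return (hits_in_top / n_top) / (n_actives / n_total)
```

For a library of 1000 compounds with 10 actives, finding 4 actives among the 10 top-ranked compounds gives a hit rate of 0.4 against a baseline of 0.01, so EF1% = 40. A random ranking gives an EF1% of about 1 on average.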
Related papers
- DISPROTBENCH: A Disorder-Aware, Task-Rich Benchmark for Evaluating Protein Structure Prediction in Realistic Biological Contexts [76.59606029593085]
DisProtBench is a benchmark for evaluating protein structure prediction models (PSPMs) under structural disorder and complex biological conditions. DisProtBench spans three key axes: data complexity, task diversity, and interpretability. Results reveal significant variability in model robustness under disorder, with low-confidence regions linked to functional prediction failures.
arXiv Detail & Related papers (2025-06-18T23:58:22Z) - AlphaFold Database Debiasing for Robust Inverse Folding [58.792020809180336]
We introduce a Debiasing Structure AutoEncoder (DeSAE) that learns to reconstruct native-like conformations from intentionally corrupted backbone geometries. At inference, applying DeSAE to AFDB structures produces debiased structures that significantly improve inverse folding performance.
arXiv Detail & Related papers (2025-06-10T02:25:31Z) - SE(3)-Equivariant Ternary Complex Prediction Towards Target Protein Degradation [28.648225112411637]
Targeted protein degradation (TPD) induced by small molecules has emerged as a rapidly evolving modality in drug discovery. DeepTernary is a novel deep learning-based approach that directly predicts ternary structures in an end-to-end manner.
arXiv Detail & Related papers (2025-02-26T06:33:24Z) - Fast and Accurate Blind Flexible Docking [79.88520988144442]
Molecular docking, which predicts the bound structures of small molecules (ligands) to their protein targets, plays a vital role in drug discovery. We propose FABFlex, a fast and accurate regression-based multi-task learning model designed for realistic blind flexible docking scenarios.
arXiv Detail & Related papers (2025-02-20T07:31:13Z) - One-step Structure Prediction and Screening for Protein-Ligand Complexes using Multi-Task Geometric Deep Learning [6.605588716386855]
We show that structure prediction and screening can be accurately tackled with a single model, namely LigPose, based on multi-task geometric deep learning.
LigPose represents the ligand and the protein pair as a graph, with the learning of binding strength and atomic interactions as auxiliary tasks.
Experiments show LigPose achieved state-of-the-art performance on major tasks in drug research.
arXiv Detail & Related papers (2024-08-21T05:53:50Z) - Fast and Reliable Probabilistic Reflectometry Inversion with Prior-Amortized Neural Posterior Estimation [73.81105275628751]
Finding all structures compatible with reflectometry data is computationally prohibitive for standard algorithms.
We address this lack of reliability with a probabilistic deep learning method that identifies all realistic structures in seconds.
Our method, Prior-Amortized Neural Posterior Estimation (PANPE), combines simulation-based inference with novel adaptive priors.
arXiv Detail & Related papers (2024-07-26T10:29:16Z) - ProFSA: Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment [20.012210194899605]
We propose a novel pocket pretraining approach that leverages knowledge from high-resolution atomic protein structures.
Our method, named ProFSA, achieves state-of-the-art performance across various tasks, including pocket druggability prediction.
Our work opens up a new avenue for mitigating the scarcity of protein-ligand complex data through the utilization of high-quality and diverse protein structure databases.
arXiv Detail & Related papers (2023-10-11T06:36:23Z) - PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling [0.0]
We describe for the first time a deep-learning framework for structure-based pharmacophore modeling to address this challenge.
PharmacoNet is significantly faster than state-of-the-art structure-based approaches, yet reasonably accurate with a simple scoring function.
arXiv Detail & Related papers (2023-10-01T14:13:09Z) - Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework [61.34862133870934]
We propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions.
Our scheme improves IoU by 21.96% across a wide array of corruptions and by 4.97% on the general benchmark.
arXiv Detail & Related papers (2023-09-03T06:35:07Z) - Transfer Learning for Protein Structure Classification at Low Resolution [124.5573289131546]
We show that it is possible to make accurate ($\geq$80%) predictions of protein class and architecture from structures determined at low ($\leq$3 Å) resolution.
We provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function.
arXiv Detail & Related papers (2020-08-11T15:01:32Z) - Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions [80.12620331438052]
Deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features.
Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets.
We argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance.
arXiv Detail & Related papers (2020-06-25T08:46:37Z) - Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts [80.69440684790925]
DeepRelations is a physics-inspired deep relational network with intrinsically explainable architecture.
It shows superior interpretability to the state-of-the-art.
It boosts the AUPRC of contact prediction by 9.5-, 16.9-, 19.3-, and 5.7-fold for the test, compound-unique, protein-unique, and both-unique sets, respectively.
arXiv Detail & Related papers (2019-12-29T00:14:07Z)