Contrastive Geometric Learning Unlocks Unified Structure- and Ligand-Based Drug Design
- URL: http://arxiv.org/abs/2601.09693v1
- Date: Wed, 14 Jan 2026 18:45:08 GMT
- Title: Contrastive Geometric Learning Unlocks Unified Structure- and Ligand-Based Drug Design
- Authors: Lisa Schneckenreiter, Sohvi Luukkonen, Lukas Friedrich, Daniel Kuhn, Günter Klambauer
- Abstract summary: We introduce Contrastive Geometric Learning for Unified Computational Drug Design (ConGLUDe). ConGLUDe unifies structure- and ligand-based training. It supports virtual screening and target fishing, while being trained jointly on protein-ligand complexes and large-scale bioactivity data.
- Score: 8.578932742190862
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Structure-based and ligand-based computational drug design have traditionally relied on disjoint data sources and modeling assumptions, limiting their joint use at scale. In this work, we introduce Contrastive Geometric Learning for Unified Computational Drug Design (ConGLUDe), a single contrastive geometric model that unifies structure- and ligand-based training. ConGLUDe couples a geometric protein encoder that produces whole-protein representations and implicit embeddings of predicted binding sites with a fast ligand encoder, removing the need for pre-defined pockets. By aligning ligands with both global protein representations and multiple candidate binding sites through contrastive learning, ConGLUDe supports ligand-conditioned pocket prediction in addition to virtual screening and target fishing, while being trained jointly on protein-ligand complexes and large-scale bioactivity data. Across diverse benchmarks, ConGLUDe achieves state-of-the-art zero-shot virtual screening performance in settings where no binding pocket information is provided as input, substantially outperforms existing methods on a challenging target fishing task, and demonstrates competitive ligand-conditioned pocket selection. These results highlight the advantages of unified structure-ligand training and position ConGLUDe as a step toward general-purpose foundation models for drug discovery.
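The abstract describes aligning ligand and protein representations through contrastive learning, then reusing the shared embedding space for virtual screening and target fishing. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a symmetric InfoNCE-style loss over cosine similarities, and the function names, the temperature value, and the random stand-in embeddings are all hypothetical. The actual ConGLUDe encoders and loss are not specified in this listing.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(protein_emb, ligand_emb, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in for the paper's objective).

    Row i of protein_emb and row i of ligand_emb are assumed to be a known
    binding pair; every other row in the batch acts as a negative.
    """
    p = l2_normalize(protein_emb)
    l = l2_normalize(ligand_emb)
    logits = (p @ l.T) / temperature  # (n_proteins, n_ligands) similarities
    diag = np.arange(logits.shape[0])
    # Protein -> ligand direction: each protein must pick out its own ligand.
    protein_to_ligand = -log_softmax(logits, axis=1)[diag, diag].mean()
    # Ligand -> protein direction: each ligand must pick out its own protein.
    ligand_to_protein = -log_softmax(logits, axis=0)[diag, diag].mean()
    return 0.5 * (protein_to_ligand + ligand_to_protein)


def rank_by_similarity(query_emb, candidate_embs):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = l2_normalize(candidate_embs) @ l2_normalize(query_emb)
    return np.argsort(-scores)


# Random stand-ins for encoder outputs (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
proteins = rng.normal(size=(4, 16))
ligands = proteins + 0.05 * rng.normal(size=(4, 16))

loss = symmetric_info_nce(proteins, ligands)
ligand_ranking = rank_by_similarity(proteins[2], ligands)  # virtual screening
target_ranking = rank_by_similarity(ligands[1], proteins)  # target fishing
```

In this sketch, virtual screening and target fishing are the same ranking operation applied in opposite directions: a protein query ranks ligands, and a ligand query ranks proteins. Under the same assumptions, ligand-conditioned pocket selection could be expressed as ranking a protein's candidate binding-site embeddings against a single ligand embedding.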
Related papers
- Rigidity-Aware Geometric Pretraining for Protein Design and Conformational Ensembles [74.32932832937618]
We introduce RigidSSL (Rigidity-Aware Self-Supervised Learning), a geometric pretraining framework. Phase I (RigidSSL-Perturb) learns geometric priors from 432K structures from the AlphaFold Protein Structure Database with simulated perturbations. Phase II (RigidSSL-MD) refines these representations on 1.3K molecular dynamics trajectories to capture physically realistic transitions.
(2026-03-02T21:32:30Z)
- Knowledge Graphs as Structured Memory for Embedding Spaces: From Training Clusters to Explainable Inference [3.2945446636945963]
Graph Memory (GM) is a structured non-parametric framework that augments embedding-based inference with a compact, relational memory over region-level prototypes. By explicitly modeling reliability and relational structure, GM provides a principled bridge between local evidence and global consistency in non-parametric learning.
(2025-11-18T23:02:59Z)
- scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration [53.683726781791385]
We introduce a scalable and flexible generative framework called single-cell Multi-omics Regularized Disentangled Representations (scMRDR) for unpaired multi-omics integration. Our method achieves excellent performance on benchmark datasets in terms of batch correction, modality alignment, and biological signal preservation.
(2025-10-28T21:28:39Z)
- Structure-Aware Contrastive Learning with Fine-Grained Binding Representations for Drug Discovery [3.1716746406651457]
This work introduces a sequence-based drug-target interaction framework that integrates structural priors into protein representations. The model achieves state-of-the-art performance on Human and BioSNAP datasets and remains competitive on BindingDB.
(2025-09-18T09:38:46Z)
- A Geometric Graph-Based Deep Learning Model for Drug-Target Affinity Prediction [0.0]
We introduce DeepGGL, a deep convolutional neural network that integrates residual connections and an attention mechanism within a geometric graph learning framework. By leveraging multiscale weighted colored bipartite subgraphs, DeepGGL effectively captures fine-grained atom-level interactions in protein-ligand complexes across multiple scales. DeepGGL consistently maintained high predictive accuracy, highlighting its adaptability and reliability for binding affinity prediction in structure-based drug discovery.
(2025-09-15T14:06:39Z)
- CAME-AB: Cross-Modality Attention with Mixture-of-Experts for Antibody Binding Site Prediction [9.316793780511917]
CAME-AB is a novel Cross-modality Attention framework for antibody binding site prediction. It integrates raw amino acid encodings, BLOSUM substitution profiles, pretrained language model embeddings, structure-aware features, and biochemical graphs. It consistently outperforms strong baselines on multiple metrics, including Precision, Recall, F1-score, AUC-ROC, and MCC.
(2025-09-08T09:24:09Z)
- PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs [88.98041407783502]
PRING is the first benchmark that evaluates protein-protein interaction prediction from a graph-level perspective. PRING curates a high-quality, multi-species PPI network dataset comprising 21,484 proteins and 186,818 interactions.
(2025-07-07T15:21:05Z)
- Learning Clustering-based Prototypes for Compositional Zero-shot Learning [56.57299428499455]
ClusPro is a robust clustering-based prototype mining framework for Compositional Zero-Shot Learning. It defines the conceptual boundaries of primitives through a set of diversified prototypes. ClusPro efficiently performs prototype clustering in a non-parametric fashion without the introduction of additional learnable parameters.
(2025-02-10T14:20:01Z)
- ProFSA: Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment [20.012210194899605]
We propose a novel pocket pretraining approach that leverages knowledge from high-resolution atomic protein structures.
Our method, named ProFSA, achieves state-of-the-art performance across various tasks, including pocket druggability prediction.
Our work opens up a new avenue for mitigating the scarcity of protein-ligand complex data through the utilization of high-quality and diverse protein structure databases.
(2023-10-11T06:36:23Z)
- Geometric Deep Learning for Structure-Based Drug Design: A Survey [83.87489798671155]
Structure-based drug design (SBDD) leverages the three-dimensional geometry of proteins to identify potential drug candidates.
Recent advancements in geometric deep learning, which effectively integrate and process 3D geometric data, have significantly propelled the field forward.
(2023-06-20T14:21:58Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
(2020-06-22T08:35:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.