Latent Retrieval Augmented Generation of Cross-Domain Protein Binders
- URL: http://arxiv.org/abs/2510.10480v2
- Date: Thu, 16 Oct 2025 11:55:27 GMT
- Title: Latent Retrieval Augmented Generation of Cross-Domain Protein Binders
- Authors: Zishen Zhang, Xiangzhe Kong, Wenbing Huang, Yang Liu
- Abstract summary: We propose Retrieval-Augmented Diffusion for Aligned interface (RADiAnce) to guide the design of novel protein binders. By unifying retrieval and generation in a shared contrastive latent space, our model efficiently identifies relevant interfaces for a given binding site. Our work establishes a new paradigm for protein binder design that successfully bridges retrieval-based knowledge and generative AI.
- Score: 22.891733948881512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Designing protein binders targeting specific sites, which requires generating realistic and functional interaction patterns, is a fundamental challenge in drug discovery. Current structure-based generative models are limited in generating interfaces with sufficient rationality and interpretability. In this paper, we propose Retrieval-Augmented Diffusion for Aligned interface (RADiAnce), a new framework that leverages known interfaces to guide the design of novel binders. By unifying retrieval and generation in a shared contrastive latent space, our model efficiently identifies relevant interfaces for a given binding site and seamlessly integrates them through a conditional latent diffusion generator, enabling cross-domain interface transfer. Extensive experiments show that RADiAnce significantly outperforms baseline models across multiple metrics, including binding affinity and recovery of geometries and interactions. Additional experimental results validate cross-domain generalization, demonstrating that retrieving interfaces from diverse domains, such as peptides, antibodies, and protein fragments, enhances the generation performance of binders for other domains. Our work establishes a new paradigm for protein binder design that successfully bridges retrieval-based knowledge and generative AI, opening new possibilities for drug discovery.
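The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the RADiAnce implementation: the abstract does not specify the contrastive objective or the similarity measure, so the InfoNCE-style loss, cosine similarity, the toy 3-dimensional latents, and the function names `retrieve_top_k` and `info_nce` are all assumptions made for illustration. The conditional latent diffusion generator that consumes the retrieved interfaces is not shown.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two latent vectors."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(query, database, k=2):
    """Return the names of the k stored interface latents closest to the query.

    `database` maps a (hypothetical) interface name to its latent vector.
    """
    ranked = sorted(database.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


def info_nce(site_latents, interface_latents, temperature=0.1):
    """InfoNCE-style loss (assumed objective): site i should match interface i."""
    n = len(site_latents)
    total = 0.0
    for i in range(n):
        logits = [cosine(site_latents[i], interface_latents[j]) / temperature
                  for j in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / n


# Toy latents standing in for encoded interfaces from different domains.
database = {
    "peptide_A": [1.0, 0.0, 0.0],
    "antibody_B": [0.0, 1.0, 0.0],
    "fragment_C": [0.7, 0.7, 0.0],
}
binding_site_latent = [1.0, 0.05, 0.0]
print(retrieve_top_k(binding_site_latent, database, k=2))
```

Because binding sites and interfaces are embedded in one space, interfaces from any domain can be ranked against the same query, which is what would make cross-domain retrieval possible in a setup like this.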
Related papers
- AutoBinder Agent: An MCP-Based Agent for End-to-End Protein Binder Design [8.190052071911001]
We present an agentic end-to-end drug design framework that leverages a Large Language Model (LLM) and the Model Context Protocol (MCP). The system integrates four state-of-the-art components: MaSIF for geometric deep learning-based identification of protein-site interaction sites, Rosetta for grafting protein fragments onto protein backbones, ProteinMPNN for amino acid sequences, and AlphaFold3 for near-protein accuracy in complex structure prediction.
arXiv Detail & Related papers (2026-01-16T08:57:03Z)
- High-Fidelity Scientific Simulation Surrogates via Adaptive Implicit Neural Representations [51.90920900332569]
Implicit neural representations (INRs) offer a compact and continuous framework for modeling spatially structured data. Recent approaches address this by introducing additional features along rigid geometric structures. We propose a simple yet effective alternative: Feature-Adaptive INR (FA-INR).
arXiv Detail & Related papers (2025-06-07T16:45:17Z)
- UniGenX: a unified generative foundation model that couples sequence, structure and function to accelerate scientific design across proteins, molecules and materials [62.72989417755985]
We present UniGenX, a unified generative model for function in natural systems. UniGenX represents heterogeneous inputs as a mixed stream of symbolic and numeric tokens. It achieves state-of-the-art or competitive performance for the function-aware generation across domains.
arXiv Detail & Related papers (2025-03-09T16:43:07Z)
- Towards More Accurate Full-Atom Antibody Co-Design [44.06939390661133]
Co-design represents a critical frontier in drug development, where accurate prediction of structure of complementarity-determining regions is essential for targeting specific equivariants. Despite recent advances in graph neural networks for antibody design, current approaches often fall short in capturing the intricate interactions that govern antibody-antigen recognition and binding specificity. We present Igformer, a novel end-to-end framework that addresses these limitations through personalized antibody-antigen binding interfaces.
arXiv Detail & Related papers (2025-02-11T13:33:28Z)
- Geometric-informed GFlowNets for Structure-Based Drug Design [4.8722087770556906]
We employ Generative Flow Networks (GFlowNets) to explore the vast space of drug-like molecules.
We introduce a novel modification to the GFlowNet framework by incorporating trigonometrically consistent embeddings.
Experiments conducted using CrossDocked 2020 demonstrated an improvement in the binding affinity between generated molecules and protein pockets.
arXiv Detail & Related papers (2024-06-16T09:32:19Z)
- A Hierarchical Training Paradigm for Antibody Structure-sequence Co-design [54.30457372514873]
We propose a hierarchical training paradigm (HTP) for the antibody sequence-structure co-design.
HTP consists of four levels of training stages, each corresponding to a specific protein modality.
Empirical experiments show that HTP sets the new state-of-the-art performance in the co-design problem.
arXiv Detail & Related papers (2023-10-30T02:39:15Z)
- Target-aware Variational Auto-encoders for Ligand Generation with Multimodal Protein Representation Learning [2.01243755755303]
We introduce TargetVAE, a target-aware auto-encoder that generates ligands with high binding affinities to arbitrary protein targets.
This is the first effort to unify different representations of proteins into a single model, which we name the Protein Multimodal Network (PMN).
arXiv Detail & Related papers (2023-08-02T12:08:17Z)
- Geometric Deep Learning for Structure-Based Drug Design: A Survey [83.87489798671155]
Structure-based drug design (SBDD) leverages the three-dimensional geometry of proteins to identify potential drug candidates.
Recent advancements in geometric deep learning, which effectively integrate and process 3D geometric data, have significantly propelled the field forward.
arXiv Detail & Related papers (2023-06-20T14:21:58Z)
- Semantic Correspondence with Transformers [68.37049687360705]
We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images.
We include appearance affinity modelling to disambiguate the initial correlation maps and multi-level aggregation.
We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
arXiv Detail & Related papers (2021-06-04T14:39:03Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)