AbRank: A Benchmark Dataset and Metric-Learning Framework for Antibody-Antigen Affinity Ranking
- URL: http://arxiv.org/abs/2506.17857v1
- Date: Sat, 21 Jun 2025 23:34:46 GMT
- Title: AbRank: A Benchmark Dataset and Metric-Learning Framework for Antibody-Antigen Affinity Ranking
- Authors: Chunan Liu, Aurelien Pelissier, Yanjun Shao, Lilian Denzler, Andrew C. R. Martin, Brooks Paige, Mariia Rodriguez Martinez,
- Abstract summary: AbRank is a large-scale benchmark and evaluation framework that reframes affinity prediction as a pairwise ranking problem.<n>We introduce WALLE-Affinity, a graph-based approach that integrates protein language model embeddings with structural information to predict pairwise binding preferences.
- Score: 3.6572710422983445
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurate prediction of antibody-antigen (Ab-Ag) binding affinity is essential for therapeutic design and vaccine development, yet the performance of current models is limited by noisy experimental labels, heterogeneous assay conditions, and poor generalization across the vast antibody and antigen sequence space. We introduce AbRank, a large-scale benchmark and evaluation framework that reframes affinity prediction as a pairwise ranking problem. AbRank aggregates over 380,000 binding assays from nine heterogeneous sources, spanning diverse antibodies, antigens, and experimental conditions, and introduces standardized data splits that systematically increase distribution shift, from local perturbations such as point mutations to broad generalization across novel antigens and antibodies. To ensure robust supervision, AbRank defines an m-confident ranking framework by filtering out comparisons with marginal affinity differences, focusing training on pairs with at least an m-fold difference in measured binding strength. As a baseline for the benchmark, we introduce WALLE-Affinity, a graph-based approach that integrates protein language model embeddings with structural information to predict pairwise binding preferences. Our benchmarks reveal significant limitations in current methods under realistic generalization settings and demonstrate that ranking-based training improves robustness and transferability. In summary, AbRank offers a robust foundation for machine learning models to generalize across the antibody-antigen space, with direct relevance for scalable, structure-aware antibody therapeutic design.
Related papers
- BURN: Backdoor Unlearning via Adversarial Boundary Analysis [73.14147934175604]
Backdoor unlearning aims to remove backdoor-related information while preserving the model's original functionality.<n>We propose Backdoor Unlearning via adversaRial bouNdary analysis (BURN), a novel defense framework that integrates false correlation decoupling, progressive data refinement, and model purification.
arXiv Detail & Related papers (2025-07-14T17:13:06Z) - Conformation-Aware Structure Prediction of Antigen-Recognizing Immune Proteins [4.747546562792329]
We introduce Ibex, a pan-immunoglobulin structure prediction model.<n>It achieves state-of-the-art accuracy in modeling the variable domains of antibodies, nanobodies, and T-cell receptors.
arXiv Detail & Related papers (2025-07-11T22:09:03Z) - Benchmark for Antibody Binding Affinity Maturation and Design [11.905797701155263]
AbBiBench is a benchmarking framework for antibody binding affinity maturation and design.<n>We first curate, standardize, and share 9 datasets containing 9 antigens.<n>We compare 14 protein models including masked language models, autoregressive language models, inverse folding models, diffusion-based generative models, and geometric graph models.
arXiv Detail & Related papers (2025-05-23T21:09:04Z) - Sequence-Only Prediction of Binding Affinity Changes: A Robust and Interpretable Model for Antibody Engineering [9.789817970737666]
A pivotal area of research in antibody engineering is to find effective modifications that enhance antibody-antigen binding affinity.<n>Deep learning solutions offer an alternative by modeling antibody structures to predict binding affinity changes.<n>We propose ProtAttBA, a deep learning model that predicts binding affinity changes based solely on the sequence information of antibody-antigen complexes.
arXiv Detail & Related papers (2025-05-14T15:00:46Z) - Relation-Aware Equivariant Graph Networks for Epitope-Unknown Antibody Design and Specificity Optimization [61.06622479173572]
We propose a novel Relation-Aware Design (RAAD) framework, which models antigen-antibody interactions for co-designing sequences and structures of antigen-specific CDRs.<n> Furthermore, we propose a new evaluation metric to better measure antibody specificity and develop a contrasting specificity-enhancing constraint to optimize the specificity of antibodies.
arXiv Detail & Related papers (2024-12-14T03:00:44Z) - Evaluating Zero-Shot Scoring for In Vitro Antibody Binding Prediction
with Experimental Validation [0.08968838300743379]
We compare 8 common scoring paradigms based on open-source models to classify antibody designs as binders or non-binders.
Results show that existing methods struggle to detect binders, and performance is highly variable across antigens.
arXiv Detail & Related papers (2023-12-07T23:34:03Z) - Forecast reconciliation for vaccine supply chain optimization [61.13962963550403]
Vaccine supply chain optimization can benefit from hierarchical time series forecasting.
Forecasts of different hierarchy levels become incoherent when higher levels do not match the sum of the lower levels forecasts.
We tackle the vaccine sale forecasting problem by modeling sales data from GSK between 2010 and 2021 as a hierarchical time series.
arXiv Detail & Related papers (2023-05-02T14:34:34Z) - xTrimoABFold: De novo Antibody Structure Prediction without MSA [77.47606749555686]
We develop a novel model named xTrimoABFold to predict antibody structure from antibody sequence.
The model was trained end-to-end on the antibody structures in PDB by minimizing the ensemble loss of domain-specific focal loss on CDR and the frame-aligned point loss.
arXiv Detail & Related papers (2022-11-30T09:26:08Z) - Incorporating Pre-training Paradigm for Antibody Sequence-Structure
Co-design [134.65287929316673]
Deep learning-based computational antibody design has attracted popular attention since it automatically mines the antibody patterns from data that could be complementary to human experiences.
The computational methods heavily rely on high-quality antibody structure data, which is quite limited.
Fortunately, there exists a large amount of sequence data of antibodies that can help model the CDR and alleviate the reliance on structure data.
arXiv Detail & Related papers (2022-10-26T15:31:36Z) - Reprogramming Pretrained Language Models for Antibody Sequence Infilling [72.13295049594585]
Computational design of antibodies involves generating novel and diverse sequences, while maintaining structural consistency.
Recent deep learning models have shown impressive results, however the limited number of known antibody sequence/structure pairs frequently leads to degraded performance.
In our work we address this challenge by leveraging Model Reprogramming (MR), which repurposes pretrained models on a source language to adapt to the tasks that are in a different language and have scarce data.
arXiv Detail & Related papers (2022-10-05T20:44:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.