Llama-Affinity: A Predictive Antibody Antigen Binding Model Integrating Antibody Sequences with Llama3 Backbone Architecture
- URL: http://arxiv.org/abs/2506.09052v1
- Date: Sat, 17 May 2025 20:10:54 GMT
- Title: Llama-Affinity: A Predictive Antibody Antigen Binding Model Integrating Antibody Sequences with Llama3 Backbone Architecture
- Authors: Delower Hossain, Ehsan Saghapour, Kevin Song, Jake Y. Chen,
- Abstract summary: We present an advanced antibody-antigen binding affinity prediction model (Llamafinity)<n>The model achieved an accuracy of 0.9640, an F1-score of 0.9643, a precision of 0.9702, a recall of 0.9586, and an AUC-ROC of 0.9936.<n>This strategy unveiled higher computational efficiency, with a five-fold average cumulative training time of only 0.46 hours.
- Score: 2.474908349649168
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Antibody-facilitated immune responses are central to the body's defense against pathogens, viruses, and other foreign invaders. The ability of antibodies to specifically bind and neutralize antigens is vital for maintaining immunity. Over the past few decades, bioengineering advancements have significantly accelerated therapeutic antibody development. These antibody-derived drugs have shown remarkable efficacy, particularly in treating cancer, SARS-CoV-2, autoimmune disorders, and infectious diseases. Traditionally, experimental methods for affinity measurement have been time-consuming and expensive. With the advent of artificial intelligence, in silico medicine has been revolutionized; recent developments in machine learning, particularly the use of large language models (LLMs) for representing antibodies, have opened up new avenues for AI-based design and improved affinity prediction. Herein, we present an advanced antibody-antigen binding affinity prediction model (LlamaAffinity), leveraging an open-source Llama 3 backbone and antibody sequence data sourced from the Observed Antibody Space (OAS) database. The proposed approach shows significant improvement over existing state-of-the-art (SOTA) methods (AntiFormer, AntiBERTa, AntiBERTy) across multiple evaluation metrics. Specifically, the model achieved an accuracy of 0.9640, an F1-score of 0.9643, a precision of 0.9702, a recall of 0.9586, and an AUC-ROC of 0.9936. Moreover, this strategy unveiled higher computational efficiency, with a five-fold average cumulative training time of only 0.46 hours, significantly lower than in previous studies.
Related papers
- Benchmark for Antibody Binding Affinity Maturation and Design [11.905797701155263]
AbBiBench is a benchmarking framework for antibody binding affinity maturation and design.<n>We first curate, standardize, and share 9 datasets containing 9 antigens.<n>We compare 14 protein models including masked language models, autoregressive language models, inverse folding models, diffusion-based generative models, and geometric graph models.
arXiv Detail & Related papers (2025-05-23T21:09:04Z) - dyAb: Flow Matching for Flexible Antibody Design with AlphaFold-driven Pre-binding Antigen [52.809470467635194]
Development of therapeutic antibodies heavily relies on accurate predictions of how antigens will interact with antibodies.<n>Existing computational methods in antibody design often overlook crucial conformational changes that antigens undergo during the binding process.<n>We introduce dyAb, a flexible framework that incorporates AlphaFold2-driven predictions to model pre-binding antigen structures.
arXiv Detail & Related papers (2025-03-01T03:53:18Z) - Leveraging Large Language Models to Predict Antibody Biological Activity Against Influenza A Hemagglutinin [0.15547733154162566]
We develop an AI model for predicting the binding and receptor blocking activity of antibodies against influenza A hemagglutininin (HA) antigens.<n>Our models achieved an AUROC $geq$ 0.91 for predicting the activity of existing antibodies against seen HAs and an AUROC of 0.9 for unseen HAs.
arXiv Detail & Related papers (2025-02-02T06:48:45Z) - Relation-Aware Equivariant Graph Networks for Epitope-Unknown Antibody Design and Specificity Optimization [61.06622479173572]
We propose a novel Relation-Aware Design (RAAD) framework, which models antigen-antibody interactions for co-designing sequences and structures of antigen-specific CDRs.<n> Furthermore, we propose a new evaluation metric to better measure antibody specificity and develop a contrasting specificity-enhancing constraint to optimize the specificity of antibodies.
arXiv Detail & Related papers (2024-12-14T03:00:44Z) - Large scale paired antibody language models [40.401345152825314]
We present IgBert and IgT5, the best performing antibody-specific language models developed to date.
These models are trained comprehensively using the more than two billion Observed Space dataset.
This advancement marks a significant leap forward in leveraging machine learning, large data sets and high-performance computing for enhancing antibody design for therapeutic development.
arXiv Detail & Related papers (2024-03-26T17:21:54Z) - xTrimoABFold: De novo Antibody Structure Prediction without MSA [77.47606749555686]
We develop a novel model named xTrimoABFold to predict antibody structure from antibody sequence.
The model was trained end-to-end on the antibody structures in PDB by minimizing the ensemble loss of domain-specific focal loss on CDR and the frame-aligned point loss.
arXiv Detail & Related papers (2022-11-30T09:26:08Z) - Incorporating Pre-training Paradigm for Antibody Sequence-Structure
Co-design [134.65287929316673]
Deep learning-based computational antibody design has attracted popular attention since it automatically mines the antibody patterns from data that could be complementary to human experiences.
The computational methods heavily rely on high-quality antibody structure data, which is quite limited.
Fortunately, there exists a large amount of sequence data of antibodies that can help model the CDR and alleviate the reliance on structure data.
arXiv Detail & Related papers (2022-10-26T15:31:36Z) - Antibody Representation Learning for Drug Discovery [7.291511531280898]
We present results on a novel SARS-CoV-2 antibody binding dataset and an additional benchmark dataset.
We compare three classes of models: conventional statistical sequence models, supervised learning on each dataset independently, and fine-tuning an antibody specific pre-trained language model.
Experimental results suggest that self-supervised pretraining of feature representation consistently offers significant improvement in over previous approaches.
arXiv Detail & Related papers (2022-10-05T13:48:41Z) - AntBO: Towards Real-World Automated Antibody Design with Combinatorial
Bayesian Optimisation [53.43922443725598]
We present AntBO: a Combinatorial optimisation algorithm enabling efficient in silico design of the CDRH3 region.
To benchmark AntBO, we use the Absolut! software suite as a black-box oracle because it can score the target specificity and affinity of designed antibodies in silico.
In under 200 protein designs, AntBO can suggest antibody sequences that outperform the best binding sequence drawn from 6.9 million experimentally obtained CDRH3s.
arXiv Detail & Related papers (2022-01-29T12:03:04Z) - Accelerating Antimicrobial Discovery with Controllable Deep Generative
Models and Molecular Dynamics [109.70543391923344]
CLaSS (Controlled Latent attribute Space Sampling) is an efficient computational method for attribute-controlled generation of molecules.
We screen the generated molecules for additional key attributes by using deep learning classifiers in conjunction with novel features derived from atomistic simulations.
The proposed approach is demonstrated for designing non-toxic antimicrobial peptides (AMPs) with strong broad-spectrum potency.
arXiv Detail & Related papers (2020-05-22T15:57:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.