Related papers: Leveraging Large Language Models to Predict Antibody Biological Activity Against Influenza A Hemagglutinin

Leveraging Large Language Models to Predict Antibody Biological Activity Against Influenza A Hemagglutinin

URL: http://arxiv.org/abs/2502.00694v1
Date: Sun, 02 Feb 2025 06:48:45 GMT
Title: Leveraging Large Language Models to Predict Antibody Biological Activity Against Influenza A Hemagglutinin
Authors: Ella Barkan, Ibrahim Siddiqui, Kevin J. Cheng, Alex Golts, Yoel Shoshan, Jeffrey K. Weber, Yailin Campos Mota, Michal Ozery-Flato, Giuseppe A. Sautto,
Abstract summary: We develop an AI model for predicting the binding and receptor blocking activity of antibodies against influenza A hemagglutininin (HA) antigens.<n>Our models achieved an AUROC $geq$ 0.91 for predicting the activity of existing antibodies against seen HAs and an AUROC of 0.9 for unseen HAs.
Score: 0.15547733154162566
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Monoclonal antibodies (mAbs) represent one of the most prevalent FDA-approved modalities for treating autoimmune diseases, infectious diseases, and cancers. However, discovery and development of therapeutic antibodies remains a time-consuming and expensive process. Recent advancements in machine learning (ML) and artificial intelligence (AI) have shown significant promise in revolutionizing antibody discovery and optimization. In particular, models that predict antibody biological activity enable in-silico evaluation of binding and functional properties; such models can prioritize antibodies with the highest likelihoods of success in costly and time-intensive laboratory testing procedures. We here explore an AI model for predicting the binding and receptor blocking activity of antibodies against influenza A hemagglutinin (HA) antigens. Our present model is developed with the MAMMAL framework for biologics discovery to predict antibody-antigen interactions using only sequence information. To evaluate the model's performance, we tested it under various data split conditions to mimic real-world scenarios. Our models achieved an AUROC $\geq$ 0.91 for predicting the activity of existing antibodies against seen HAs and an AUROC of 0.9 for unseen HAs. For novel antibody activity prediction, the AUROC was 0.73, which further declined to 0.63-0.66 under stringent constraints on similarity to existing antibodies. These results demonstrate the potential of AI foundation models to transform antibody design by reducing dependence on extensive laboratory testing and enabling more efficient prioritization of antibody candidates. Moreover, our findings emphasize the critical importance of diverse and comprehensive antibody datasets to improve the generalization of prediction models, particularly for novel antibody development.

Related papers

Benchmark for Antibody Binding Affinity Maturation and Design [11.905797701155263]
AbBiBench is a benchmarking framework for antibody binding affinity maturation and design.<n>We first curate, standardize, and share 9 datasets containing 9 antigens.<n>We compare 14 protein models including masked language models, autoregressive language models, inverse folding models, diffusion-based generative models, and geometric graph models.
arXiv Detail & Related papers (2025-05-23T21:09:04Z)
Llama-Affinity: A Predictive Antibody Antigen Binding Model Integrating Antibody Sequences with Llama3 Backbone Architecture [2.474908349649168]
We present an advanced antibody-antigen binding affinity prediction model (Llamafinity)<n>The model achieved an accuracy of 0.9640, an F1-score of 0.9643, a precision of 0.9702, a recall of 0.9586, and an AUC-ROC of 0.9936.<n>This strategy unveiled higher computational efficiency, with a five-fold average cumulative training time of only 0.46 hours.
arXiv Detail & Related papers (2025-05-17T20:10:54Z)
dyAb: Flow Matching for Flexible Antibody Design with AlphaFold-driven Pre-binding Antigen [52.809470467635194]
Development of therapeutic antibodies heavily relies on accurate predictions of how antigens will interact with antibodies. Existing computational methods in antibody design often overlook crucial conformational changes that antigens undergo during the binding process. We introduce dyAb, a flexible framework that incorporates AlphaFold2-driven predictions to model pre-binding antigen structures.
arXiv Detail & Related papers (2025-03-01T03:53:18Z)
Relation-Aware Equivariant Graph Networks for Epitope-Unknown Antibody Design and Specificity Optimization [61.06622479173572]
We propose a novel Relation-Aware Design (RAAD) framework, which models antigen-antibody interactions for co-designing sequences and structures of antigen-specific CDRs.<n> Furthermore, we propose a new evaluation metric to better measure antibody specificity and develop a contrasting specificity-enhancing constraint to optimize the specificity of antibodies.
arXiv Detail & Related papers (2024-12-14T03:00:44Z)
Generative Humanization for Therapeutic Antibodies [4.456125565993172]
Humanization is a sequence optimization strategy that addresses one critical risk called immunogenicity.<n>We re-frame humanization as a conditional generative modeling task, where humanizing mutations are sampled from a language model trained on human antibody data.<n>We describe a sampling process that incorporates models of therapeutic attributes, such as antigen binding affinity, to obtain candidate sequences that have both reduced immunogenicity risk and maintained or improved therapeutic properties.
arXiv Detail & Related papers (2024-12-06T02:54:45Z)
Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization [51.28231365213679]
We tackle antigen-specific antibody sequence-structure co-design as an optimization problem towards specific preferences. We propose direct energy-based preference optimization to guide the generation of antibodies with both rational structures and considerable binding affinities to given antigens.
arXiv Detail & Related papers (2024-03-25T09:41:49Z)
AI driven B-cell Immunotherapy Design [0.0]
The effectiveness of antigen neutralisation and elimination hinges upon the strength, sensitivity, and specificity of the paratope-epitope interaction. In recent years, artificial intelligence and machine learning methods have made significant strides, revolutionising the prediction of protein structures and their complexes. This review focuses on the progress of machine learning-based tools and their frameworks in the domain of B-cell immunotherapy design.
arXiv Detail & Related papers (2023-09-03T09:14:10Z)
xTrimoABFold: De novo Antibody Structure Prediction without MSA [77.47606749555686]
We develop a novel model named xTrimoABFold to predict antibody structure from antibody sequence. The model was trained end-to-end on the antibody structures in PDB by minimizing the ensemble loss of domain-specific focal loss on CDR and the frame-aligned point loss.
arXiv Detail & Related papers (2022-11-30T09:26:08Z)
Incorporating Pre-training Paradigm for Antibody Sequence-Structure Co-design [134.65287929316673]
Deep learning-based computational antibody design has attracted popular attention since it automatically mines the antibody patterns from data that could be complementary to human experiences. The computational methods heavily rely on high-quality antibody structure data, which is quite limited. Fortunately, there exists a large amount of sequence data of antibodies that can help model the CDR and alleviate the reliance on structure data.
arXiv Detail & Related papers (2022-10-26T15:31:36Z)
Antibody Representation Learning for Drug Discovery [7.291511531280898]
We present results on a novel SARS-CoV-2 antibody binding dataset and an additional benchmark dataset. We compare three classes of models: conventional statistical sequence models, supervised learning on each dataset independently, and fine-tuning an antibody specific pre-trained language model. Experimental results suggest that self-supervised pretraining of feature representation consistently offers significant improvement in over previous approaches.
arXiv Detail & Related papers (2022-10-05T13:48:41Z)
AntBO: Towards Real-World Automated Antibody Design with Combinatorial Bayesian Optimisation [53.43922443725598]
We present AntBO: a Combinatorial optimisation algorithm enabling efficient in silico design of the CDRH3 region. To benchmark AntBO, we use the Absolut! software suite as a black-box oracle because it can score the target specificity and affinity of designed antibodies in silico. In under 200 protein designs, AntBO can suggest antibody sequences that outperform the best binding sequence drawn from 6.9 million experimentally obtained CDRH3s.
arXiv Detail & Related papers (2022-01-29T12:03:04Z)
Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics [109.70543391923344]
CLaSS (Controlled Latent attribute Space Sampling) is an efficient computational method for attribute-controlled generation of molecules. We screen the generated molecules for additional key attributes by using deep learning classifiers in conjunction with novel features derived from atomistic simulations. The proposed approach is demonstrated for designing non-toxic antimicrobial peptides (AMPs) with strong broad-spectrum potency.
arXiv Detail & Related papers (2020-05-22T15:57:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.