xTrimoABFold: De novo Antibody Structure Prediction without MSA
- URL: http://arxiv.org/abs/2212.00735v3
- Date: Fri, 5 May 2023 03:52:01 GMT
- Title: xTrimoABFold: De novo Antibody Structure Prediction without MSA
- Authors: Yining Wang, Xumeng Gong, Shaochuan Li, Bing Yang, YiWu Sun, Chuan
Shi, Yangang Wang, Cheng Yang, Hui Li, Le Song
- Abstract summary: We develop a novel model named xTrimoABFold to predict antibody structure from antibody sequence.
The model was trained end-to-end on the antibody structures in PDB by minimizing the ensemble loss of domain-specific focal loss on CDR and the frame-aligned point loss.
- Score: 77.47606749555686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of antibody engineering, an essential task is to design a novel
antibody whose paratopes bind to a specific antigen with correct epitopes.
Understanding antibody structure and its paratope can facilitate a mechanistic
understanding of its function. Therefore, antibody structure prediction from
its sequence alone has always been a highly valuable problem for de novo
antibody design. AlphaFold2, a breakthrough in the field of structural biology,
provides a solution to predict protein structure based on protein sequences and
computationally expensive coevolutionary multiple sequence alignments (MSAs).
However, its computational cost and its limited prediction accuracy on
antibodies, especially on the complementarity-determining regions (CDRs),
limit its application in industrial high-throughput drug
design. To learn an informative representation of antibodies, we employed a
deep antibody language model (ALM) on curated sequences from the observed
antibody space database via a transformer model. We also developed a novel
model named xTrimoABFold to predict antibody structure from antibody sequence
based on the pretrained ALM as well as efficient evoformers and structural
modules. The model was trained end-to-end on the antibody structures in PDB by
minimizing the ensemble loss of domain-specific focal loss on CDR and the
frame-aligned point loss. xTrimoABFold outperforms AlphaFold2 and other protein
language model-based SOTA methods, e.g., OmegaFold, HelixFold-Single, and IgFold, by
a large and significant margin (30+% improvement in RMSD) while running 151
times faster than AlphaFold2. To the best of our knowledge, xTrimoABFold
achieved state-of-the-art antibody structure prediction. Its improvement in
both accuracy and efficiency makes it a valuable tool for de novo antibody
design and could make further improvements in immuno-theory.
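The abstract names a focal loss on the CDR as one training term but does not give its form. As a rough illustration only, the sketch below uses the generic focal loss, FL(p) = -alpha * (1 - p)^gamma * log(p), averaged over positions flagged by a hypothetical CDR mask. The function names, the mask, the default `gamma` and `alpha` values, and the toy probabilities are assumptions for illustration and are not taken from the paper.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0, eps=1e-12):
    """Generic focal loss for one prediction.

    p_true is the probability the model assigned to the correct class.
    gamma down-weights easy (high-probability) examples; gamma=0 reduces
    the expression to plain cross-entropy scaled by alpha.
    """
    p = min(max(p_true, eps), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


def masked_focal_loss(p_true_per_residue, cdr_mask, gamma=2.0):
    """Mean focal loss over residues where cdr_mask is truthy.

    cdr_mask is a hypothetical 0/1 list marking CDR positions; how the
    paper selects or weights CDR residues is not stated in the abstract.
    """
    terms = [
        focal_loss(p, gamma=gamma)
        for p, m in zip(p_true_per_residue, cdr_mask)
        if m
    ]
    return sum(terms) / len(terms) if terms else 0.0


# Toy data (assumed): probability given to the correct class at five
# residues, with the middle three treated as CDR positions.
probs = [0.95, 0.60, 0.30, 0.80, 0.99]
mask = [0, 1, 1, 1, 0]
print(masked_focal_loss(probs, mask))
```

With `gamma` above zero, confidently correct residues contribute little, so the average is dominated by the poorly predicted positions. That is the usual motivation for a focal loss on hard regions such as CDR loops.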
Related papers
- S$^2$ALM: Sequence-Structure Pre-trained Large Language Model for Comprehensive Antibody Representation Learning [8.059724314850799]
Antibodies safeguard our health through their precise and potent binding to specific antigens, demonstrating promising therapeutic efficacy in the treatment of numerous diseases, including COVID-19.
Recent advancements in biomedical language models have shown the great potential to interpret complex biological structures and functions.
This paper proposes Sequence-Structure multi-level pre-trained antibody Language Model (S$^2$ALM), combining holistic sequential and structural information in one unified, generic antibody foundation model.
arXiv Detail & Related papers (2024-11-20T14:24:26Z)
- Efficient Antibody Structure Refinement Using Energy-Guided SE(3) Flow Matching [16.192361788505558]
FlowAB is a novel antibody structure refinement method based on energy-guided flow matching.
It achieves new state-of-the-art performance on the antibody structure prediction task when used in conjunction with an appropriate prior model.
arXiv Detail & Related papers (2024-10-22T04:13:55Z)
- Retrieval Augmented Diffusion Model for Structure-informed Antibody Design and Optimization [8.546688995090491]
Antibodies are essential proteins responsible for immune responses in organisms.
Recent advances in generative models have significantly enhanced rational antibody design.
We propose a retrieval-augmented diffusion framework, termed RADAb, for efficient antibody design.
arXiv Detail & Related papers (2024-10-19T08:53:01Z)
- Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization [51.28231365213679]
We tackle antigen-specific antibody sequence-structure co-design as an optimization problem towards specific preferences.
We propose direct energy-based preference optimization to guide the generation of antibodies with both rational structures and considerable binding affinities to given antigens.
arXiv Detail & Related papers (2024-03-25T09:41:49Z)
- A Hierarchical Training Paradigm for Antibody Structure-sequence Co-design [54.30457372514873]
We propose a hierarchical training paradigm (HTP) for the antibody sequence-structure co-design.
HTP consists of four levels of training stages, each corresponding to a specific protein modality.
Empirical experiments show that HTP sets the new state-of-the-art performance in the co-design problem.
arXiv Detail & Related papers (2023-10-30T02:39:15Z)
- Incorporating Pre-training Paradigm for Antibody Sequence-Structure Co-design [134.65287929316673]
Deep learning-based computational antibody design has attracted popular attention since it automatically mines the antibody patterns from data that could be complementary to human experiences.
The computational methods heavily rely on high-quality antibody structure data, which is quite limited.
Fortunately, there exists a large amount of sequence data of antibodies that can help model the CDR and alleviate the reliance on structure data.
arXiv Detail & Related papers (2022-10-26T15:31:36Z)
- Reprogramming Pretrained Language Models for Antibody Sequence Infilling [72.13295049594585]
Computational design of antibodies involves generating novel and diverse sequences, while maintaining structural consistency.
Recent deep learning models have shown impressive results; however, the limited number of known antibody sequence/structure pairs frequently leads to degraded performance.
In our work we address this challenge by leveraging Model Reprogramming (MR), which repurposes pretrained models on a source language to adapt to the tasks that are in a different language and have scarce data.
arXiv Detail & Related papers (2022-10-05T20:44:55Z)
- AntBO: Towards Real-World Automated Antibody Design with Combinatorial Bayesian Optimisation [53.43922443725598]
We present AntBO: a combinatorial optimisation algorithm enabling efficient in silico design of the CDRH3 region.
To benchmark AntBO, we use the Absolut! software suite as a black-box oracle because it can score the target specificity and affinity of designed antibodies in silico.
In under 200 protein designs, AntBO can suggest antibody sequences that outperform the best binding sequence drawn from 6.9 million experimentally obtained CDRH3s.
arXiv Detail & Related papers (2022-01-29T12:03:04Z)
- Iterative Refinement Graph Neural Network for Antibody Sequence-Structure Co-design [35.215029426177004]
We propose a generative model to automatically design antibodies with enhanced binding specificity or neutralization capabilities.
Our method achieves superior log-likelihood on the test set and outperforms previous baselines in designing antibodies capable of neutralizing the SARS-CoV-2 virus.
arXiv Detail & Related papers (2021-10-09T18:23:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.