Decoupled Sequence and Structure Generation for Realistic Antibody Design
- URL: http://arxiv.org/abs/2402.05982v3
- Date: Fri, 17 Jan 2025 02:28:17 GMT
- Title: Decoupled Sequence and Structure Generation for Realistic Antibody Design
- Authors: Nayoung Kim, Minsu Kim, Sungsoo Ahn, Jinkyoo Park,
- Abstract summary: A dominant paradigm is to train a model to jointly generate the antibody sequence and the structure as a candidate.
We propose an antibody sequence-structure decoupling (ASSD) framework, which separates sequence generation and structure prediction.
ASSD shows improved performance in various antibody design experiments, while the composition-based objective successfully mitigates token repetition of non-autoregressive models.
- Score: 45.72237864940556
- License:
- Abstract: Recently, deep learning has made rapid progress in antibody design, which plays a key role in the advancement of therapeutics. A dominant paradigm is to train a model to jointly generate the antibody sequence and the structure as a candidate. However, the joint generation requires the model to generate both the discrete amino acid categories and the continuous 3D coordinates; this limits the space of possible architectures and may lead to suboptimal performance. In response, we propose an antibody sequence-structure decoupling (ASSD) framework, which separates sequence generation and structure prediction. Although our approach is simple, our idea allows the use of powerful neural architectures and demonstrates notable performance improvements. We also find that the widely used non-autoregressive generators promote sequences with overly repeating tokens. Such sequences are both out-of-distribution and prone to undesirable developability properties that can trigger harmful immune responses in patients. To resolve this, we introduce a composition-based objective that allows an efficient trade-off between high performance and low token repetition. ASSD shows improved performance in various antibody design experiments, while the composition-based objective successfully mitigates token repetition of non-autoregressive models.
Related papers
- Inverse folding for antibody sequence design using deep learning [2.8998926117101367]
We propose a fine-tuned folding inverse model that is specifically optimised for antibody structures.
We study the canonical conformations of complementarity-determining regions and find improved encoding of these loops into known clusters.
arXiv Detail & Related papers (2023-10-30T13:12:41Z) - A Hierarchical Training Paradigm for Antibody Structure-sequence
Co-design [54.30457372514873]
We propose a hierarchical training paradigm (HTP) for the antibody sequence-structure co-design.
HTP consists of four levels of training stages, each corresponding to a specific protein modality.
Empirical experiments show that HTP sets the new state-of-the-art performance in the co-design problem.
arXiv Detail & Related papers (2023-10-30T02:39:15Z) - Cross-Gate MLP with Protein Complex Invariant Embedding is A One-Shot
Antibody Designer [58.97153056120193]
The specificity of an antibody is determined by its complementarity-determining regions (CDRs)
Previous studies have utilized complex techniques to generate CDRs, but they suffer from inadequate geometric modeling.
We propose a textitsimple yet effective model that can co-design 1D sequences and 3D structures of CDRs in a one-shot manner.
arXiv Detail & Related papers (2023-04-21T13:24:26Z) - xTrimoABFold: De novo Antibody Structure Prediction without MSA [77.47606749555686]
We develop a novel model named xTrimoABFold to predict antibody structure from antibody sequence.
The model was trained end-to-end on the antibody structures in PDB by minimizing the ensemble loss of domain-specific focal loss on CDR and the frame-aligned point loss.
arXiv Detail & Related papers (2022-11-30T09:26:08Z) - Incorporating Pre-training Paradigm for Antibody Sequence-Structure
Co-design [134.65287929316673]
Deep learning-based computational antibody design has attracted popular attention since it automatically mines the antibody patterns from data that could be complementary to human experiences.
The computational methods heavily rely on high-quality antibody structure data, which is quite limited.
Fortunately, there exists a large amount of sequence data of antibodies that can help model the CDR and alleviate the reliance on structure data.
arXiv Detail & Related papers (2022-10-26T15:31:36Z) - Reprogramming Pretrained Language Models for Antibody Sequence Infilling [72.13295049594585]
Computational design of antibodies involves generating novel and diverse sequences, while maintaining structural consistency.
Recent deep learning models have shown impressive results, however the limited number of known antibody sequence/structure pairs frequently leads to degraded performance.
In our work we address this challenge by leveraging Model Reprogramming (MR), which repurposes pretrained models on a source language to adapt to the tasks that are in a different language and have scarce data.
arXiv Detail & Related papers (2022-10-05T20:44:55Z) - Benchmarking deep generative models for diverse antibody sequence design [18.515971640245997]
Deep generative models that learn from sequences alone or from sequences and structures jointly have shown impressive performance on this task.
We consider three recently proposed deep generative frameworks for protein design: (AR) the sequence-based autoregressive generative model, (GVP) the precise structure-based graph neural network, and Fold2Seq that leverages a fuzzy and scale-free representation of a three-dimensional fold.
We benchmark these models on the task of computational design of antibody sequences, which demand designing sequences with high diversity for functional implication.
arXiv Detail & Related papers (2021-11-12T16:23:32Z) - Iterative Refinement Graph Neural Network for Antibody
Sequence-Structure Co-design [35.215029426177004]
We propose a generative model to automatically design antibodies with enhanced binding specificity or neutralization capabilities.
Our method achieves superior log-likelihood on the test set and outperforms previous baselines in designing antibodies capable of neutralizing the SARS-CoV-2 virus.
arXiv Detail & Related papers (2021-10-09T18:23:32Z) - Self-Reflective Variational Autoencoder [21.054722609128525]
Variational Autoencoder (VAE) is a powerful framework for learning latent variable generative models.
We introduce a solution, which we call self-reflective inference.
We empirically demonstrate the clear advantages of matching the variational posterior to the exact posterior.
arXiv Detail & Related papers (2020-07-10T05:05:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.