Generative AI for Controllable Protein Sequence Design: A Survey
- URL: http://arxiv.org/abs/2402.10516v1
- Date: Fri, 16 Feb 2024 09:05:02 GMT
- Title: Generative AI for Controllable Protein Sequence Design: A Survey
- Authors: Yiheng Zhu, Zitai Kong, Jialu Wu, Weize Liu, Yuqiang Han, Mingze Yin,
Hongxia Xu, Chang-Yu Hsieh and Tingjun Hou
- Abstract summary: We systematically review recent advances in generative AI for controllable protein sequence design.
To set the stage, we first outline the foundational tasks in protein sequence design in terms of the constraints involved.
We then offer in-depth reviews of each design task and discuss the pertinent applications.
- Score: 2.3502958706414905
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The design of novel protein sequences with targeted functionalities underpins
a central theme in protein engineering, impacting diverse fields such as drug
discovery and enzymatic engineering. However, navigating this vast
combinatorial search space remains a severe challenge due to time and financial
constraints. This scenario is rapidly evolving as the transformative
advancements in AI, particularly in the realm of generative models and
optimization algorithms, have been propelling the protein design field towards
an unprecedented revolution. In this survey, we systematically review recent
advances in generative AI for controllable protein sequence design. To set the
stage, we first outline the foundational tasks in protein sequence design in
terms of the constraints involved and present key generative models and
optimization algorithms. We then offer in-depth reviews of each design task and
discuss the pertinent applications. Finally, we identify the unresolved
challenges and highlight research opportunities that merit deeper exploration.
Related papers
- Protein Design by Integrating Machine Learning with Quantum Annealing and Quantum-inspired Optimization [0.0]
The protein design problem involves finding polypeptide sequences folding into a given three-dimensional structure.
Recent machine learning breakthroughs have enabled accurate and rapid structure predictions.
We introduce a general protein design scheme where algorithmic and technological advancements in machine learning and quantum-inspired algorithms can be integrated.
arXiv Detail & Related papers (2024-07-09T18:42:45Z)
- Generative AI Agent for Next-Generation MIMO Design: Fundamentals, Challenges, and Vision [76.4345564864002]
Next-generation multiple input multiple output (MIMO) is expected to be intelligent and scalable.
We propose the concept of the generative AI agent, which is capable of generating tailored and specialized contents.
We present two compelling case studies that demonstrate the effectiveness of leveraging the generative AI agent for performance analysis.
arXiv Detail & Related papers (2024-04-13T02:39:36Z)
- xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein [76.18058946124111]
We propose a unified protein language model, xTrimoPGLM, to address protein understanding and generation tasks simultaneously.
xTrimoPGLM significantly outperforms other advanced baselines in 18 protein understanding benchmarks across four categories.
It can also generate de novo protein sequences following the principles of natural ones, and can perform programmable generation after supervised fine-tuning.
arXiv Detail & Related papers (2024-01-11T15:03:17Z)
- Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict the thermostability changes in protein upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z)
- A Hierarchical Training Paradigm for Antibody Structure-sequence Co-design [54.30457372514873]
We propose a hierarchical training paradigm (HTP) for the antibody sequence-structure co-design.
HTP consists of four levels of training stages, each corresponding to a specific protein modality.
Empirical experiments show that HTP sets the new state-of-the-art performance in the co-design problem.
arXiv Detail & Related papers (2023-10-30T02:39:15Z)
- Generative artificial intelligence for de novo protein design [1.2021565114959365]
Generative architectures seem adept at generating novel, yet realistic proteins.
Design protocols now achieve experimental success rates nearing 20%.
Despite extensive progress, there are clear field-wide challenges.
arXiv Detail & Related papers (2023-10-15T00:02:22Z)
- Protein Sequence Design with Batch Bayesian Optimisation [0.0]
Protein sequence design is a challenging problem in protein engineering, which aims to discover novel proteins with useful biological functions.
Directed evolution is a widely used approach for protein sequence design, which mimics the evolution cycle in a laboratory environment and conducts an iterative protocol.
We propose a new method based on Batch Bayesian Optimization (Batch BO), a well-established optimization method, for protein sequence design.
arXiv Detail & Related papers (2023-03-18T14:53:20Z)
- Structured Q-learning For Antibody Design [82.78798397798533]
One of the crucial steps involved in antibody design is to find an arrangement of amino acids in a protein sequence that improves its binding with a pathogen.
Combinatorial optimization of antibodies is difficult due to extremely large search spaces and non-linear objectives.
Applying traditional Reinforcement Learning to antibody design optimization results in poor performance.
We propose Structured Q-learning, an extension of Q-learning that incorporates structural priors for optimization.
arXiv Detail & Related papers (2022-09-10T15:36:55Z)
- ODBO: Bayesian Optimization with Search Space Prescreening for Directed Protein Evolution [18.726398852721204]
We propose an efficient, experimental design-oriented closed-loop optimization framework for protein directed evolution.
ODBO employs a combination of a novel low-dimensional protein encoding strategy and Bayesian optimization enhanced with search space prescreening via outlier detection.
We conduct and report four protein directed evolution experiments that substantiate the capability of the proposed framework for finding variants with properties of interest.
arXiv Detail & Related papers (2022-05-19T13:21:31Z)
- Protein sequence design with deep generative models [0.34410212782758054]
We highlight recent applications of machine learning to generate protein sequences, focusing on the emerging field of deep generative methods.
Protein engineering seeks to identify protein sequences with optimized properties. When guided by machine learning, protein sequence generation methods can draw on prior knowledge and experimental efforts to improve this process.
arXiv Detail & Related papers (2021-04-09T16:08:15Z)
- AdaLead: A simple and robust adaptive greedy search algorithm for sequence design [55.41644538483948]
We develop an easy-to-implement, scalable, and robust evolutionary greedy algorithm (AdaLead).
AdaLead is a remarkably strong benchmark that out-competes more complex state of the art approaches in a variety of biologically motivated sequence design challenges.
arXiv Detail & Related papers (2020-10-05T16:40:38Z)
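Several entries above (directed evolution, Batch BO, AdaLead) describe iterative protocols that repeatedly mutate sequences, score them, and keep the best. The following is a minimal sketch of that general loop only. It is not the algorithm of any listed paper. The names `toy_fitness`, `mutate`, and `directed_evolution`, the target string, and all parameter values are hypothetical, and the fitness function is a toy stand-in for a laboratory assay.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_fitness(seq, target):
    """Hypothetical stand-in for an assay: fraction of positions matching a target."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen position substituted."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def directed_evolution(start, target, rounds=50, library_size=40, keep=5, seed=0):
    """Greedy mutate-screen-select loop (illustrative assumption, not a published method)."""
    rng = random.Random(seed)
    parents = [start]
    history = [toy_fitness(start, target)]
    for _ in range(rounds):
        # Diversify: build a variant library by mutating randomly chosen parents.
        library = [mutate(rng.choice(parents), rng) for _ in range(library_size)]
        # Screen and select: parents stay in the pool, so the best score never drops.
        pool = set(parents) | set(library)
        # Sort by descending fitness, breaking ties alphabetically for determinism.
        parents = sorted(pool, key=lambda s: (-toy_fitness(s, target), s))[:keep]
        history.append(toy_fitness(parents[0], target))
    return parents[0], history


if __name__ == "__main__":
    target = "MKTAYIAKQR"  # arbitrary illustrative target, not a real design goal
    best, history = directed_evolution("A" * len(target), target)
    print(best, history[-1])
```

Because the current parents are always retained in the selection pool, the best score in `history` is non-decreasing across rounds. Real pipelines replace `toy_fitness` with experimental measurements or a learned surrogate model.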
This list is automatically generated from the titles and abstracts of the papers on this site.