Improving AlphaFlow for Efficient Protein Ensembles Generation
- URL: http://arxiv.org/abs/2407.12053v1
- Date: Mon, 8 Jul 2024 13:36:43 GMT
- Title: Improving AlphaFlow for Efficient Protein Ensembles Generation
- Authors: Shaoning Li, Mingyu Li, Yusong Wang, Xinheng He, Nanning Zheng, Jian Zhang, Pheng-Ann Heng
- Abstract summary: We propose a feature-conditioned generative model called AlphaFlow-Lit to realize efficient protein ensembles generation.
AlphaFlow-Lit performs on par with AlphaFlow and surpasses its distilled version without pretraining, while accelerating sampling by a factor of around 47.
- Score: 64.10918970280603
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Investigating the conformational landscapes of proteins is crucial to understanding their biological functions and properties. AlphaFlow stands out as a sequence-conditioned generative model that introduces flexibility into structure prediction models by fine-tuning AlphaFold under the flow-matching framework. Despite the efficient sampling afforded by flow-matching, AlphaFlow still requires multiple runs of AlphaFold to generate a single conformation. Because of AlphaFold's heavy computational cost, its applicability is limited when sampling larger sets of protein ensembles or longer chains within a constrained timeframe. In this work, we propose a feature-conditioned generative model called AlphaFlow-Lit to realize efficient protein ensemble generation. In contrast to full fine-tuning on the entire structure, we focus solely on the lightweight structure module to reconstruct the conformation. AlphaFlow-Lit performs on par with AlphaFlow and surpasses its distilled version without pretraining, while accelerating sampling by a factor of around 47. This gain in efficiency showcases the potential of AlphaFlow-Lit to enable faster and more scalable generation of protein ensembles.
Related papers
- Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations.
Deep generative models have shown promise in generating protein conformations as a more efficient alternative.
We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z)
- Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation [55.93511121486321]
We introduce FoldFlow-2, a novel sequence-conditioned flow matching model for protein structure generation.
We train FoldFlow-2 at scale on a new dataset that is an order of magnitude larger than PDB datasets of prior works.
We empirically observe that FoldFlow-2 outperforms previous state-of-the-art protein structure-based generative models.
arXiv Detail & Related papers (2024-05-30T17:53:50Z)
- AlphaFold Meets Flow Matching for Generating Protein Ensembles [11.1639408863378]
We develop a flow-based generative modeling approach for learning and sampling the conformational landscapes of proteins.
Our method provides a superior combination of precision and diversity compared to AlphaFold with MSA subsampling.
Our method can diversify a static PDB structure with faster wall-clock convergence to certain equilibrium properties than replicate MD trajectories.
arXiv Detail & Related papers (2024-02-07T13:44:47Z)
- SE(3)-Stochastic Flow Matching for Protein Backbone Generation [54.951832422425454]
We introduce FoldFlow, a series of novel generative models of increasing modeling power based on the flow-matching paradigm over $3\mathrm{D}$ rigid motions.
Our family of FoldFlow generative models offers several advantages over previous approaches to the generative modeling of proteins.
arXiv Detail & Related papers (2023-10-03T19:24:24Z)
- AlphaFold Distillation for Protein Design [25.190210443632825]
Inverse protein folding is crucial in bio-engineering and drug discovery.
Forward folding models like AlphaFold offer a potential solution by accurately predicting structures from sequences.
We propose using knowledge distillation on folding model confidence metrics to create a faster and end-to-end differentiable distilled model.
arXiv Detail & Related papers (2022-10-05T19:43:06Z)
- Unsupervisedly Prompting AlphaFold2 for Few-Shot Learning of Accurate Folding Landscape and Protein Structure Prediction [28.630603355510324]
We present EvoGen, a meta generative model, to remedy the underperformance of AlphaFold2 for poor MSA targets.
By prompting the model with calibrated or virtually generated homologue sequences, EvoGen helps AlphaFold2 fold accurately in low-data regime.
arXiv Detail & Related papers (2022-08-20T10:23:17Z)
- HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative [61.984700682903096]
HelixFold-Single is proposed to combine a large-scale protein language model with the superior geometric learning capability of AlphaFold2.
Our proposed method pre-trains a large-scale protein language model with thousands of millions of primary sequences.
We obtain an end-to-end differentiable model to predict the 3D coordinates of atoms from only the primary sequence.
arXiv Detail & Related papers (2022-07-28T07:30:33Z)
- FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours [11.847436777986323]
We propose FastFold, a highly efficient implementation of the protein structure prediction model for training and inference.
FastFold includes a series of GPU optimizations based on a thorough analysis of AlphaFold's performance.
Experimental results show that FastFold reduces overall training time from 11 days to 67 hours and achieves 7.5-9.5X speedup for long-sequence inference.
arXiv Detail & Related papers (2022-03-02T03:59:51Z)
- EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)