Technical Report of HelixFold3 for Biomolecular Structure Prediction
- URL: http://arxiv.org/abs/2408.16975v2
- Date: Mon, 9 Sep 2024 02:44:49 GMT
- Title: Technical Report of HelixFold3 for Biomolecular Structure Prediction
- Authors: Lihang Liu, Shanzhuo Zhang, Yang Xue, Xianbin Ye, Kunrui Zhu, Yuxin Li, Yang Liu, Wenlai Zhao, Hongkun Yu, Zhihua Wu, Xiaonan Zhang, Xiaomin Fang,
- Abstract summary: AlphaFold has transformed protein structure prediction with remarkable accuracy.
HelixFold3 aims to replicate AlphaFold3's capabilities.
The initial release of HelixFold3 is available as open source on GitHub.
- Score: 12.848589926054565
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The AlphaFold series has transformed protein structure prediction with remarkable accuracy, often matching experimental methods. AlphaFold2, AlphaFold-Multimer, and the latest AlphaFold3 represent significant strides in predicting single protein chains, protein complexes, and biomolecular structures. While AlphaFold2 and AlphaFold-Multimer are open-sourced, facilitating rapid and reliable predictions, AlphaFold3 remains partially accessible through a limited online server and has not been open-sourced, restricting further development. To address these challenges, the PaddleHelix team is developing HelixFold3, aiming to replicate AlphaFold3's capabilities. Using insights from previous models and extensive datasets, HelixFold3 achieves an accuracy comparable to AlphaFold3 in predicting the structures of conventional ligands, nucleic acids, and proteins. The initial release of HelixFold3 is available as open source on GitHub for academic research, promising to advance biomolecular research and accelerate discoveries. We also provide online service at PaddleHelix website at https://paddlehelix.baidu.com/app/all/helixfold3/forecast.
Related papers
- Improving AlphaFlow for Efficient Protein Ensembles Generation [64.10918970280603]
We propose a feature-conditioned generative model called AlphaFlow-Lit to realize efficient protein ensembles generation.
AlphaFlow-Lit performs on-par with AlphaFlow and surpasses its distilled version without pretraining, all while achieving a significant sampling acceleration of around 47 times.
arXiv Detail & Related papers (2024-07-08T13:36:43Z) - HelixFold-Multimer: Elevating Protein Complex Structure Prediction to New Heights [7.702856943171886]
We highlight the ongoing advancements of our protein complex structure prediction model, HelixFold-Multimer.
HelixFold-Multimer provides precise predictions for diverse protein complex structures, especially in therapeutic protein interactions.
HelixFold-Multimer is now available for public use on the PaddleHelix platform, offering both a general version and an antigen-antibody version.
arXiv Detail & Related papers (2024-04-16T03:29:37Z) - DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design [62.68420322996345]
Existing structured-based drug design methods treat all ligand atoms equally.
We propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold.
Our approach achieves state-of-the-art performance in generating high-affinity molecules.
arXiv Detail & Related papers (2024-02-26T05:21:21Z) - xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering
the Language of Protein [76.18058946124111]
We propose a unified protein language model, xTrimoPGLM, to address protein understanding and generation tasks simultaneously.
xTrimoPGLM significantly outperforms other advanced baselines in 18 protein understanding benchmarks across four categories.
It can also generate de novo protein sequences following the principles of natural ones, and can perform programmable generation after supervised fine-tuning.
arXiv Detail & Related papers (2024-01-11T15:03:17Z) - An Equivariant Generative Framework for Molecular Graph-Structure
Co-Design [54.92529253182004]
We present MolCode, a machine learning-based generative framework for underlineMolecular graph-structure underlineCo-design.
In MolCode, 3D geometric information empowers the molecular 2D graph generation, which in turn helps guide the prediction of molecular 3D structure.
Our investigation reveals that the 2D topology and 3D geometry contain intrinsically complementary information in molecule design.
arXiv Detail & Related papers (2023-04-12T13:34:22Z) - Beating the Best: Improving on AlphaFold2 at Protein Structure
Prediction [1.3124513975412255]
ARStack significantly outperforms AlphaFold2 and RoseTTAFold.
We rigorously demonstrate this using two sets of non-homologous proteins, and a test set of protein structures published after that of AlphaFold2 and RoseTTAFold.
arXiv Detail & Related papers (2023-01-18T14:39:34Z) - Unsupervisedly Prompting AlphaFold2 for Few-Shot Learning of Accurate
Folding Landscape and Protein Structure Prediction [28.630603355510324]
We present EvoGen, a meta generative model, to remedy the underperformance of AlphaFold2 for poor MSA targets.
By prompting the model with calibrated or virtually generated homologue sequences, EvoGen helps AlphaFold2 fold accurately in low-data regime.
arXiv Detail & Related papers (2022-08-20T10:23:17Z) - HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein
Language Model as an Alternative [61.984700682903096]
HelixFold-Single is proposed to combine a large-scale protein language model with the superior geometric learning capability of AlphaFold2.
Our proposed method pre-trains a large-scale protein language model with thousands of millions of primary sequences.
We obtain an end-to-end differentiable model to predict the 3D coordinates of atoms from only the primary sequence.
arXiv Detail & Related papers (2022-07-28T07:30:33Z) - HelixFold: An Efficient Implementation of AlphaFold2 using PaddlePaddle [19.331098164638544]
We implement AlphaFold2 using PaddlePaddle, namely HelixFold, to improve training and inference speed and reduce memory consumption.
Compared with the original AlphaFold2 and OpenFold, HelixFold needs only 7.5 days to complete the full end-to-end training.
HelixFold's accuracy could be on par with AlphaFold2 on the CASP14 and CAMEO datasets.
arXiv Detail & Related papers (2022-07-12T11:43:50Z) - Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model
for Protein Design [70.27706384570723]
We propose Fold2Seq, a novel framework for designing protein sequences conditioned on a specific target fold.
We show improved or comparable performance of Fold2Seq in terms of speed, coverage, and reliability for sequence design.
The unique advantages of fold-based Fold2Seq, in comparison to a structure-based deep model and RosettaDesign, become more evident on three additional real-world challenges.
arXiv Detail & Related papers (2021-06-24T14:34:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.