HelixFold: An Efficient Implementation of AlphaFold2 using PaddlePaddle
- URL: http://arxiv.org/abs/2207.05477v2
- Date: Wed, 13 Jul 2022 04:58:26 GMT
- Title: HelixFold: An Efficient Implementation of AlphaFold2 using PaddlePaddle
- Authors: Guoxia Wang, Xiaomin Fang, Zhihua Wu, Yiqun Liu, Yang Xue, Yingfei
Xiang, Dianhai Yu, Fan Wang, Yanjun Ma
- Abstract summary: We implement AlphaFold2 using PaddlePaddle, namely HelixFold, to improve training and inference speed and reduce memory consumption.
Compared with the original AlphaFold2 and OpenFold, HelixFold needs only 7.5 days to complete the full end-to-end training.
HelixFold's accuracy could be on par with AlphaFold2 on the CASP14 and CAMEO datasets.
- Score: 19.331098164638544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate protein structure prediction can significantly accelerate the
development of the life sciences. The accuracy of AlphaFold2, a frontier end-to-end
structure prediction system, is already close to that of experimental determination
techniques. However, its complex model architecture and large memory consumption mean
that training and running AlphaFold2 from scratch demands substantial computational
resources and time, making the original system too expensive for most individuals and
institutions; reducing this cost could therefore accelerate progress in the life
sciences. We implement AlphaFold2 with PaddlePaddle, namely HelixFold, to improve
training and inference speed and reduce memory consumption. Performance is improved
through operator fusion, tensor fusion, and hybrid parallelism, while memory is
optimized through Recompute (activation recomputation), BFloat16, and in-place memory
reads and writes. Compared with the original AlphaFold2 (implemented in JAX) and
OpenFold (implemented in PyTorch), HelixFold needs only 7.5 days to complete full
end-to-end training, and only 5.3 days when using hybrid parallelism, whereas both
AlphaFold2 and OpenFold take about 11 days; HelixFold thus roughly halves the training
time. We verified that HelixFold's accuracy is on par with AlphaFold2 on the CASP14
and CAMEO datasets. HelixFold's code is freely available on GitHub:
https://github.com/PaddlePaddle/PaddleHelix/tree/dev/apps/protein_folding/helixfold,
and we also provide stable web services at
https://paddlehelix.baidu.com/app/drug/protein/forecast.
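To make the memory optimizations above concrete, below is a minimal sketch of how activation recomputation ("Recompute") and BFloat16 autocast can be combined in a PaddlePaddle training step. It is not HelixFold's actual code: the `EvoformerishBlock` layer, tensor shapes, and hyperparameters are illustrative stand-ins, and it assumes a PaddlePaddle version that provides `paddle.distributed.fleet.utils.recompute` and accepts `dtype='bfloat16'` in `paddle.amp.auto_cast`, plus hardware with BFloat16 support.

```python
# Minimal sketch (not HelixFold's training loop): combining activation
# recomputation with BFloat16 autocast in PaddlePaddle to reduce memory.
import paddle
import paddle.nn as nn
from paddle.distributed.fleet.utils import recompute  # assumed available in this Paddle version


class EvoformerishBlock(nn.Layer):
    """Illustrative stand-in for a memory-hungry block (e.g. one Evoformer block)."""

    def __init__(self, dim=256):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        return x + self.ff(x)


class TinyModel(nn.Layer):
    def __init__(self, dim=256, depth=4, use_recompute=True):
        super().__init__()
        self.blocks = nn.LayerList([EvoformerishBlock(dim) for _ in range(depth)])
        self.use_recompute = use_recompute

    def forward(self, x):
        for block in self.blocks:
            if self.use_recompute and self.training:
                # Discard this block's intermediate activations in the forward
                # pass and recompute them during backward: trades extra compute
                # for a much smaller activation memory footprint.
                x = recompute(block, x)
            else:
                x = block(x)
        return x


model = TinyModel()
model.train()
opt = paddle.optimizer.Adam(learning_rate=1e-3, parameters=model.parameters())

data = paddle.randn([8, 128, 256])  # [batch, sequence, channels]; sizes are arbitrary
data.stop_gradient = False          # let gradients flow through the recomputed blocks

# BFloat16 autocast: run eligible ops in bf16 to shrink activations and speed up
# matmuls (needs a Paddle build and device with bf16 support).
with paddle.amp.auto_cast(dtype='bfloat16'):
    loss = model(data).mean()
loss.backward()
opt.step()
opt.clear_grad()
```

BFloat16 keeps FP32's dynamic range, so unlike FP16 it typically needs no loss scaling; the operator/tensor fusion and hybrid-parallelism optimizations mentioned in the abstract are separate, framework-level features not shown in this sketch.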
Related papers
- Technical Report of HelixFold3 for Biomolecular Structure Prediction [12.848589926054565]
AlphaFold has transformed protein structure prediction with remarkable accuracy.
HelixFold3 aims to replicate AlphaFold3's capabilities.
The initial release of HelixFold3 is available as open source on GitHub.
arXiv Detail & Related papers (2024-08-30T02:36:36Z)
- Improving AlphaFlow for Efficient Protein Ensembles Generation [64.10918970280603]
We propose a feature-conditioned generative model, AlphaFlow-Lit, for efficient generation of protein ensembles.
AlphaFlow-Lit performs on par with AlphaFlow and surpasses its distilled version without pretraining, while achieving a sampling speedup of around 47x.
arXiv Detail & Related papers (2024-07-08T13:36:43Z)
- ScaleFold: Reducing AlphaFold Initial Training Time to 10 Hours [4.886207598730398]
We conduct a comprehensive analysis of the AlphaFold training procedure based on OpenFold.
We identify inefficient communication and overhead-dominated computation as the key factors that prevented AlphaFold from scaling effectively.
We introduce ScaleFold, a systematic training method that incorporates optimizations targeting these factors.
arXiv Detail & Related papers (2024-04-17T04:55:33Z)
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores [18.016204763652553]
Convolution models with long filters have demonstrated state-of-the-art reasoning abilities in many long-sequence tasks.
The Fast Fourier Transform (FFT) allows long convolutions to run in $O(N \log N)$ time in sequence length $N$ but has poor hardware utilization.
In this paper, we study how to optimize the FFT convolution.
arXiv Detail & Related papers (2023-11-10T07:33:35Z)
- HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative [61.984700682903096]
HelixFold-Single is proposed to combine a large-scale protein language model with the superior geometric learning capability of AlphaFold2.
Our proposed method pre-trains a large-scale protein language model on billions of primary sequences.
We obtain an end-to-end differentiable model to predict the 3D coordinates of atoms from only the primary sequence.
arXiv Detail & Related papers (2022-07-28T07:30:33Z)
- Stochastic Gradient Descent without Full Data Shuffle [65.97105896033815]
CorgiPile is a hierarchical data shuffling strategy that avoids a full data shuffle while maintaining a convergence rate comparable to that of SGD with a full shuffle.
Our results show that CorgiPile achieves a convergence rate comparable to full-shuffle SGD for both deep learning and generalized linear models.
arXiv Detail & Related papers (2022-06-12T20:04:31Z)
- FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours [11.847436777986323]
We propose FastFold, a highly efficient implementation of the protein structure prediction model for training and inference.
FastFold includes a series of GPU optimizations based on a thorough analysis of AlphaFold's performance.
Experimental results show that FastFold reduces overall training time from 11 days to 67 hours and achieves 7.5-9.5X speedup for long-sequence inference.
arXiv Detail & Related papers (2022-03-02T03:59:51Z)
- FNAS: Uncertainty-Aware Fast Neural Architecture Search [54.49650267859032]
Reinforcement learning (RL)-based neural architecture search (NAS) generally guarantees better convergence but requires huge computational resources.
We propose a general pipeline to accelerate the convergence of both the rollout process and the RL process in NAS.
Experiments on the Mobile Neural Architecture Search (MNAS) search space show that the proposed Fast Neural Architecture Search (FNAS) accelerates the standard RL-based NAS process by 10x.
arXiv Detail & Related papers (2021-05-25T06:32:52Z)
- Efficient and Generic 1D Dilated Convolution Layer for Deep Learning [52.899995651639436]
We introduce our efficient implementation of a generic 1D convolution layer covering a wide range of parameters.
It is optimized for x86 CPU architectures, in particular, for architectures containing Intel AVX-512 and AVX-512 BFloat16 instructions.
We demonstrate the performance of our optimized 1D convolution layer by utilizing it in the end-to-end neural network training with real genomics datasets.
arXiv Detail & Related papers (2021-04-16T09:54:30Z)
- XSepConv: Extremely Separated Convolution [60.90871656244126]
We propose a novel extremely separated convolutional block (XSepConv).
It fuses spatially separable convolutions into depthwise convolution to reduce both the computational cost and parameter size of large kernels.
XSepConv is designed to be an efficient alternative to vanilla depthwise convolution with large kernel sizes.
arXiv Detail & Related papers (2020-02-27T11:46:17Z)