Related papers: EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models

EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models

URL: http://arxiv.org/abs/2105.04771v1
Date: Tue, 11 May 2021 03:40:29 GMT
Title: EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models
Authors: Jiaxiang Wu, Shitong Luo, Tao Shen, Haidong Lan, Sheng Wang, Junzhou Huang
Abstract summary: We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network. Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
Score: 53.17320541056843
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate protein structure prediction from amino-acid sequences is critical to better understanding the protein function. Recent advances in this area largely benefit from more precise inter-residue distance and orientation predictions, powered by deep neural networks. However, the structure optimization procedure is still dominated by traditional tools, e.g. Rosetta, where the structure is solved via minimizing a pre-defined statistical energy function (with optional prediction-based restraints). Such energy function may not be optimal in formulating the whole conformation space of proteins. In this paper, we propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network. This network is trained in a denoising manner, attempting to predict the correction signal from corrupted distance matrices between Ca atoms. Once the network is well trained, Langevin dynamics based sampling is adopted to gradually optimize structures from random initialization. Extensive experiments demonstrate that our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.

Related papers

Quantum Algorithm for Protein Side-Chain Optimisation: Comparing Quantum to Classical Methods [0.0]
We develop a resource-efficient optimisation algorithm to compute the ground state energy of protein structures.<n>We propose a quantum algorithm based on the Quantum Approximate optimisation algorithm to explore the conformational space and identify low-energy configurations.
arXiv Detail & Related papers (2025-07-25T15:37:04Z)
AMix-1: A Pathway to Test-Time Scalable Protein Foundation Model [92.51919604882984]
We introduce AMix-1, a powerful protein foundation model built on Flow Bayesian Networks.<n>AMix-1 is empowered by a systematic training methodology, encompassing pretraining scaling laws, emergent capability analysis, in-context learning mechanism, and test-time scaling algorithm.<n>Building on this foundation, we devise a multiple sequence alignment (MSA)-based in-context learning strategy to unify protein design into a general framework.
arXiv Detail & Related papers (2025-07-11T17:02:25Z)
MSNGO: multi-species protein function annotation based on 3D protein structure and network propagation [38.732449945780246]
We propose the MSNGO model, which integrates structural features and network propagation methods. Our validation shows that using structural features can significantly improve the accuracy of multi-species protein function prediction.
arXiv Detail & Related papers (2025-03-29T08:35:45Z)
Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations. Deep generative models have shown promise in generating protein conformations as a more efficient alternative. We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z)
Endowing Protein Language Models with Structural Knowledge [5.587293092389789]
We introduce a novel framework that enhances protein language models by integrating protein structural data. The refined model, termed Protein Structure Transformer (PST), is further pretrained on a small protein structure database. PST consistently outperforms the state-of-the-art foundation model for protein sequences, ESM-2, setting a new benchmark in protein function prediction.
arXiv Detail & Related papers (2024-01-26T12:47:54Z)
Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization. We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z)
DeepGATGO: A Hierarchical Pretraining-Based Graph-Attention Model for Automatic Protein Function Prediction [4.608328575930055]
Automatic protein function prediction (AFP) is classified as a large-scale multi-label classification problem. Currently, popular methods primarily combine protein-related information and Gene Ontology (GO) terms to generate final functional predictions. We propose a sequence-based hierarchical prediction method, DeepGATGO, which processes protein sequences and GO term labels hierarchically.
arXiv Detail & Related papers (2023-07-24T07:01:32Z)
Predicting protein variants with equivariant graph neural networks [0.0]
We compare the abilities of equivariant graph neural networks (EGNNs) and sequence-based approaches to identify promising amino-acid mutations. Our proposed structural approach achieves a competitive performance to sequence-based approaches while being trained on significantly fewer molecules.
arXiv Detail & Related papers (2023-06-21T12:44:52Z)
Structure-informed Language Models Are Protein Designers [69.70134899296912]
We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs) We conduct a structural surgery on pLMs, where a lightweight structural adapter is implanted into pLMs and endows it with structural awareness. Experiments show that our approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-02-03T10:49:52Z)
Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein. Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules. Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
ODBO: Bayesian Optimization with Search Space Prescreening for Directed Protein Evolution [18.726398852721204]
We propose an efficient, experimental design-oriented closed-loop optimization framework for protein directed evolution. ODBO employs a combination of novel low-dimensional protein encoding strategy and Bayesian optimization enhanced with search space prescreening via outlier detection. We conduct and report four protein directed evolution experiments that substantiate the capability of the proposed framework for finding variants with properties of interest.
arXiv Detail & Related papers (2022-05-19T13:21:31Z)
Transfer Learning for Protein Structure Classification at Low Resolution [124.5573289131546]
We show that it is possible to make accurate ($geq$80%) predictions of protein class and architecture from structures determined at low ($leq$3A) resolution. We provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function.
arXiv Detail & Related papers (2020-08-11T15:01:32Z)
Fast differentiable DNA and protein sequence optimization for molecular design [0.0]
Machine learning models that accurately predict biological fitness from sequence are becoming a powerful tool for molecular design. Here, we build on a previously proposed straight-through approximation method to optimize through discrete sequence samples. The resulting algorithm, which we call Fast SeqPropProp, achieves up to 100-fold faster convergence compared to previous versions.
arXiv Detail & Related papers (2020-05-22T17:03:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.