EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models
 - URL: http://arxiv.org/abs/2105.04771v1
 - Date: Tue, 11 May 2021 03:40:29 GMT
 - Title: EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models
 - Authors: Jiaxiang Wu, Shitong Luo, Tao Shen, Haidong Lan, Sheng Wang, Junzhou Huang
 - Abstract summary: We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach efficiently produces high-quality decoys compared with traditional Rosetta-based structure optimization routines.
 - Score: 53.17320541056843
 - License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
 - Abstract:   Accurate protein structure prediction from amino-acid sequences is critical
to a better understanding of protein function. Recent advances in this area
largely benefit from more precise inter-residue distance and orientation
predictions, powered by deep neural networks. However, the structure
optimization procedure is still dominated by traditional tools, e.g., Rosetta,
where the structure is solved by minimizing a pre-defined statistical energy
function (with optional prediction-based restraints). Such an energy function
may not be optimal for modeling the whole conformation space of proteins. In
this paper, we propose a fully-differentiable approach for protein structure
optimization, guided by a data-driven generative network. This network is
trained in a denoising manner, attempting to predict the correction signal from
corrupted distance matrices between Cα atoms. Once the network is well trained,
Langevin-dynamics-based sampling is adopted to gradually optimize structures
from random initializations. Extensive experiments demonstrate that our
EBM-Fold approach efficiently produces high-quality decoys compared with
traditional Rosetta-based structure optimization routines.
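
As a concrete illustration of the pipeline the abstract describes (denoising training on corrupted Cα distance matrices, followed by Langevin-dynamics refinement from random initializations), the following is a minimal PyTorch sketch. The DenoisingNet architecture, all hyper-parameters, and the way the distance-space correction is pulled back to coordinate space are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

def pairwise_distances(coords, eps=1e-8):
    # (L, 3) -> (L, L) Cα-Cα distance matrix; eps keeps sqrt differentiable
    # on the (zero-distance) diagonal.
    diff = coords[:, None, :] - coords[None, :, :]
    return torch.sqrt((diff ** 2).sum(-1) + eps)

class DenoisingNet(nn.Module):
    # Placeholder network: maps a noisy (L, L) distance matrix to a
    # correction signal of the same shape. A real model would be far deeper
    # and conditioned on sequence features.
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, hidden, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, 1, 3, padding=1),
        )

    def forward(self, dist):
        return self.net(dist[None, None]).squeeze(0).squeeze(0)

def train_step(model, optimizer, native_coords, sigma=1.0):
    # Denoising training: corrupt the native distance matrix with Gaussian
    # noise and regress the correction that points back to the native matrix.
    d_native = pairwise_distances(native_coords).detach()
    d_noisy = d_native + sigma * torch.randn_like(d_native)
    target = d_native - d_noisy
    loss = ((model(d_noisy) - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def langevin_refine(model, length, steps=200, step_size=1e-3):
    # Langevin-dynamics sampling: start from random coordinates and
    # repeatedly nudge them along the network's correction signal plus
    # exploration noise. The sign convention here is an assumption.
    coords = torch.randn(length, 3, requires_grad=True)
    for _ in range(steps):
        d = pairwise_distances(coords)
        correction = model(d).detach()
        # Vector-Jacobian product: pull the distance-space correction back
        # to coordinate space through the differentiable distance map.
        grad, = torch.autograd.grad((d * correction).sum(), coords)
        with torch.no_grad():
            coords = coords + step_size * grad \
                + (2.0 * step_size) ** 0.5 * torch.randn_like(coords)
        coords.requires_grad_(True)
    return coords.detach()

A hypothetical usage: instantiate model = DenoisingNet() with opt = torch.optim.Adam(model.parameters(), lr=1e-4), run train_step over a set of native structures, then call langevin_refine(model, length=128) to sample decoys.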
        Related papers
- Quantum Algorithm for Protein Side-Chain Optimisation: Comparing Quantum to Classical Methods [0.0]
We develop a resource-efficient optimisation algorithm to compute the ground state energy of protein structures. We propose a quantum algorithm based on the Quantum Approximate Optimisation Algorithm to explore the conformational space and identify low-energy configurations.
arXiv (2025-07-25T15:37:04Z)
- AMix-1: A Pathway to Test-Time Scalable Protein Foundation Model [92.51919604882984]
We introduce AMix-1, a powerful protein foundation model built on Bayesian Flow Networks. AMix-1 is empowered by a systematic training methodology encompassing pretraining scaling laws, emergent-capability analysis, an in-context learning mechanism, and a test-time scaling algorithm. Building on this foundation, we devise a multiple sequence alignment (MSA)-based in-context learning strategy to unify protein design into a general framework.
arXiv (2025-07-11T17:02:25Z)
- MSNGO: multi-species protein function annotation based on 3D protein structure and network propagation [38.732449945780246]
We propose the MSNGO model, which integrates structural features and network propagation methods.
Our validation shows that using structural features can significantly improve the accuracy of multi-species protein function prediction.
arXiv (2025-03-29T08:35:45Z)
- Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations.
Deep generative models have shown promise in generating protein conformations as a more efficient alternative.
We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv (2024-10-24T03:38:51Z)
- Endowing Protein Language Models with Structural Knowledge [5.587293092389789]
We introduce a novel framework that enhances protein language models by integrating protein structural data.
The refined model, termed Protein Structure Transformer (PST), is further pretrained on a small protein structure database.
PST consistently outperforms the state-of-the-art foundation model for protein sequences, ESM-2, setting a new benchmark in protein function prediction.
arXiv (2024-01-26T12:47:54Z)
- Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv (2024-01-08T22:33:14Z)
- DeepGATGO: A Hierarchical Pretraining-Based Graph-Attention Model for Automatic Protein Function Prediction [4.608328575930055]
Automatic protein function prediction (AFP) is classified as a large-scale multi-label classification problem.
Currently, popular methods primarily combine protein-related information and Gene Ontology (GO) terms to generate final functional predictions.
We propose a sequence-based hierarchical prediction method, DeepGATGO, which processes protein sequences and GO term labels hierarchically.
arXiv (2023-07-24T07:01:32Z)
- Predicting protein variants with equivariant graph neural networks [0.0]
We compare the abilities of equivariant graph neural networks (EGNNs) and sequence-based approaches to identify promising amino-acid mutations.
Our proposed structural approach achieves performance competitive with sequence-based approaches while being trained on significantly fewer molecules.
arXiv (2023-06-21T12:44:52Z)
- Structure-informed Language Models Are Protein Designers [69.70134899296912]
We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs).
We conduct structural surgery on pLMs: a lightweight structural adapter is implanted into the pLM and endows it with structural awareness.
Experiments show that our approach outperforms the state-of-the-art methods by a large margin.
arXiv (2023-02-03T10:49:52Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv (2022-05-20T19:38:00Z)
- ODBO: Bayesian Optimization with Search Space Prescreening for Directed Protein Evolution [18.726398852721204]
We propose an efficient, experimental design-oriented closed-loop optimization framework for protein directed evolution.
ODBO employs a combination of a novel low-dimensional protein encoding strategy and Bayesian optimization enhanced with search-space prescreening via outlier detection.
We conduct and report four protein directed evolution experiments that substantiate the capability of the proposed framework for finding variants with properties of interest.
arXiv (2022-05-19T13:21:31Z)
- Transfer Learning for Protein Structure Classification at Low Resolution [124.5573289131546]
We show that it is possible to make accurate (≥ 80%) predictions of protein class and architecture from structures determined at low (≤ 3 Å) resolution.
We provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function.
arXiv (2020-08-11T15:01:32Z)
- Fast differentiable DNA and protein sequence optimization for molecular design [0.0]
Machine learning models that accurately predict biological fitness from sequence are becoming a powerful tool for molecular design.
Here, we build on a previously proposed straight-through approximation method to optimize through discrete sequence samples.
The resulting algorithm, which we call Fast SeqProp, achieves up to 100-fold faster convergence compared with previous versions.
arXiv (2020-05-22T17:03:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.