Lattice Protein Folding with Variational Annealing
- URL: http://arxiv.org/abs/2502.20632v1
- Date: Fri, 28 Feb 2025 01:30:15 GMT
- Title: Lattice Protein Folding with Variational Annealing
- Authors: Shoummo Ahsan Khandoker, Estelle M. Inack, Mohamed Hibat-Allah
- Abstract summary: We introduce a novel training scheme that employs masking to identify the lowest-energy folds in two-dimensional Hydrophobic-Polar (HP) lattice protein folding. Our findings emphasize the potential of advanced machine learning techniques in tackling complex protein folding problems.
- Score: 2.164205569823082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the principles of protein folding is a cornerstone of computational biology, with implications for drug design, bioengineering, and the understanding of fundamental biological processes. Lattice protein folding models offer a simplified yet powerful framework for studying the complexities of protein folding, enabling the exploration of energetically optimal folds under constrained conditions. However, finding these optimal folds is a computationally challenging combinatorial optimization problem. In this work, we introduce a novel upper-bound training scheme that employs masking to identify the lowest-energy folds in two-dimensional Hydrophobic-Polar (HP) lattice protein folding. By leveraging Dilated Recurrent Neural Networks (RNNs) integrated with an annealing process driven by temperature-like fluctuations, our method accurately predicts optimal folds for benchmark systems of up to 60 beads. Our approach also effectively masks invalid folds from being sampled without compromising the autoregressive sampling properties of RNNs. This scheme is generalizable to three spatial dimensions and can be extended to lattice protein models with larger alphabets. Our findings emphasize the potential of advanced machine learning techniques in tackling complex protein folding problems and a broader class of constrained combinatorial optimization challenges.
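To make the abstract's two central ideas concrete, here is a minimal Python sketch of (a) the standard 2D HP energy, which scores -1 for every pair of H beads that are lattice neighbours without being consecutive in the chain, and (b) sequential sampling in which moves onto occupied sites are masked out before a move is drawn. This is an illustration under stated assumptions, not the authors' implementation: the names `hp_energy`, `sample_masked_fold`, and `MOVES` are hypothetical, a uniform random choice stands in for the paper's dilated RNN policy, and the annealing schedule is omitted. The abstract does not say how a chain with no valid move left is handled, so the sketch simply returns `None` in that case.

```python
import random

# The four unit moves on a 2D square lattice (hypothetical naming).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def hp_energy(sequence, coords):
    """Standard HP energy: -1 for each pair of H beads that occupy
    neighbouring lattice sites but are not consecutive in the chain."""
    index_of = {pos: i for i, pos in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Check only the right and upper neighbour so each contact counts once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_of.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


def sample_masked_fold(sequence, rng):
    """Place beads one at a time, masking moves that would revisit a site.

    A uniform choice over the unmasked moves stands in for a learned
    autoregressive policy (an assumption of this sketch).
    Returns (moves, coords), or None if no valid move remains.
    """
    coords = [(0, 0)]
    occupied = {(0, 0)}
    moves = []
    for _ in range(len(sequence) - 1):
        x, y = coords[-1]
        allowed = [
            name
            for name, (dx, dy) in MOVES.items()
            if (x + dx, y + dy) not in occupied
        ]
        if not allowed:
            return None  # chain is trapped; handling is left open here
        name = rng.choice(allowed)
        dx, dy = MOVES[name]
        new_pos = (x + dx, y + dy)
        coords.append(new_pos)
        occupied.add(new_pos)
        moves.append(name)
    return moves, coords


if __name__ == "__main__":
    rng = random.Random(0)
    seq = "HPHPPH"
    result = sample_masked_fold(seq, rng)
    if result is not None:
        moves, coords = result
        print("".join(moves), hp_energy(seq, coords))
```

Because invalid moves are removed before sampling, every completed chain is a self-avoiding walk, so no energy penalty term for overlaps is needed in this sketch.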
Related papers
- A Novel P-bit-based Probabilistic Computing Approach for Solving the 3-D Protein Folding Problem [4.410469529030158]
This study marks the first work to apply probabilistic computing to tackle protein folding. We introduce a novel many-body interaction-based encoding method to map the problem onto an Ising model. Our simulations show that this approach significantly simplifies the energy landscape for short peptide sequences of six amino acids.
arXiv Detail & Related papers (2025-02-27T12:46:25Z)
- Protein Conformation Generation via Force-Guided SE(3) Diffusion Models [48.48934625235448]
Deep generative modeling techniques have been employed to generate novel protein conformations.
We propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation.
arXiv Detail & Related papers (2024-03-21T02:44:08Z)
- xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein [74.64101864289572]
We propose a unified protein language model, xTrimoPGLM, to address protein understanding and generation tasks simultaneously. xTrimoPGLM significantly outperforms other advanced baselines in 18 protein understanding benchmarks across four categories. It can also generate de novo protein sequences following the principles of natural ones, and can perform programmable generation after supervised fine-tuning.
arXiv Detail & Related papers (2024-01-11T15:03:17Z)
- An approach to solve the coarse-grained Protein folding problem in a Quantum Computer [0.0]
Understanding protein structures and enzymes plays a critical role in target-based drug design, elucidating protein-related disease mechanisms, and developing novel enzymes.
Recent advances in AI-based protein structure prediction have solved the protein folding problem to an extent, but their precision is limited for proteins with low sequence similarity.
In this work we developed a novel turn-based encoding algorithm that can be run on a gate-based quantum computer to predict the structure of smaller protein sequences.
arXiv Detail & Related papers (2023-11-23T18:20:05Z)
- Structure-informed Language Models Are Protein Designers [69.70134899296912]
We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs).
We conduct a structural surgery on pLMs, in which a lightweight structural adapter is implanted into the pLMs and endows them with structural awareness.
Experiments show that our approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-02-03T10:49:52Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- Peptide conformational sampling using the Quantum Approximate Optimization Algorithm [0.03499870393443267]
We numerically investigate the performance of a variational quantum algorithm in sampling low-energy conformations of short peptides.
Results cast serious doubt on the ability of QAOA to address the protein folding problem in the near term.
arXiv Detail & Related papers (2022-04-04T20:09:50Z)
- EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.