Efficient Protein Optimization via Structure-aware Hamiltonian Dynamics
- URL: http://arxiv.org/abs/2601.11012v1
- Date: Fri, 16 Jan 2026 05:53:53 GMT
- Title: Efficient Protein Optimization via Structure-aware Hamiltonian Dynamics
- Authors: Jiahao Wang, Shuangjia Zheng
- Abstract summary: HADES is a Bayesian optimization method utilizing Hamiltonian dynamics to efficiently sample from a structure-aware approximated posterior. A position discretization procedure is introduced to propose discrete protein sequences from such a continuous state system. Experiments demonstrate that our method outperforms state-of-the-art baselines in in-silico evaluations.
- Score: 16.336540408998598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to engineer optimized protein variants has transformative potential for biotechnology and medicine. Prior sequence-based optimization methods struggle with the high-dimensional complexities due to the epistasis effect and the disregard for structural constraints. To address this, we propose HADES, a Bayesian optimization method utilizing Hamiltonian dynamics to efficiently sample from a structure-aware approximated posterior. Leveraging momentum and uncertainty in the simulated physical movements, HADES enables rapid transition of proposals toward promising areas. A position discretization procedure is introduced to propose discrete protein sequences from such a continuous state system. The posterior surrogate is powered by a two-stage encoder-decoder framework to determine the structure and function relationships between mutant neighbors, consequently learning a smoothed landscape to sample from. Extensive experiments demonstrate that our method outperforms state-of-the-art baselines in in-silico evaluations across most metrics. Remarkably, our approach offers a unique advantage by leveraging the mutual constraints between protein structure and sequence, facilitating the design of protein sequences with similar structures and optimized properties. The code and data are publicly available at https://github.com/GENTEL-lab/HADES.
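The abstract describes sampling in a continuous state space with Hamiltonian dynamics and then discretizing positions into a protein sequence. As a rough illustration only, the sketch below runs plain Hamiltonian Monte Carlo (leapfrog integration plus a Metropolis accept step) on per-position logits and discretizes by argmax. The toy energy, the `preferred` sequence it favors, the prior strength `LAM`, and every function name are assumptions made for this example; they are not the HADES surrogate or its position discretization procedure, for which the authors' repository is the reference.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
LAM = 0.1  # assumed strength of a Gaussian prior on the logits


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def energy_and_grad(x, target):
    """Toy energy U(x) and gradient: negative log-probability of the
    target residue at each position plus a Gaussian prior (assumed)."""
    energy, grad = 0.0, []
    for row, t in zip(x, target):
        probs = softmax(row)
        energy += -math.log(probs[t]) + 0.5 * LAM * sum(v * v for v in row)
        grad.append([p - (1.0 if a == t else 0.0) + LAM * v
                     for a, (p, v) in enumerate(zip(probs, row))])
    return energy, grad


def kinetic(p):
    return 0.5 * sum(v * v for row in p for v in row)


def axpy(a, scale, b):
    """Return a + scale * b for two nested lists of equal shape."""
    return [[av + scale * bv for av, bv in zip(ar, br)]
            for ar, br in zip(a, b)]


def leapfrog(x, p, target, step, n_steps):
    """Leapfrog integration: half momentum step, full position step,
    half momentum step, repeated n_steps times."""
    _, g = energy_and_grad(x, target)
    for _ in range(n_steps):
        p = axpy(p, -0.5 * step, g)
        x = axpy(x, step, p)
        _, g = energy_and_grad(x, target)
        p = axpy(p, -0.5 * step, g)
    return x, p


def hmc_step(x, target, step=0.1, n_steps=10):
    """One HMC transition with a Metropolis accept/reject correction."""
    p0 = [[random.gauss(0.0, 1.0) for _ in row] for row in x]
    h0 = energy_and_grad(x, target)[0] + kinetic(p0)
    x1, p1 = leapfrog(x, p0, target, step, n_steps)
    h1 = energy_and_grad(x1, target)[0] + kinetic(p1)
    if random.random() < math.exp(min(0.0, h0 - h1)):
        return x1
    return x


def discretize(x):
    """Map continuous logits to a sequence by per-position argmax."""
    return "".join(ALPHABET[max(range(len(row)), key=row.__getitem__)]
                   for row in x)


def sample_sequence(preferred, n_iters=100):
    target = [ALPHABET.index(c) for c in preferred]
    x = [[0.0] * len(ALPHABET) for _ in target]
    for _ in range(n_iters):
        x = hmc_step(x, target)
    return discretize(x)


if __name__ == "__main__":
    random.seed(0)
    print(sample_sequence("MKTAY"))
```

The argmax step here is the simplest possible way to turn a continuous state into a discrete sequence; a real surrogate would replace `energy_and_grad` with a learned, structure-aware model.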
Related papers
- SaDiT: Efficient Protein Backbone Design via Latent Structural Tokenization and Diffusion Transformers [50.18388227899971]
We present SaDiT, a novel framework that accelerates protein backbone generation by integrating SaProt Tokenization with a Diffusion Transformer (DiT) architecture. Experiments demonstrate that SaDiT outperforms state-of-the-art models, including RFDiffusion and Proteina, in both computational speed and structural viability.
arXiv Detail & Related papers (2026-02-06T13:50:13Z)
- Leveraging Discrete Function Decomposability for Scientific Design [48.365465744654365]
In the era of AI-driven science and engineering, we often want to design objects in silico according to user-specified properties. For example, we may wish to design a protein to bind its target, arrange components within a circuit to minimize latency, or find materials with certain properties. We propose and demonstrate use of a new distributional optimization algorithm, De-Aware Distributional Optimization (DADO), that can leverage any decomposability defined by a junction tree on the design variables.
arXiv Detail & Related papers (2025-11-04T21:57:51Z)
- ProteinAE: Protein Diffusion Autoencoders for Structure Encoding [64.77182442408254]
We introduce ProteinAE, a novel and streamlined protein diffusion autoencoder. ProteinAE directly maps protein backbone coordinates from E(3) into a continuous, compact latent space. We demonstrate that ProteinAE achieves state-of-the-art reconstruction quality, outperforming existing autoencoders.
arXiv Detail & Related papers (2025-10-12T14:30:32Z)
- Quantum Algorithm for Protein Side-Chain Optimisation: Comparing Quantum to Classical Methods [0.0]
We develop a resource-efficient optimisation algorithm to compute the ground state energy of protein structures. We propose a quantum algorithm based on the Quantum Approximate Optimisation Algorithm to explore the conformational space and identify low-energy configurations.
arXiv Detail & Related papers (2025-07-25T15:37:04Z)
- An Efficient Quantum Approximate Optimization Algorithm with Fixed Linear Ramp Schedule for Truss Structure Optimization [0.0]
This study proposes a novel structural optimization framework based on quantum variational circuits. By defining design variables as multipliers, it provides greater flexibility in adjusting the cross-sectional area of the rod. As a result, the objective function is in a simple format, enabling efficient optimization using QAOA.
arXiv Detail & Related papers (2025-01-31T12:46:04Z)
- Elucidating Subspace Perturbation in Zeroth-Order Optimization: Theory and Practice at Scale [33.38543010618118]
Zeroth-order (ZO) optimization has emerged as a promising alternative to gradient-based backpropagation methods. We show that high dimensionality is the primary bottleneck and introduce the notion of subspace alignment to explain how the subspace perturbations reduce gradient noise and accelerate convergence. We propose an efficient ZO method using block coordinate descent (MeZO-BCD), which perturbs and updates only a subset of parameters at each step.
arXiv Detail & Related papers (2025-01-31T12:46:04Z)
- Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations. Deep generative models have shown promise in generating protein conformations as a more efficient alternative. We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z)
- Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
arXiv Detail & Related papers (2022-07-02T16:59:37Z)
- EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
- Fast differentiable DNA and protein sequence optimization for molecular design [0.0]
Machine learning models that accurately predict biological fitness from sequence are becoming a powerful tool for molecular design.
Here, we build on a previously proposed straight-through approximation method to optimize through discrete sequence samples.
The resulting algorithm, which we call Fast SeqProp, achieves up to 100-fold faster convergence compared to previous versions.
arXiv Detail & Related papers (2020-05-22T17:03:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.