Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model
- URL: http://arxiv.org/abs/2502.10173v1
- Date: Fri, 14 Feb 2025 14:07:54 GMT
- Title: Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model
- Authors: Bo Ni, Markus J. Buehler,
- Abstract summary: VibeGen is a generative AI framework that enables end-to-end de novo protein design conditioned on normal mode vibrations.
Our work integrates protein dynamics into generative protein design, and establishes a direct, bidirectional link between sequence and vibrational behavior.
- Score: 0.5678271181959529
- License:
- Abstract: Proteins are dynamic molecular machines whose biological functions, spanning enzymatic catalysis, signal transduction, and structural adaptation, are intrinsically linked to their motions. Designing proteins with targeted dynamic properties, however, remains a challenge due to the complex, degenerate relationships between sequence, structure, and molecular motion. Here, we introduce VibeGen, a generative AI framework that enables end-to-end de novo protein design conditioned on normal mode vibrations. VibeGen employs an agentic dual-model architecture, comprising a protein designer that generates sequence candidates based on specified vibrational modes and a protein predictor that evaluates their dynamic accuracy. This approach synergizes diversity, accuracy, and novelty during the design process. Via full-atom molecular simulations as direct validation, we demonstrate that the designed proteins accurately reproduce the prescribed normal mode amplitudes across the backbone while adopting various stable, functionally relevant structures. Notably, generated sequences are de novo, exhibiting no significant similarity to natural proteins, thereby expanding the accessible protein space beyond evolutionary constraints. Our work integrates protein dynamics into generative protein design, and establishes a direct, bidirectional link between sequence and vibrational behavior, unlocking new pathways for engineering biomolecules with tailored dynamical and functional properties. This framework holds broad implications for the rational design of flexible enzymes, dynamic scaffolds, and biomaterials, paving the way toward dynamics-informed AI-driven protein engineering.
Related papers
- ProteinWeaver: A Divide-and-Assembly Approach for Protein Backbone Design [61.19456204667385]
We introduce ProteinWeaver, a two-stage framework for protein backbone design.
ProteinWeaver generates high-quality, novel protein backbones through versatile domain assembly.
By introducing a divide-and-assembly' paradigm, ProteinWeaver advances protein engineering and opens new avenues for functional protein design.
arXiv Detail & Related papers (2024-11-08T08:10:49Z) - SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z) - ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics [3.6181745062052495]
We introduce a novel deep learning architecture, Protein Transformer with Scattering, Attention, and Positional Embedding (ProtSCAPE)
ProtSCAPE utilizes the multi-scale nature of the geometric scattering transform to extract features from protein structures conceptualized as graphs.
It integrates these features with dual attention structures that focus on residues and amino acid signals, generating latent representations of protein trajectories.
arXiv Detail & Related papers (2024-10-27T02:59:48Z) - ForceGen: End-to-end de novo protein generation based on nonlinear
mechanical unfolding responses using a protein language diffusion model [0.5678271181959529]
We report a generative model that predicts protein designs to meet complex nonlinear mechanical property-design objectives.
Our model leverages deep knowledge on protein sequences from a pre-trained protein language model and maps mechanical unfolding responses to create novel proteins.
Our model offers rapid pathways to explore the enormous mechanobiological protein sequence space unconstrained by biological synthesis.
arXiv Detail & Related papers (2023-10-16T17:31:34Z) - A Latent Diffusion Model for Protein Structure Generation [50.74232632854264]
We propose a latent diffusion model that can reduce the complexity of protein modeling.
We show that our method can effectively generate novel protein backbone structures with high designability and efficiency.
arXiv Detail & Related papers (2023-05-06T19:10:19Z) - Plug & Play Directed Evolution of Proteins with Gradient-based Discrete
MCMC [1.0499611180329804]
A long-standing goal of machine-learning-based protein engineering is to accelerate the discovery of novel mutations.
We introduce a sampling framework for evolving proteins in silico that supports mixing and matching a variety of unsupervised models.
By composing these models, we aim to improve our ability to evaluate unseen mutations and constrain search to regions of sequence space likely to contain functional proteins.
arXiv Detail & Related papers (2022-12-20T00:26:23Z) - Protein Sequence and Structure Co-Design with Equivariant Translation [19.816174223173494]
Existing approaches generate both protein sequence and structure using either autoregressive models or diffusion models.
We propose a new approach capable of protein sequence and structure co-design, which iteratively translates both protein sequence and structure into the desired state.
Our model consists of a trigonometry-aware encoder that reasons geometrical constraints and interactions from context features.
All protein amino acids are updated in one shot in each translation step, which significantly accelerates the inference process.
arXiv Detail & Related papers (2022-10-17T06:00:12Z) - State-specific protein-ligand complex structure prediction with a
multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z) - Learning Geometrically Disentangled Representations of Protein Folding
Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z) - BERTology Meets Biology: Interpreting Attention in Protein Language
Models [124.8966298974842]
We demonstrate methods for analyzing protein Transformer models through the lens of attention.
We show that attention captures the folding structure of proteins, connecting amino acids that are far apart in the underlying sequence, but spatially close in the three-dimensional structure.
We also present a three-dimensional visualization of the interaction between attention and protein structure.
arXiv Detail & Related papers (2020-06-26T21:50:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.