Docking-Aware Attention: Dynamic Protein Representations through Molecular Context Integration
- URL: http://arxiv.org/abs/2502.01461v1
- Date: Mon, 03 Feb 2025 15:52:38 GMT
- Authors: Amitay Sicherman, Kira Radinsky
- Abstract summary: We present Docking-Aware Attention (DAA), a novel architecture that generates dynamic, context-dependent protein representations.
We evaluate our method on enzymatic reaction prediction, where it outperforms previous state-of-the-art methods.
- Abstract: Computational prediction of enzymatic reactions represents a crucial challenge in sustainable chemical synthesis across various scientific domains, ranging from drug discovery to materials science and green chemistry. These syntheses rely on proteins that selectively catalyze complex molecular transformations. These protein catalysts exhibit remarkable substrate adaptability, with the same protein often catalyzing different chemical transformations depending on its molecular partners. Current approaches to protein representation in reaction prediction either ignore protein structure entirely or rely on static embeddings, failing to capture how proteins dynamically adapt their behavior to different substrates. We present Docking-Aware Attention (DAA), a novel architecture that generates dynamic, context-dependent protein representations by incorporating molecular docking information into the attention mechanism. DAA combines physical interaction scores from docking predictions with learned attention patterns to focus on protein regions most relevant to specific molecular interactions. We evaluate our method on enzymatic reaction prediction, where it outperforms previous state-of-the-art methods, achieving 62.2% accuracy versus 56.79% on complex molecules and 55.54% versus 49.45% on innovative reactions. Through detailed ablation studies and visualizations, we demonstrate how DAA generates interpretable attention patterns that adapt to different molecular contexts. Our approach represents a general framework for context-aware protein representation in biocatalysis prediction, with potential applications across enzymatic synthesis planning. We open-source our implementation and pre-trained models to facilitate further research.
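The core idea in the abstract, combining physical docking scores with learned attention to pool per-residue features into a context-dependent protein vector, can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the convex mixing coefficient `alpha`, the dot-product form of the learned attention, and the `docking_scores` input are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def docking_aware_attention(residue_emb, query, docking_scores, alpha=0.5):
    """Pool per-residue embeddings into one protein vector, biasing
    learned attention toward residues with high docking scores.

    residue_emb:    (L, d) per-residue embeddings (e.g. from a protein LM)
    query:          (d,)   embedding of the molecular context (substrate)
    docking_scores: (L,)   physical interaction scores from a docking tool
    alpha:          assumed mixing weight between the two attention sources
    """
    d = residue_emb.shape[1]
    learned = softmax(residue_emb @ query / np.sqrt(d))   # (L,) learned attention
    physical = softmax(docking_scores)                    # (L,) docking-derived attention
    weights = (1 - alpha) * learned + alpha * physical    # convex mix (an assumption)
    return weights @ residue_emb                          # (d,) context-dependent representation

# Same protein, two different molecular contexts -> two different representations.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
rep_a = docking_aware_attention(emb, rng.normal(size=16), rng.normal(size=8))
rep_b = docking_aware_attention(emb, rng.normal(size=16), rng.normal(size=8))
```

Because both the query and the docking scores change with the molecular partner, the pooled vector changes too, which is the "dynamic" property the abstract contrasts with static embeddings.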
Related papers
- Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model
VibeGen is a generative AI framework that enables end-to-end de novo protein design conditioned on normal mode vibrations.
Our work integrates protein dynamics into generative protein design, and establishes a direct, bidirectional link between sequence and vibrational behavior.
arXiv Detail & Related papers (2025-02-14T14:07:54Z)
- ReactEmbed: A Cross-Domain Framework for Protein-Molecule Representation Learning via Biochemical Reaction Networks
This work enhances representations by integrating biochemical reactions encompassing interactions between molecules and proteins.
We develop ReactEmbed, a novel method that creates a unified embedding space through contrastive learning.
We evaluate ReactEmbed across diverse tasks, including drug-target interaction, protein-protein interaction, protein property prediction, and molecular property prediction, consistently surpassing all current state-of-the-art models.
arXiv Detail & Related papers (2025-01-30T11:34:03Z)
- Computational Protein Science in the Era of Large Language Models (LLMs)
Computational protein science is dedicated to revealing knowledge and developing applications within the protein sequence-structure-function paradigm.
Recently, protein Language Models (pLMs) have emerged as a milestone in AI due to their unprecedented language processing and generalization capability.
arXiv Detail & Related papers (2025-01-17T16:21:18Z) - Learning Chemical Reaction Representation with Reactant-Product Alignment [50.28123475356234]
RAlign is a novel chemical reaction representation learning model for various organic reaction-related tasks.
By integrating atomic correspondence between reactants and products, our model discerns the molecular transformations that occur during the reaction.
We introduce a reaction-center-aware attention mechanism that enables the model to concentrate on key functional groups.
arXiv Detail & Related papers (2024-11-26T17:41:44Z) - Long-context Protein Language Model [76.95505296417866]
Self-supervised training of language models (LMs) has seen great success for protein sequences in learning meaningful representations and for generative drug design.
Most protein LMs are based on the Transformer architecture trained on individual proteins with short context lengths.
We propose LC-PLM based on an alternative protein LM architecture, BiMamba-S, built off selective structured state-space models.
We also introduce its graph-contextual variant, LC-PLM-G, which contextualizes protein-protein interaction graphs for a second stage of training.
arXiv Detail & Related papers (2024-10-29T16:43:28Z) - ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose ProLLM, a novel framework that is the first to employ an LLM tailored for PPI prediction.
arXiv Detail & Related papers (2024-03-30T05:32:42Z)
- Improved K-mer Based Prediction of Protein-Protein Interactions With Chaos Game Representation, Deep Learning and Reduced Representation Bias
We present a method for extracting unique pairs from an interaction dataset, generating non-redundant paired data for unbiased machine learning.
We develop a convolutional neural network model capable of learning and predicting interactions from Chaos Game Representations of proteins' coding genes.
arXiv Detail & Related papers (2023-10-23T10:02:23Z)
- Growing ecosystem of deep learning methods for modeling protein–protein interactions
We discuss the growing ecosystem of deep learning methods for modeling protein interactions.
Opportunities abound to discover novel interactions, modulate their physical mechanisms, and engineer binders to unravel their functions.
arXiv Detail & Related papers (2023-10-10T15:53:27Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- Encoding protein dynamic information in graph representation for functional residue identification
Recent advances in protein function prediction exploit graph-based deep learning approaches to correlate the structural and topological features of proteins with their molecular functions.
Here we apply normal mode analysis to native protein conformations and augment protein graphs by connecting edges between dynamically correlated residue pairs.
The proposed graph neural network, ProDAR, increases the interpretability and generalizability of residue-level annotations and robustly reflects structural nuance in proteins.
arXiv Detail & Related papers (2021-12-15T17:57:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.