PROflow: An iterative refinement model for PROTAC-induced structure prediction
- URL: http://arxiv.org/abs/2405.06654v1
- Date: Wed, 10 Apr 2024 05:29:35 GMT
- Title: PROflow: An iterative refinement model for PROTAC-induced structure prediction
- Authors: Bo Qiang, Wenxian Shi, Yuxuan Song, Menghua Wu,
- Abstract summary: Proteolysis targeting chimeras (PROTACs) are small molecules that trigger the breakdown of traditionally "undruggable" proteins by binding simultaneously to their targets and degradation-associated proteins.
A key challenge in their rational design is understanding their structural basis of activity.
Existing PROTAC docking methods have been forced to simplify the problem into a distance-constrained protein-protein docking task.
We develop a novel pseudo-data generation scheme that requires only binary protein-protein complexes.
This new dataset enables PROflow, an iterative refinement model for PROTAC-induced structure prediction that models the full PROTAC flexibility during constrained protein-protein docking.
- Score: 4.113597666007784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Proteolysis targeting chimeras (PROTACs) are small molecules that trigger the breakdown of traditionally "undruggable" proteins by binding simultaneously to their targets and degradation-associated proteins. A key challenge in their rational design is understanding their structural basis of activity. Due to the lack of crystal structures (18 in the PDB), existing PROTAC docking methods have been forced to simplify the problem into a distance-constrained protein-protein docking task. To address the data issue, we develop a novel pseudo-data generation scheme that requires only binary protein-protein complexes. This new dataset enables PROflow, an iterative refinement model for PROTAC-induced structure prediction that models the full PROTAC flexibility during constrained protein-protein docking. PROflow outperforms the state-of-the-art across docking metrics and runtime. Its inference speed enables the large-scale screening of PROTAC designs, and computed properties of predicted structures achieve statistically significant correlations with published degradation activities.
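To make the "distance-constrained protein-protein docking" simplification mentioned in the abstract concrete, the following is a minimal sketch of how candidate poses might be filtered by a linker-length bound. It is not PROflow's method or any published tool's API: the function names, the pose dictionary keys (`poi_anchor`, `e3_anchor`), and the `max_linker_length` value are all hypothetical, chosen only for illustration.

```python
import math

def satisfies_linker_constraint(poi_anchor, e3_anchor, max_linker_length):
    """Return True if the two ligand anchor points are close enough
    for a linker of at most `max_linker_length` (in angstroms) to span them.

    Assumption: each anchor is an (x, y, z) tuple; the straight-line
    distance is used as a crude stand-in for linker reach.
    """
    return math.dist(poi_anchor, e3_anchor) <= max_linker_length

def filter_poses(poses, max_linker_length):
    """Keep only the docked poses whose anchors satisfy the distance bound.

    Assumption: each pose is a dict with hypothetical keys
    'poi_anchor' and 'e3_anchor' holding anchor coordinates.
    """
    return [
        pose for pose in poses
        if satisfies_linker_constraint(
            pose["poi_anchor"], pose["e3_anchor"], max_linker_length
        )
    ]

# Illustrative input: two made-up poses, one with anchors 5 apart, one 20 apart.
example_poses = [
    {"name": "near", "poi_anchor": (0.0, 0.0, 0.0), "e3_anchor": (3.0, 4.0, 0.0)},
    {"name": "far", "poi_anchor": (0.0, 0.0, 0.0), "e3_anchor": (0.0, 0.0, 20.0)},
]
kept_poses = filter_poses(example_poses, max_linker_length=12.0)
```

A filter like this treats the linker as a single distance bound, which is the kind of simplification the abstract contrasts with modeling the full PROTAC flexibility.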
Related papers
- SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z) - A Comprehensive Review of Emerging Approaches in Machine Learning for De Novo PROTAC Design [1.534667887016089]
Targeted protein degradation (TPD) aims to regulate the intracellular levels of proteins by harnessing the cell's innate degradation pathways.
Proteolysis-targeting chimeras (PROTACs) are at the heart of TPD strategies.
Traditional methodologies for designing such complex molecules have limitations.
arXiv Detail & Related papers (2024-06-24T14:42:27Z) - Endowing Protein Language Models with Structural Knowledge [5.587293092389789]
We introduce a novel framework that enhances protein language models by integrating protein structural data.
The refined model, termed Protein Structure Transformer (PST), is further pretrained on a small protein structure database.
PST consistently outperforms the state-of-the-art foundation model for protein sequences, ESM-2, setting a new benchmark in protein function prediction.
arXiv Detail & Related papers (2024-01-26T12:47:54Z) - An approach to solve the coarse-grained Protein folding problem in a Quantum Computer [0.0]
Understanding protein structures and enzymes plays a critical role in target based drug designing, elucidating protein-related disease mechanisms, and innovating novel enzymes.
Recent advancements in AI based protein structure prediction methods have solved the protein folding problem to an extent, but their precision in determining the structure of the protein with low sequence similarity is limited.
In this work we developed a novel turn based encoding algorithm that can be run on a gate based quantum computer for predicting the structure of smaller protein sequences.
arXiv Detail & Related papers (2023-11-23T18:20:05Z) - Structure-informed Language Models Are Protein Designers [69.70134899296912]
We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs).
We conduct a structural surgery on pLMs, where a lightweight structural adapter is implanted into pLMs and endows it with structural awareness.
Experiments show that our approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-02-03T10:49:52Z) - De novo PROTAC design using graph-based deep generative models [2.566673015346446]
We show that a graph-based generative model can be used to propose PROTAC-like structures from empty graphs.
We steer the generative model towards compounds with higher likelihoods of predicted degradation activity.
After fine-tuning, predicted activity against a challenging POI increases from 50% to >80% with near-perfect chemical validity.
arXiv Detail & Related papers (2022-11-04T15:34:45Z) - State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z) - Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z) - EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z) - Transfer Learning for Protein Structure Classification at Low Resolution [124.5573289131546]
We show that it is possible to make accurate (≥80%) predictions of protein class and architecture from structures determined at low (≤3 Å) resolution.
We provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function.
arXiv Detail & Related papers (2020-08-11T15:01:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.