A Comprehensive Review of Emerging Approaches in Machine Learning for De Novo PROTAC Design
- URL: http://arxiv.org/abs/2406.16681v1
- Date: Mon, 24 Jun 2024 14:42:27 GMT
- Title: A Comprehensive Review of Emerging Approaches in Machine Learning for De Novo PROTAC Design
- Authors: Yossra Gharbi, RocĂo Mercado,
- Abstract summary: Targeted protein degradation (TPD) aims to regulate the intracellular levels of proteins by harnessing the cell's innate degradation pathways.
Proteolysis-targeting chimeras (PROTACs) are at the heart of TPD strategies.
Traditional methodologies for designing such complex molecules have limitations.
- Score: 1.534667887016089
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Targeted protein degradation (TPD) is a rapidly growing field in modern drug discovery that aims to regulate the intracellular levels of proteins by harnessing the cell's innate degradation pathways to selectively target and degrade disease-related proteins. This strategy creates new opportunities for therapeutic intervention in cases where occupancy-based inhibitors have not been successful. Proteolysis-targeting chimeras (PROTACs) are at the heart of TPD strategies, which leverage the ubiquitin-proteasome system for the selective targeting and proteasomal degradation of pathogenic proteins. As the field evolves, it becomes increasingly apparent that the traditional methodologies for designing such complex molecules have limitations. This has led to the use of machine learning (ML) and generative modeling to improve and accelerate the development process. In this review, we explore the impact of ML on de novo PROTAC design $-$ an aspect of molecular design that has not been comprehensively reviewed despite its significance. We delve into the distinct characteristics of PROTAC linker design, underscoring the complexities required to create effective bifunctional molecules capable of TPD. We then examine how ML in the context of fragment-based drug design (FBDD), honed in the realm of small-molecule drug discovery, is paving the way for PROTAC linker design. Our review provides a critical evaluation of the limitations inherent in applying this method to the complex field of PROTAC development. Moreover, we review existing ML works applied to PROTAC design, highlighting pioneering efforts and, importantly, the limitations these studies face. By offering insights into the current state of PROTAC development and the integral role of ML in PROTAC design, we aim to provide valuable perspectives for researchers in their pursuit of better design strategies for this new modality.
Related papers
- MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training [48.398329286769304]
Multiple Sequence Alignment (MSA) plays a pivotal role in unveiling the evolutionary trajectories of protein families.
MSAGPT is a novel approach to prompt protein structure predictions via MSA generative pretraining in the low MSA regime.
arXiv Detail & Related papers (2024-06-08T04:23:57Z) - Deep Lead Optimization: Leveraging Generative AI for Structural Modification [12.167178956742113]
This review delves into the basic concepts, goals, conventional CADD techniques, and recent advancements in AIDD.
We introduce a unified perspective based on constrained subgraph generation to harmonize the methodologies of de novo design and lead optimization.
arXiv Detail & Related papers (2024-04-30T03:17:42Z) - PROflow: An iterative refinement model for PROTAC-induced structure prediction [4.113597666007784]
Proteolysis targeting chimeras (PROTACs) are small molecules that trigger the breakdown of traditionally undrug'' proteins by binding simultaneously to their targets and degradation-associated proteins.
A key challenge in their rational design is understanding their structural basis of activity.
Existing PROTAC docking methods have been forced to simplify the problem into a distance-constrained protein-protein docking task.
We develop a novel pseudo-data generation scheme that requires only binary protein-protein complexes.
This new dataset enables PROflow, an iterative refinement model for PROTAC-induced structure prediction that models the full PROTAC flexibility during constrained
arXiv Detail & Related papers (2024-04-10T05:29:35Z) - A Hierarchical Training Paradigm for Antibody Structure-sequence
Co-design [54.30457372514873]
We propose a hierarchical training paradigm (HTP) for the antibody sequence-structure co-design.
HTP consists of four levels of training stages, each corresponding to a specific protein modality.
Empirical experiments show that HTP sets the new state-of-the-art performance in the co-design problem.
arXiv Detail & Related papers (2023-10-30T02:39:15Z) - Structure-informed Language Models Are Protein Designers [69.70134899296912]
We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs)
We conduct a structural surgery on pLMs, where a lightweight structural adapter is implanted into pLMs and endows it with structural awareness.
Experiments show that our approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-02-03T10:49:52Z) - De novo PROTAC design using graph-based deep generative models [2.566673015346446]
We show that a graph-based generative model can be used to propose PROTAC-like structures from empty graphs.
We steer the generative model towards compounds with higher likelihoods of predicted degradation activity.
After fine-tuning, predicted activity against a challenging POI increases from 50% to >80% with near-perfect chemical validity.
arXiv Detail & Related papers (2022-11-04T15:34:45Z) - Learning Geometrically Disentangled Representations of Protein Folding
Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z) - ODBO: Bayesian Optimization with Search Space Prescreening for Directed Protein Evolution [18.726398852721204]
We propose an efficient, experimental design-oriented closed-loop optimization framework for protein directed evolution.
ODBO employs a combination of novel low-dimensional protein encoding strategy and Bayesian optimization enhanced with search space prescreening via outlier detection.
We conduct and report four protein directed evolution experiments that substantiate the capability of the proposed framework for finding variants with properties of interest.
arXiv Detail & Related papers (2022-05-19T13:21:31Z) - Designing a Prospective COVID-19 Therapeutic with Reinforcement Learning [50.57291257437373]
SARS-CoV-2 pandemic has created a global race for a cure.
One approach focuses on designing a novel variant of the human angiotensin-converting enzyme 2 (ACE2)
We formulate a novel protein design framework as a reinforcement learning problem.
arXiv Detail & Related papers (2020-12-03T07:35:38Z) - Learning To Navigate The Synthetically Accessible Chemical Space Using
Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.