PepINVENT: Generative peptide design beyond the natural amino acids
- URL: http://arxiv.org/abs/2409.14040v1
- Date: Sat, 21 Sep 2024 06:53:03 GMT
- Title: PepINVENT: Generative peptide design beyond the natural amino acids
- Authors: Gökçe Geylan, Jon Paul Janet, Alessandro Tibo, Jiazhen He, Atanas Patronov, Mikhail Kabeshov, Florian David, Werngard Czechtizky, Ola Engkvist, Leonardo De Maria
- Abstract summary: PepINVENT navigates the vast space of natural and non-natural amino acids to propose valid, novel, and diverse peptide designs.
PepINVENT coupled with reinforcement learning enables the goal-oriented design of peptides using its chemistry-informed generative capabilities.
- Score: 34.04968462561752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Peptides play a crucial role in drug design and discovery, whether as a therapeutic modality or a delivery agent. Non-natural amino acids (NNAAs) have been used to enhance peptide properties ranging from binding affinity and plasma stability to permeability. Incorporating novel NNAAs facilitates the design of more effective peptides with improved properties. The generative models used in the field have focused on navigating the peptide sequence space, which is formed by combinations of a predefined set of amino acids. However, there is still a need for a tool that explores the peptide landscape beyond this enumerated space to unlock and effectively incorporate de novo design of new amino acids. To thoroughly explore the theoretical chemical space of peptides, we present PepINVENT, a novel generative AI-based tool that extends the small-molecule design platform REINVENT. PepINVENT navigates the vast space of natural and non-natural amino acids to propose valid, novel, and diverse peptide designs. The generative model can serve as a central tool for peptide-related tasks, as it was not trained on peptides with specific properties or topologies. The prior was trained to understand the granularity of peptides and to design amino acids for filling the masked positions within a peptide. PepINVENT coupled with reinforcement learning enables the goal-oriented design of peptides using its chemistry-informed generative capabilities. This study demonstrates PepINVENT's ability to explore the peptide space with unique and novel designs, and its capacity for property optimization in the context of therapeutically relevant peptides. Our tool can be employed for multi-parameter learning objectives, peptidomimetics, lead optimization, and a variety of other tasks within the peptide domain.
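The mask-filling task described in the abstract can be pictured with a small sketch. Everything below is an illustrative assumption rather than PepINVENT's actual interface: the `|` separator, the `?` mask token, and the `fill_masks` helper are hypothetical, and the proposed amino acid is supplied by hand instead of being sampled from a trained model. The sketch only shows what it means to fill masked positions in a peptide written as a sequence of amino-acid SMILES fragments.

```python
MASK = "?"  # hypothetical mask token, chosen for illustration
SEP = "|"   # hypothetical separator between amino-acid building blocks


def fill_masks(template: str, proposals: list[str]) -> str:
    """Replace each masked position in `template` with the next proposal, in order."""
    blocks = template.split(SEP)
    n_masks = blocks.count(MASK)
    if n_masks != len(proposals):
        raise ValueError(f"expected {n_masks} proposals, got {len(proposals)}")
    remaining = iter(proposals)
    return SEP.join(next(remaining) if b == MASK else b for b in blocks)


# Gly - [masked] - Ala, each residue written as a SMILES fragment.
template = "NCC(=O)|?|N[C@@H](C)C(=O)O"

# A generative model would propose the fragment; here serine is supplied by hand.
filled = fill_masks(template, ["N[C@@H](CO)C(=O)"])

# Dropping the separators yields a single SMILES string for the tripeptide.
peptide_smiles = filled.replace(SEP, "")
```

Because the fill-in is an arbitrary SMILES fragment rather than a letter from a fixed alphabet, the same template could accept a non-natural amino acid just as easily as a natural one.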
Related papers
- Peptide-GPT: Generative Design of Peptides using Generative Pre-trained Transformers and Bio-informatic Supervision [7.275932354889042]
We introduce a protein language model tailored to generate protein sequences with distinct properties.
We rank the generated sequences based on their perplexity scores, then we filter out those lying outside the permissible convex hull of proteins.
We achieved an accuracy of 76.26% in hemolytic, 72.46% in non-hemolytic, 78.84% in non-fouling, and 68.06% in solubility protein generation.
arXiv Detail & Related papers (2024-10-25T00:15:39Z)
- Exploring Latent Space for Generating Peptide Analogs Using Protein Language Models [1.5146068448101742]
The proposed method requires only a single sequence of interest, avoiding the need for large datasets.
Our results show significant improvements over baseline models in similarity indicators of peptide structures, descriptors and bioactivities.
arXiv Detail & Related papers (2024-08-15T13:37:27Z)
- Full-Atom Peptide Design based on Multi-modal Flow Matching [32.58558711545861]
We present PepFlow, the first multi-modal deep generative model grounded in the flow-matching framework for the design of full-atom peptides.
We characterize the peptide structure using rigid backbone frames within the $\mathrm{SE}(3)$ manifold and side-chain angles on high-dimensional tori.
Our approach adeptly tackles various tasks such as fixed-backbone sequence design and side-chain packing through partial sampling.
arXiv Detail & Related papers (2024-06-02T12:59:54Z)
- PPFlow: Target-aware Peptide Design with Torsional Flow Matching [52.567714059931646]
We propose a target-aware peptide design method called PPFlow to model the internal geometries of torsion angles for peptide structure design.
Besides, we establish a protein-peptide binding dataset named PPBench2024 to fill the void of massive data for the task of structure-based peptide drug design.
arXiv Detail & Related papers (2024-03-05T13:26:42Z)
- PepGB: Facilitating peptide drug discovery via graph neural networks [36.744839520938825]
We propose PepGB, a deep learning framework to facilitate early peptide drug discovery by predicting peptide-protein interactions (PepPIs).
We derive an extended version, diPepGB, to tackle the bottleneck of modeling highly imbalanced data prevalent in lead generation and optimization processes.
arXiv Detail & Related papers (2024-01-26T06:13:09Z)
- PepLand: a large-scale pre-trained peptide representation model for a comprehensive landscape of both canonical and non-canonical amino acids [0.4348327622270753]
PepLand is a novel pre-training architecture for representation and property analysis of peptides spanning both canonical and non-canonical amino acids.
In essence, PepLand leverages a comprehensive multi-view heterogeneous graph neural network tailored to unveil the subtle structural representations of peptides.
arXiv Detail & Related papers (2023-11-08T01:18:32Z)
- A Hierarchical Training Paradigm for Antibody Structure-sequence Co-design [54.30457372514873]
We propose a hierarchical training paradigm (HTP) for the antibody sequence-structure co-design.
HTP consists of four levels of training stages, each corresponding to a specific protein modality.
Empirical experiments show that HTP sets the new state-of-the-art performance in the co-design problem.
arXiv Detail & Related papers (2023-10-30T02:39:15Z)
- Efficient Prediction of Peptide Self-assembly through Sequential and Graphical Encoding [57.89530563948755]
This work provides a benchmark analysis of peptide encoding with advanced deep learning models.
It serves as a guide for a wide range of peptide-related predictions such as isoelectric points, hydration free energy, etc.
arXiv Detail & Related papers (2023-07-17T00:43:33Z)
- Diversifying Design of Nucleic Acid Aptamers Using Unsupervised Machine Learning [54.247560894146105]
Inverse design of short single-stranded RNA and DNA sequences (aptamers) is the task of finding sequences that satisfy a set of desired criteria.
We propose to use an unsupervised machine learning model known as the Potts model to discover new, useful sequences with controllable sequence diversity.
arXiv Detail & Related papers (2022-08-10T13:30:58Z)
- Using Genetic Programming to Predict and Optimize Protein Function [65.25258357832584]
We propose POET, a computational Genetic Programming tool based on evolutionary methods to enhance screening and mutagenesis in Directed Evolution.
As a proof-of-concept we use peptides that generate MRI contrast detected by the Chemical Exchange Saturation Transfer mechanism.
Our results indicate that a computational modelling tool like POET can help to find peptides with 400% better functionality than used before.
arXiv Detail & Related papers (2022-02-08T18:08:08Z)