Target specific peptide design using latent space approximate trajectory
collector
- URL: http://arxiv.org/abs/2302.01435v1
- Date: Thu, 2 Feb 2023 21:56:52 GMT
- Title: Target specific peptide design using latent space approximate trajectory
collector
- Authors: Tong Lin, Sijie Chen, Ruchira Basu, Dehu Pei, Xiaolin Cheng and Levent
Burak Kara
- Abstract summary: We propose a novel machine learning based peptide design architecture, called Latent Space Approximate Trajectory Collector (LSATC).
It consists of a series of samplers that approximates the distributions of peptides with desired properties in a latent space.
We demonstrate the model by the design of peptide extensions targeting Beta-catenin, a key nuclear effector protein involved in canonical Wnt signalling.
All four peptides extended by LSATC show improved Beta-catenin binding by at least $20.0\%$, and two of the peptides show a $3$ fold increase in binding affinity as compared to the base peptide.
- Score: 1.7289819674602296
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite the prevalence and many successes of deep learning applications in de
novo molecular design, the problem of peptide generation targeting specific
proteins remains unsolved. A main barrier for this is the scarcity of
high-quality training data. To tackle the issue, we propose a novel machine
learning based peptide design architecture, called Latent Space Approximate
Trajectory Collector (LSATC). It consists of a series of samplers on an
optimization trajectory on a highly non-convex energy landscape that
approximates the distributions of peptides with desired properties in a latent
space. The process involves little human intervention and can be implemented in
an end-to-end manner. We demonstrate the model by the design of peptide
extensions targeting Beta-catenin, a key nuclear effector protein involved in
canonical Wnt signalling. When compared with a random sampler, LSATC can sample
peptides with $36\%$ lower binding scores in a $16$ times smaller interquartile
range (IQR) and $284\%$ less hydrophobicity with a $1.4$ times smaller IQR.
LSATC also largely outperforms other common generative models. Finally, we
utilized a clustering algorithm to select 4 peptides from the 100 LSATC
designed peptides for experimental validation. The result confirms that all the
four peptides extended by LSATC show improved Beta-catenin binding by at least
$20.0\%$, and two of the peptides show a $3$ fold increase in binding affinity
as compared to the base peptide.
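The comparison against a random sampler is reported as a relative change in score together with a ratio of interquartile ranges. The following is a minimal sketch of how such a summary could be computed. It assumes the percentages refer to medians, which the abstract does not state. The function names and the synthetic scores are hypothetical and are not taken from the paper.

```python
import numpy as np

def median_and_iqr(scores):
    """Return (median, interquartile range) of a 1-D array of scores."""
    q1, med, q3 = np.percentile(scores, [25, 50, 75])
    return med, q3 - q1

def compare_samplers(baseline, candidate):
    """Summarize how a candidate sampler's scores compare with a baseline's.

    Returns the relative change in the median (negative means lower) and
    the factor by which the baseline IQR exceeds the candidate IQR.
    """
    base_med, base_iqr = median_and_iqr(baseline)
    cand_med, cand_iqr = median_and_iqr(candidate)
    rel_change = (cand_med - base_med) / abs(base_med)
    iqr_ratio = base_iqr / cand_iqr
    return rel_change, iqr_ratio

# Synthetic placeholder scores, NOT data from the paper.
rng = np.random.default_rng(0)
random_scores = rng.normal(loc=-10.0, scale=4.0, size=1000)
designed_scores = rng.normal(loc=-13.0, scale=0.5, size=1000)

rel_change, iqr_ratio = compare_samplers(random_scores, designed_scores)
print(f"median change: {rel_change:+.1%}, IQR ratio: {iqr_ratio:.1f}x")
```

A negative relative change together with an IQR ratio above 1 corresponds to the pattern described in the abstract: lower scores with a tighter spread than the random baseline.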
Related papers
- PPFlow: Target-aware Peptide Design with Torsional Flow Matching [52.567714059931646]
We propose a target-aware peptide design method called PPFlow to model the internal geometries of torsion angles for the peptide structure design.
Besides, we establish a protein-peptide binding dataset named PPBench2024 to fill the void of massive data for the task of structure-based peptide drug design.
arXiv Detail & Related papers (2024-03-05T13:26:42Z)
- Transformer-based de novo peptide sequencing for data-independent acquisition mass spectrometry [1.338778493151964]
We introduce DiaTrans, a deep-learning model based on transformer architecture.
It deciphers peptide sequences from DIA mass spectrometry data.
Our results show significant improvements over existing SOTA methods.
arXiv Detail & Related papers (2024-02-17T19:04:23Z) - ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide
Sequencing [70.12220342151113]
ContraNovo is a pioneering algorithm that leverages contrastive learning to extract the relationship between spectra and peptides.
ContraNovo consistently outshines contemporary state-of-the-art solutions.
arXiv Detail & Related papers (2023-12-18T12:49:46Z) - ProFSA: Self-supervised Pocket Pretraining via Protein
Fragment-Surroundings Alignment [20.012210194899605]
We propose a novel pocket pretraining approach that leverages knowledge from high-resolution atomic protein structures.
Our method, named ProFSA, achieves state-of-the-art performance across various tasks, including pocket druggability prediction.
Our work opens up a new avenue for mitigating the scarcity of protein-ligand complex data through the utilization of high-quality and diverse protein structure databases.
arXiv Detail & Related papers (2023-10-11T06:36:23Z) - FABind: Fast and Accurate Protein-Ligand Binding [127.7790493202716]
FABind is an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding.
Our proposed model demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods.
arXiv Detail & Related papers (2023-10-10T16:39:47Z) - TacoGFN: Target-conditioned GFlowNet for Structure-based Drug Design [3.45184803671951]
TacoGFN is a novel GFlowNet-based approach for structure-based drug design.
It can generate molecules conditioned on any protein pocket structure with probabilities proportional to its affinity and property rewards.
In the generative setting for the CrossDocked 2020 benchmark, TacoGFN attains a state-of-the-art success rate of $56.0\%$ and $-8.44$ kcal/mol in median Vina Dock score.
arXiv Detail & Related papers (2023-10-05T00:45:04Z) - Efficient Prediction of Peptide Self-assembly through Sequential and
Graphical Encoding [57.89530563948755]
This work provides a benchmark analysis of peptide encoding with advanced deep learning models.
It serves as a guide for a wide range of peptide-related predictions such as isoelectric points, hydration free energy, etc.
arXiv Detail & Related papers (2023-07-17T00:43:33Z) - Protein Design with Guided Discrete Diffusion [67.06148688398677]
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling.
We propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models.
NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods.
arXiv Detail & Related papers (2023-05-31T16:31:24Z) - Diversifying Design of Nucleic Acid Aptamers Using Unsupervised Machine
Learning [54.247560894146105]
Inverse design of short single-stranded RNA and DNA sequences (aptamers) is the task of finding sequences that satisfy a set of desired criteria.
We propose to use an unsupervised machine learning model known as the Potts model to discover new, useful sequences with controllable sequence diversity.
arXiv Detail & Related papers (2022-08-10T13:30:58Z) - DePS: An improved deep learning model for de novo peptide sequencing [7.468176246958974]
In this study, we proposed an enhanced model, DePS, which can improve the accuracy of de novo peptide sequencing.
For the same test set of DeepNovoV2, the DePS model achieved excellent results of 74.22%, 74.21% and 41.68% for amino acid recall, amino acid precision and peptide recall respectively.
arXiv Detail & Related papers (2022-03-16T16:45:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.