PepGB: Facilitating peptide drug discovery via graph neural networks
- URL: http://arxiv.org/abs/2401.14665v1
- Date: Fri, 26 Jan 2024 06:13:09 GMT
- Title: PepGB: Facilitating peptide drug discovery via graph neural networks
- Authors: Yipin Lei, Xu Wang, Meng Fang, Han Li, Xiang Li, Jianyang Zeng
- Abstract summary: We propose PepGB, a deep learning framework to facilitate peptide early drug discovery by predicting peptide-protein interactions (PepPIs).
We derive an extended version, diPepGB, to tackle the bottleneck of modeling highly imbalanced data prevalent in lead generation and optimization processes.
- Score: 36.744839520938825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Peptides offer great biomedical potential and serve as promising drug
candidates. Currently, the majority of approved peptide drugs are directly
derived from well-explored natural human peptides. It is quite necessary to
utilize advanced deep learning techniques to identify novel peptide drugs in
the vast, unexplored biochemical space. Despite various in silico methods
having been developed to accelerate peptide early drug discovery, existing
models face challenges of overfitting and lacking generalizability due to the
limited size, imbalanced distribution and inconsistent quality of experimental
data. In this study, we propose PepGB, a deep learning framework to facilitate
peptide early drug discovery by predicting peptide-protein interactions
(PepPIs). Employing graph neural networks, PepGB incorporates a fine-grained
perturbation module and a dual-view objective with contrastive learning-based
peptide pre-trained representation to predict PepPIs. Through rigorous
evaluations, we demonstrated that PepGB greatly outperforms baselines and can
accurately identify PepPIs for novel targets and peptide hits, thereby
contributing to the target identification and hit discovery processes. Next, we
derive an extended version, diPepGB, to tackle the bottleneck of modeling
highly imbalanced data prevalent in lead generation and optimization processes.
Utilizing directed edges to represent relative binding strength between two
peptide nodes, diPepGB achieves superior performance in real-world assays. In
summary, our proposed frameworks can serve as potent tools to facilitate
peptide early drug discovery.
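The abstract states that diPepGB uses directed edges to represent relative binding strength between two peptide nodes. A minimal sketch of that idea follows. It is not the authors' implementation: the function name `build_directed_edges`, the `margin` parameter, the edge direction (stronger binder to weaker binder), and the peptide scores are all assumptions made for illustration.

```python
from itertools import combinations


def build_directed_edges(affinity, margin=0.0):
    """Turn per-peptide binding scores into directed pairwise edges.

    Assumption: a higher score means stronger binding, and an edge
    (u, v) means "u binds more strongly than v". Pairs whose scores
    differ by no more than `margin` get no edge.
    """
    edges = []
    for a, b in combinations(sorted(affinity), 2):
        diff = affinity[a] - affinity[b]
        if diff > margin:
            edges.append((a, b))
        elif diff < -margin:
            edges.append((b, a))
    return edges


# Hypothetical peptides and binding scores, for illustration only.
scores = {"pepA": 7.2, "pepB": 5.1, "pepC": 7.1}
edges = build_directed_edges(scores, margin=0.5)
print(edges)
```

In this toy input, pepA and pepC differ by less than the margin, so no edge connects them, while both receive an edge pointing to the weaker pepB.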
Related papers
- E(3)-invariant diffusion model for pocket-aware peptide generation [1.9950682531209156]
We propose a new method of computer-assisted inhibitor discovery: de novo pocket-aware peptide structure and sequence generation network.
Our results demonstrate that our method achieves comparable performance to state-of-the-art models.
arXiv Detail & Related papers (2024-10-27T19:59:09Z)
- Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties [5.812284760539713]
Multi-Peptide is an innovative approach that combines transformer-based language models with Graph Neural Networks (GNNs) to predict peptide properties.
Evaluations on hemolysis and nonfouling datasets demonstrate Multi-Peptide's robustness, achieving state-of-the-art 86.185% accuracy in hemolysis prediction.
This study highlights the potential of multimodal learning in bioinformatics, paving the way for accurate and reliable predictions in peptide-based research and applications.
arXiv Detail & Related papers (2024-07-02T20:13:47Z)
- NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics [58.03989832372747]
We present the first unified benchmark, NovoBench, for de novo peptide sequencing.
It comprises diverse mass spectrum data, integrated models, and comprehensive evaluation metrics.
Recent methods, including DeepNovo, PointNovo, Casanovo, InstaNovo, AdaNovo, and π-HelixNovo, are integrated into our framework.
arXiv Detail & Related papers (2024-06-16T08:23:21Z)
- PPFlow: Target-aware Peptide Design with Torsional Flow Matching [52.567714059931646]
We propose a target-aware peptide design method called PPFlow to model the internal geometries of torsion angles for peptide structure design.
Besides, we establish a protein-peptide binding dataset named PPBench2024 to fill the void of massive data for the task of structure-based peptide drug design.
arXiv Detail & Related papers (2024-03-05T13:26:42Z)
- PepHarmony: A Multi-View Contrastive Learning Framework for Integrated Sequence and Structure-Based Peptide Encoding [21.126660909515607]
This study introduces a novel multi-view contrastive learning framework PepHarmony for the sequence-based peptide encoding task.
We carefully select datasets from the Protein Data Bank (PDB) and AlphaFold database to encompass a broad spectrum of peptide sequences and structures.
The experimental data highlight PepHarmony's exceptional capability in capturing the intricate relationship between peptide sequences and structures.
arXiv Detail & Related papers (2024-01-21T01:16:53Z)
- ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide Sequencing [70.12220342151113]
ContraNovo is a pioneering algorithm that leverages contrastive learning to extract the relationship between spectra and peptides.
ContraNovo consistently outshines contemporary state-of-the-art solutions.
arXiv Detail & Related papers (2023-12-18T12:49:46Z)
- PepLand: a large-scale pre-trained peptide representation model for a comprehensive landscape of both canonical and non-canonical amino acids [0.4348327622270753]
PepLand is a novel pre-training architecture for representation and property analysis of peptides spanning both canonical and non-canonical amino acids.
In essence, PepLand leverages a comprehensive multi-view heterogeneous graph neural network tailored to unveil the subtle structural representations of peptides.
arXiv Detail & Related papers (2023-11-08T01:18:32Z)
- Efficient Prediction of Peptide Self-assembly through Sequential and Graphical Encoding [57.89530563948755]
This work provides a benchmark analysis of peptide encoding with advanced deep learning models.
It serves as a guide for a wide range of peptide-related predictions, such as isoelectric point and hydration free energy.
arXiv Detail & Related papers (2023-07-17T00:43:33Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) prediction is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.