Computational design of target-specific linear peptide binders with TransformerBeta
- URL: http://arxiv.org/abs/2410.16302v1
- Date: Mon, 07 Oct 2024 08:52:54 GMT
- Title: Computational design of target-specific linear peptide binders with TransformerBeta
- Authors: Haowen Zhao, Francesco A. Aprile, Barbara Bravi
- Abstract summary: We built an unprecedentedly large-scale library of peptide pairs within stable secondary structures (beta sheets). We then developed a machine learning method based on the Transformer architecture for the design of specific linear binders.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The computational prediction and design of peptide binders targeting specific linear epitopes is crucial in biological and biomedical research, yet it remains challenging due to their highly dynamic nature and the scarcity of experimentally solved binding data. To address this problem, we built an unprecedentedly large-scale library of peptide pairs within stable secondary structures (beta sheets), leveraging newly available AlphaFold predicted structures. We then developed a machine learning method based on the Transformer architecture for the design of specific linear binders, in analogy to a language translation task. Our method, TransformerBeta, accurately predicts specific beta strand interactions and samples sequences with beta sheet-like molecular properties, while capturing interpretable physico-chemical interaction patterns. As such, it can propose specific candidate binders targeting linear epitopes for experimental validation to inform protein design.
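The abstract frames binder design as analogous to language translation: a target strand plays the role of the source sentence and its binding partner the role of the translated sentence. The following is a minimal sketch of how such a pair could be encoded for a sequence-to-sequence model. It is not the authors' implementation: the vocabulary layout, the special tokens (`PAD`, `BOS`, `EOS`), the helper names, and the two example sequences are all assumptions made purely for illustration.

```python
# Illustrative only: token ids, special tokens, and the example peptides
# are assumptions, not taken from the TransformerBeta paper or code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD, BOS, EOS = 0, 1, 2               # hypothetical special tokens
TOKEN_ID = {aa: i + 3 for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Map a peptide string to a list of integer token ids."""
    return [TOKEN_ID[aa] for aa in seq]


def make_example(target, binder):
    """Build one translation-style training example.

    The target strand is the encoder input. The binder strand is wrapped
    in BOS/EOS and shifted by one position, so the decoder learns to
    predict each next residue from the previous ones.
    """
    tgt = [BOS] + encode(binder) + [EOS]
    return {
        "src": encode(target),
        "decoder_input": tgt[:-1],
        "labels": tgt[1:],
    }


# Made-up strands of equal length, used only to show the data layout.
example = make_example("VKVTFE", "TYRVEV")
```

With this layout, `decoder_input` starts with `BOS` and `labels` ends with `EOS`, the usual teacher-forcing arrangement for an encoder-decoder Transformer.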
Related papers
- Morphology-Specific Peptide Discovery via Masked Conditional Generative Modeling [0.0]
PepMorph is an end-to-end peptide discovery pipeline. It generates sequences that not only tend to aggregate but also self-assemble into a specified fibrillar or spherical morphology.
arXiv Detail & Related papers (2025-09-02T07:58:12Z) - Generation of structure-guided pMHC-I libraries using Diffusion Models [0.0]
We introduce a structure-guided benchmark of pMHC-I peptides designed using diffusion models conditioned on crystal distances. This benchmark is independent of previously characterized peptides yet reproduces canonical anchor residue preferences. We demonstrate that state-of-the-art sequence-based predictors perform poorly at recognizing the binding potential of these structurally stable designs.
arXiv Detail & Related papers (2025-07-11T08:29:18Z) - evoBPE: Evolutionary Protein Sequence Tokenization [3.4196611972116786]
Current subword tokenization techniques, primarily developed for natural language processing, often fail to represent protein sequences' complex structural and functional properties adequately.
This study introduces evoBPE, a novel tokenization approach that integrates evolutionary mutation patterns into sequence segmentation.
evoBPE opens new possibilities for machine learning applications in protein function prediction, structural modeling, and evolutionary analysis.
arXiv Detail & Related papers (2025-03-11T19:19:48Z) - Life-Code: Central Dogma Modeling with Multi-Omics Sequence Unification [53.488387420073536]
Life-Code is a comprehensive framework that spans different biological functions.
Life-Code achieves state-of-the-art performance on various tasks across three omics.
arXiv Detail & Related papers (2025-02-11T06:53:59Z) - Exploring Latent Space for Generating Peptide Analogs Using Protein Language Models [1.5146068448101742]
The proposed method requires only a single sequence of interest, avoiding the need for large datasets.
Our results show significant improvements over baseline models in similarity indicators of peptide structures, descriptors and bioactivities.
arXiv Detail & Related papers (2024-08-15T13:37:27Z) - Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties [5.812284760539713]
Multi-Peptide is an innovative approach that combines transformer-based language models with Graph Neural Networks (GNNs) to predict peptide properties.
Evaluations on hemolysis and nonfouling datasets demonstrate Multi-Peptide's robustness, achieving state-of-the-art 86.185% accuracy in hemolysis prediction.
This study highlights the potential of multimodal learning in bioinformatics, paving the way for accurate and reliable predictions in peptide-based research and applications.
arXiv Detail & Related papers (2024-07-02T20:13:47Z) - Protein binding affinity prediction under multiple substitutions applying eGNNs on Residue and Atomic graphs combined with Language model information: eGRAL [1.840390797252648]
Deep learning is increasingly recognized as a powerful tool capable of bridging the gap between in-silico predictions and in-vitro observations.
We propose eGRAL, a novel graph neural network architecture designed for predicting binding affinity changes from amino acid substitutions in protein complexes.
eGRAL leverages residue, atomic and evolutionary scales, thanks to features extracted from protein large language models.
arXiv Detail & Related papers (2024-05-03T10:33:19Z) - PPFlow: Target-aware Peptide Design with Torsional Flow Matching [52.567714059931646]
We propose a target-aware peptide design method called PPFlow to model the internal geometries of torsion angles for the peptide structure design.
Besides, we establish a protein-peptide binding dataset named PPBench2024 to fill the void of massive data for the task of structure-based peptide drug design.
arXiv Detail & Related papers (2024-03-05T13:26:42Z) - Efficient Prediction of Peptide Self-assembly through Sequential and Graphical Encoding [57.89530563948755]
This work provides a benchmark analysis of peptide encoding with advanced deep learning models.
It serves as a guide for a wide range of peptide-related predictions such as isoelectric points, hydration free energy, etc.
arXiv Detail & Related papers (2023-07-17T00:43:33Z) - Multi-task Bioassay Pre-training for Protein-ligand Binding Affinity Prediction [26.530876904939163]
We propose Multi-task Bioassay Pre-training (MBP), a pre-training framework for structure-based PLBA prediction.
MBP learns robust and transferrable structural knowledge from our new ChEMBL-Dock dataset with varied and noisy labels.
arXiv Detail & Related papers (2023-06-08T02:29:49Z) - State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z) - Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z) - A deep learning driven pseudospectral PCE based FFT homogenization algorithm for complex microstructures [68.8204255655161]
It is shown that the proposed method is able to predict central moments of interest while being orders of magnitude faster to evaluate than traditional approaches.
arXiv Detail & Related papers (2021-10-26T07:02:14Z) - EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.