DPST: De Novo Peptide Sequencing with Amino-Acid-Aware Transformers
- URL: http://arxiv.org/abs/2203.13132v1
- Date: Wed, 23 Mar 2022 08:01:06 GMT
- Title: DPST: De Novo Peptide Sequencing with Amino-Acid-Aware Transformers
- Authors: Yan Yang and Zakir Hossain and Khandaker Asif and Liyuan Pan and
Shafin Rahman and Eric Stone
- Abstract summary: De novo peptide sequencing aims to recover amino acid sequences of a peptide from tandem mass spectrometry (MS) data.
Existing approaches for de novo analysis enumerate MS evidence for all amino acid classes during inference.
Our approach, DPST, circumvents these limitations with two key components.
- Score: 11.527280359634524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: De novo peptide sequencing aims to recover amino acid sequences of a peptide
from tandem mass spectrometry (MS) data. Existing approaches for de novo
analysis enumerate MS evidence for all amino acid classes during inference. This
leads to over-trimming of the receptive fields of MS data and restricts the MS
evidence associated with subsequent, still-undecoded amino acids. Our approach, DPST,
circumvents these limitations with two key components: (1) A confidence value
aggregation encoder to sketch spectrum representations according to
amino-acid-based connectivity among MS; (2) A global-local fusion decoder to
progressively assimilate contextualized spectrum representations with a
predefined preconception of localized MS evidence and amino acid priors. Our
components originate from a closed-form solution and selectively attend to
informative amino-acid-aware MS representations. Through extensive empirical
studies, we demonstrate the superiority of DPST, showing that it outperforms
state-of-the-art approaches by a margin of 12% - 19% peptide accuracy.
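The abstract reports its gains in peptide accuracy but does not define the metric. As a minimal sketch, assuming peptide accuracy means the fraction of spectra whose predicted sequence matches the reference exactly, it could be computed as below. The function name `peptide_accuracy` and the toy sequences are hypothetical and not taken from the paper, which may use a different matching rule (for example, a mass-tolerant comparison).

```python
from typing import Sequence


def peptide_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of spectra whose predicted peptide equals the reference exactly.

    Assumption: exact string match; the paper's own definition is not given
    in the abstract and may differ.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    # A peptide counts as correct only if every residue matches.
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Hypothetical toy data: three spectra, two of them decoded correctly.
reference = ["PEPTIDE", "ACDEFGK", "LLNVSR"]
predicted = ["PEPTIDE", "ACDEFGR", "LLNVSR"]
print(f"{peptide_accuracy(predicted, reference):.3f}")
```

Under this definition a single wrong residue makes the whole peptide count as incorrect, which is why peptide-level accuracy is a stricter measure than per-residue accuracy.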
Related papers
- MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction [65.33218256339151]
Post-translational modifications (PTMs) profoundly expand the complexity and functionality of the proteome.
Existing computational approaches predominantly focus on protein sequences to predict PTM sites, driven by the recognition of sequence-dependent motifs.
We introduce the MeToken model, which tokenizes the micro-environment of each amino acid, integrating both sequence and structural information into unified discrete tokens.
arXiv Detail & Related papers (2024-11-04T07:14:28Z) - NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics [58.03989832372747]
We present the first unified benchmark, NovoBench, for de novo peptide sequencing.
It comprises diverse mass spectrum data, integrated models, and comprehensive evaluation metrics.
Recent methods, including DeepNovo, PointNovo, Casanovo, InstaNovo, AdaNovo, and π-HelixNovo, are integrated into our framework.
arXiv Detail & Related papers (2024-06-16T08:23:21Z) - AdaNovo: Adaptive De Novo Peptide Sequencing with Conditional Mutual Information [46.23980841020632]
We propose AdaNovo, a novel framework that calculates conditional mutual information (CMI) between the spectrum and each amino acid/peptide.
AdaNovo excels in identifying amino acids with post-translational modifications (PTMs) and exhibits robustness against data noise.
arXiv Detail & Related papers (2024-03-09T11:54:58Z) - Transformer-based de novo peptide sequencing for data-independent acquisition mass spectrometry [1.338778493151964]
We introduce DiaTrans, a deep-learning model based on transformer architecture.
It deciphers peptide sequences from DIA mass spectrometry data.
Our results show significant improvements over existing SOTA methods.
arXiv Detail & Related papers (2024-02-17T19:04:23Z) - ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide
Sequencing [70.12220342151113]
ContraNovo is a pioneering algorithm that leverages contrastive learning to extract the relationship between spectra and peptides.
ContraNovo consistently outshines contemporary state-of-the-art solutions.
arXiv Detail & Related papers (2023-12-18T12:49:46Z) - MATE-Pred: Multimodal Attention-based TCR-Epitope interaction Predictor [1.933856957193398]
An accurate binding prediction between T-cell receptors and epitopes contributes decisively to successful immunotherapy strategies.
Here, we propose a highly reliable novel method, MATE-Pred, that performs attention-based prediction of T-cell receptor and epitope binding affinity.
The performance of MATE-Pred projects its potential application in drug discovery.
arXiv Detail & Related papers (2023-12-05T11:30:00Z) - Efficient Prediction of Peptide Self-assembly through Sequential and
Graphical Encoding [57.89530563948755]
This work provides a benchmark analysis of peptide encoding with advanced deep learning models.
It serves as a guide for a wide range of peptide-related predictions such as isoelectric points, hydration free energy, etc.
arXiv Detail & Related papers (2023-07-17T00:43:33Z) - Ranking-based Convolutional Neural Network Models for Peptide-MHC
Binding Prediction [15.932922003001034]
Identifying peptides that can bind to MHC class-I molecules plays a vital role in the design of peptide vaccines.
We develop two allele-specific CNN-based methods named ConvM and SpConvM to tackle the binding prediction problem.
arXiv Detail & Related papers (2020-12-04T20:40:36Z) - Confidence-guided Lesion Mask-based Simultaneous Synthesis of Anatomic
and Molecular MR Images in Patients with Post-treatment Malignant Gliomas [65.64363834322333]
Confidence Guided SAMR (CG-SAMR) synthesizes data from lesion information to multi-modal anatomic sequences.
A module guides the synthesis based on a confidence measure about the intermediate results.
Experiments on real clinical data demonstrate that the proposed model can perform better than the state-of-the-art synthesis methods.
arXiv Detail & Related papers (2020-08-06T20:20:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.