HMD-AMP: Protein Language-Powered Hierarchical Multi-label Deep Forest
for Annotating Antimicrobial Peptides
- URL: http://arxiv.org/abs/2111.06023v1
- Date: Thu, 11 Nov 2021 02:10:07 GMT
- Title: HMD-AMP: Protein Language-Powered Hierarchical Multi-label Deep Forest
for Annotating Antimicrobial Peptides
- Authors: Qinze Yu, Zhihang Dong, Xingyu Fan, Licheng Zong and Yu Li
- Abstract summary: We build a diverse and comprehensive multi-label protein sequence database by collecting and cleaning amino acid sequences from various AMP databases.
We develop an end-to-end hierarchical multi-label deep forest framework, HMD-AMP, to annotate AMP comprehensively.
After identifying an AMP, it further predicts what targets the AMP can effectively kill from eleven available classes.
- Score: 5.61222966894307
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identifying the targets of an antimicrobial peptide is a fundamental step in
studying the innate immune response and combating antibiotic resistance, and
more broadly, precision medicine and public health. There have been extensive
studies on the statistical and computational approaches to identify (i) whether
a peptide is an antimicrobial peptide (AMP) or a non-AMP and (ii) which targets
these sequences are effective against (Gram-positive, Gram-negative, etc.). Despite
the existing deep learning methods on this problem, most of them are unable to
handle the small AMP classes (anti-insect, anti-parasite, etc.). And more
importantly, some AMPs can have multiple targets, which the previous methods
fail to consider. In this study, we build a diverse and comprehensive
multi-label protein sequence database by collecting and cleaning amino acid
sequences from various AMP databases. To generate efficient representations and features
for the small classes dataset, we take advantage of a protein language model
trained on 250 million protein sequences. Based on that, we develop an
end-to-end hierarchical multi-label deep forest framework, HMD-AMP, to annotate
AMP comprehensively. After identifying an AMP, it further predicts what targets
the AMP can effectively kill from eleven available classes. Extensive
experiments suggest that our framework outperforms state-of-the-art models in
both the binary classification task and the multi-label classification task,
especially on the minor classes. The model is robust against reduced features
and small perturbations and produces promising results. We believe HMD-AMP
both contributes to future wet-lab investigations of the innate structural
properties of different antimicrobial peptides and builds promising empirical
underpinnings for precision medicine with antibiotics.
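The two-stage flow described in the abstract (first decide AMP vs. non-AMP, then assign any subset of the eleven target classes) can be sketched as below. This is only an illustration of the hierarchical multi-label control flow, not the authors' implementation: it does not implement a deep forest or a protein language model. The function `annotate`, the placeholder label names in `TARGET_LABELS`, the 0.5 threshold, and the toy scoring functions are all assumptions introduced for this example.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical label names: the abstract states there are eleven target
# classes but names only a few, so placeholders are used here.
TARGET_LABELS: List[str] = [f"target_{i}" for i in range(11)]

Embedding = Sequence[float]


def annotate(
    embedding: Embedding,
    amp_score: Callable[[Embedding], float],
    target_scores: Callable[[Embedding], Sequence[float]],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> Dict[str, object]:
    """Two-stage annotation: AMP vs. non-AMP first, then multi-label targets."""
    # Stage 1: binary decision on the sequence embedding.
    if amp_score(embedding) < threshold:
        # Rejected as non-AMP, so stage 2 is skipped entirely.
        return {"is_amp": False, "targets": []}

    # Stage 2: each label is thresholded independently, so zero, one,
    # or several targets may be returned for the same peptide.
    scores = target_scores(embedding)
    targets = [name for name, s in zip(TARGET_LABELS, scores) if s >= threshold]
    return {"is_amp": True, "targets": targets}


# Toy stand-ins for trained models (assumptions for demonstration only).
def toy_amp_score(e: Embedding) -> float:
    return sum(e) / len(e)


def toy_target_scores(e: Embedding) -> List[float]:
    return [e[i % len(e)] for i in range(len(TARGET_LABELS))]


if __name__ == "__main__":
    print(annotate([0.9, 0.2, 0.8], toy_amp_score, toy_target_scores))
    print(annotate([0.1, 0.2, 0.3], toy_amp_score, toy_target_scores))
```

In a real pipeline the two callables would be replaced by trained classifiers operating on embeddings from a pretrained protein language model; keeping them as parameters makes the hierarchy itself easy to test in isolation.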
Related papers
- SGAC: A Graph Neural Network Framework for Imbalanced and Structure-Aware AMP Classification [7.044114650607729]
Classifying antimicrobial peptides (AMPs) from the vast array of peptides mined from metagenomic sequencing data is a significant approach to addressing the issue of antibiotic resistance.
Current AMP classification methods, primarily relying on sequence-based data, neglect the spatial structure of peptides, thereby limiting the accurate classification of AMPs.
arXiv Detail & Related papers (2024-12-20T17:17:57Z)
- HMAMP: Hypervolume-Driven Multi-Objective Antimicrobial Peptides Design [11.891046340221735]
This paper introduces a paradigm shift by considering multiple attributes in antimicrobial peptide (AMP) design.
By synergizing reinforcement learning and a descent algorithm rooted in the hypervolume of AMP concept, HMAMP effectively expands exploration space and mitigates the issue of pattern collapse.
A detailed analysis of the helical structures and molecular dynamics simulations for ten potential candidate AMPs validates the superiority of HMAMP in the realm of multi-objective AMP design.
arXiv Detail & Related papers (2024-05-01T07:17:59Z)
- AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides [4.826446796830595]
This study introduces AMPCliff, a quantitative definition and benchmarking framework for the activity cliff (AC) phenomenon in antimicrobial peptides (AMPs) composed of canonical amino acids.
AMPCliff quantifies the activities of AMPs by the MIC, and defines 0.9 as the minimum threshold for the normalized BLOSUM62 similarity score between a pair of aligned peptides with at least two-fold MIC changes.
Our analysis reveals that these models are capable of detecting AMP AC events and the pre-trained protein language model ESM2 demonstrates superior performance across the evaluations.
arXiv Detail & Related papers (2024-04-15T12:40:12Z)
- MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction
Prediction via Microenvironment-Aware Protein Embedding [82.31506767274841]
Protein-Protein Interactions (PPIs) are fundamental in various biological processes and play a key role in life activities.
MAPE-PPI encodes microenvironments into chemically meaningful discrete codes via a sufficiently large microenvironment "vocabulary".
MAPE-PPI can scale to PPI prediction with millions of PPIs with superior trade-offs between effectiveness and computational efficiency.
arXiv Detail & Related papers (2024-02-22T09:04:41Z)
- Accelerating Antimicrobial Peptide Discovery with Latent Structure [33.288514128470425]
We propose a latent sequence-structure model for designing AMPs (LSSAMP).
LSSAMP exploits multi-scale vector quantization in the latent space to represent secondary structures.
Experimental results show that the peptides generated by LSSAMP have a high probability of antimicrobial activity.
arXiv Detail & Related papers (2022-11-28T06:43:32Z)
- Reprogramming Pretrained Language Models for Antibody Sequence Infilling [72.13295049594585]
Computational design of antibodies involves generating novel and diverse sequences, while maintaining structural consistency.
Recent deep learning models have shown impressive results; however, the limited number of known antibody sequence/structure pairs frequently leads to degraded performance.
In our work we address this challenge by leveraging Model Reprogramming (MR), which repurposes pretrained models on a source language to adapt to the tasks that are in a different language and have scarce data.
arXiv Detail & Related papers (2022-10-05T20:44:55Z)
- Graph-Based Active Machine Learning Method for Diverse and Novel
Antimicrobial Peptides Generation and Selection [57.131117785001194]
Large-scale screening of new AMP candidates is expensive, time-consuming, and not affordable in developing countries.
We propose a novel active machine learning-based framework that statistically minimizes the number of wet-lab experiments needed to design new AMPs.
arXiv Detail & Related papers (2022-09-18T14:30:48Z)
- Biological Sequence Design with GFlowNets [75.1642973538266]
Design of de novo biological sequences with desired properties often involves an active loop with several rounds of molecule ideation and expensive wet-lab evaluations.
This makes the diversity of proposed candidates a key consideration in the ideation phase.
We propose an active learning algorithm leveraging uncertainty estimation and the recently proposed GFlowNets as a generator of diverse candidate solutions.
arXiv Detail & Related papers (2022-03-02T15:53:38Z)
- A k-mer Based Approach for SARS-CoV-2 Variant Identification [55.78588835407174]
We show that preserving the order of the amino acids helps the underlying classifiers to achieve better performance.
We also show the importance of the different amino acids that play a key role in identifying variants and how they coincide with those reported by the USA's Centers for Disease Control and Prevention (CDC).
arXiv Detail & Related papers (2021-08-07T15:08:15Z)
- Accelerating Antimicrobial Discovery with Controllable Deep Generative
Models and Molecular Dynamics [109.70543391923344]
CLaSS (Controlled Latent attribute Space Sampling) is an efficient computational method for attribute-controlled generation of molecules.
We screen the generated molecules for additional key attributes by using deep learning classifiers in conjunction with novel features derived from atomistic simulations.
The proposed approach is demonstrated for designing non-toxic antimicrobial peptides (AMPs) with strong broad-spectrum potency.
arXiv Detail & Related papers (2020-05-22T15:57:58Z)