Accurate Virus Identification with Interpretable Raman Signatures by
Machine Learning
- URL: http://arxiv.org/abs/2206.02788v1
- Date: Sun, 5 Jun 2022 22:31:14 GMT
- Title: Accurate Virus Identification with Interpretable Raman Signatures by
Machine Learning
- Authors: Jiarong Ye, Yin-Ting Yeh, Yuan Xue, Ziyang Wang, Na Zhang, He Liu,
Kunyan Zhang, RyeAnne Ricker, Zhuohang Yu, Allison Roder, Nestor Perea Lopez,
Lindsey Organtini, Wallace Greene, Susan Hafenstein, Huaguang Lu, Elodie
Ghedin, Mauricio Terrones, Shengxi Huang, Sharon Xiaolei Huang
- Abstract summary: We present a machine learning approach for analyzing Raman spectra of human and avian viruses.
A Convolutional Neural Network (CNN) classifier specifically designed for spectral data achieves very high accuracy for a variety of virus type or subtype identification tasks.
- Score: 12.184128048998906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Rapid identification of newly emerging or circulating viruses is an important
first step toward managing the public health response to potential outbreaks. A
portable virus capture device coupled with label-free Raman Spectroscopy holds
the promise of fast detection by rapidly obtaining the Raman signature of a
virus followed by a machine learning approach applied to recognize the virus
based on its Raman spectrum, which is used as a fingerprint. We present such a
machine learning approach for analyzing Raman spectra of human and avian
viruses. A Convolutional Neural Network (CNN) classifier specifically designed
for spectral data achieves very high accuracy for a variety of virus type or
subtype identification tasks. In particular, it achieves 99% accuracy for
classifying influenza virus type A vs. type B, 96% accuracy for classifying
four subtypes of influenza A, 95% accuracy for differentiating enveloped and
non-enveloped viruses, and 99% accuracy for differentiating avian coronavirus
(infectious bronchitis virus, IBV) from other avian viruses. Furthermore,
interpretation of neural net responses in the trained CNN model using a
full-gradient algorithm highlights Raman spectral ranges that are most
important to virus identification. By correlating ML-selected salient Raman
ranges with the signature ranges of known biomolecules and chemical functional
groups (for example, amide, amino acid, carboxylic acid), we verify that our ML
model effectively recognizes the Raman signatures of proteins, lipids and other
vital functional groups present in different viruses and uses a weighted
combination of these signatures to identify viruses.
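The final step described in the abstract, correlating ML-selected salient Raman ranges with the signature ranges of known biomolecules, can be sketched as an interval-overlap check. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the `min_fraction` threshold, the example salient ranges, and the reference band limits (approximate, commonly cited positions for the amide I and amide III bands) are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]  # (low, high) Raman shift in cm^-1

# Assumed, approximate reference bands for illustration only;
# these limits are not taken from the paper.
REFERENCE_BANDS: Dict[str, Range] = {
    "amide I": (1600.0, 1700.0),
    "amide III": (1230.0, 1300.0),
}


def overlap(a: Range, b: Range) -> float:
    """Length of the intersection of two intervals (0.0 if they are disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def match_salient_ranges(
    salient: List[Range],
    bands: Dict[str, Range] = REFERENCE_BANDS,
    min_fraction: float = 0.5,
) -> Dict[Range, List[str]]:
    """Map each salient range to the bands covering at least
    `min_fraction` of its width."""
    matches: Dict[Range, List[str]] = {}
    for rng in salient:
        width = rng[1] - rng[0]
        matches[rng] = [
            name
            for name, band in bands.items()
            if width > 0 and overlap(rng, band) / width >= min_fraction
        ]
    return matches


# Hypothetical salient ranges, e.g. obtained by thresholding a saliency map.
salient_ranges = [(1640.0, 1680.0), (1250.0, 1320.0), (900.0, 950.0)]
result = match_salient_ranges(salient_ranges)
for rng, names in result.items():
    print(rng, "->", names or "no assumed band matched")
```

A fractional-coverage threshold is used here so that a wide salient range that only grazes a band is not counted as a match; any real analysis would need a curated band table and a threshold justified by the spectral resolution.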
Related papers
- MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z)
- Heterogeneous virus classification using a functional deep learning model based on transmission electron microscopy images (Preprint) [2.1346640951813165]
The analysis of Transmission Electron Microscopy (TEM) images has been proven to be quite successful in instant virus identification.
This article proposes a deep learning-based classification model to identify the type of virus within those images correctly.
Experimental results show that it can differentiate among the 14 types of viruses present in the dataset with a maximum of 97.44% classification accuracy and F1-score.
arXiv Detail & Related papers (2024-05-24T13:52:14Z)
- Virus2Vec: Viral Sequence Classification Using Machine Learning [48.40285316053593]
We propose Virus2Vec, a feature-vector representation for viral sequences that enables machine learning models to identify viral hosts.
We empirically evaluate Virus2Vec on real-world spike sequences of Coronaviridae and rabies virus sequence data to predict the host.
Our results demonstrate that Virus2Vec outperforms the predictive accuracies of baseline and state-of-the-art methods.
arXiv Detail & Related papers (2023-04-24T08:17:16Z)
- PCD2Vec: A Poisson Correction Distance-Based Approach for Viral Host Classification [0.966840768820136]
Coronaviruses are membrane-enveloped, non-segmented positive-strand RNA viruses belonging to the Coronaviridae family.
In the coronavirus genome, an essential structural region is the spike region, which is responsible for attaching the virus to the host cell membrane.
We propose a novel method for predicting the host specificity of coronaviruses by analyzing spike protein sequences from different viral subgenera and species.
arXiv Detail & Related papers (2023-04-13T03:02:22Z)
- Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification [60.49594822215981]
This paper presents a classification model for detecting COVID-19 vaccination related search queries.
We propose a novel approach of considering dense features as memory tokens that the model can attend to.
We show that this new modeling approach enables a significant improvement to the Vaccine Search Insights (VSI) task.
arXiv Detail & Related papers (2022-12-16T13:57:41Z)
- Dive into Machine Learning Algorithms for Influenza Virus Host Prediction with Hemagglutinin Sequences [4.289396744209968]
Influenza viruses mutate rapidly and can pose a threat to public health, especially to those in vulnerable groups.
Recently, there has been increasing interest in using machine learning algorithms to provide fast and accurate predictions for viral sequences.
In this study, real testing data sets and a variety of evaluation metrics were used to evaluate machine learning algorithms at different taxonomic levels.
arXiv Detail & Related papers (2022-07-28T00:54:54Z)
- Benchmarking Machine Learning Robustness in Covid-19 Genome Sequence Classification [109.81283748940696]
We introduce several ways to perturb SARS-CoV-2 genome sequences to mimic the error profiles of common sequencing platforms such as Illumina and PacBio.
We show that, for specific embedding methods, some simulation-based approaches are more robust (and accurate) than others against certain adversarial attacks on the input sequences.
arXiv Detail & Related papers (2022-07-18T19:16:56Z)
- Multi-channel neural networks for predicting influenza A virus hosts and antigenic types [3.1981440103815717]
A fast, accurate and low-cost method to predict the origin host and subtype of influenza viruses could help reduce virus transmission and benefit resource-poor areas.
We propose multi-channel neural networks to predict antigenic types and hosts of influenza A viruses with complete and partial protein sequences.
arXiv Detail & Related papers (2022-06-08T11:47:31Z)
- Classification of Influenza Hemagglutinin Protein Sequences using Convolutional Neural Networks [8.397189036839956]
This paper focuses on accurately predicting if an Influenza type A virus can infect specific hosts, and more specifically, Human, Avian and Swine hosts, using only the protein sequence of the HA gene.
We propose encoding the protein sequences into numerical signals using the Hydrophobicity Index and subsequently utilising a Convolutional Neural Network-based predictive model.
As the results show, the proposed model can distinguish HA protein sequences with high accuracy whenever the virus under investigation can infect Human, Avian or Swine hosts.
arXiv Detail & Related papers (2021-08-09T10:42:26Z)
- A k-mer Based Approach for SARS-CoV-2 Variant Identification [55.78588835407174]
We show that preserving the order of the amino acids helps the underlying classifiers to achieve better performance.
We also show the importance of the different amino acids that play a key role in identifying variants and how they coincide with those reported by the USA's Centers for Disease Control and Prevention (CDC).
arXiv Detail & Related papers (2021-08-07T15:08:15Z)
- Viral Pneumonia Screening on Chest X-ray Images Using Confidence-Aware Anomaly Detection [86.81773672627406]
Clusters of viral pneumonia during a short period of time may be a harbinger of an outbreak or pandemic, like SARS, MERS, and recent COVID-19.
Rapid and accurate detection of viral pneumonia using chest X-ray can be significantly useful in large-scale screening and epidemic prevention.
Viral pneumonia often has diverse causes and exhibits notably different visual appearances on X-ray images.
arXiv Detail & Related papers (2020-03-27T11:32:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.