Predicting Influenza A Viral Host Using PSSM and Word Embeddings
- URL: http://arxiv.org/abs/2201.01140v4
- Date: Sat, 18 Nov 2023 17:20:23 GMT
- Title: Predicting Influenza A Viral Host Using PSSM and Word Embeddings
- Authors: Yanhua Xu, Dominik Wojtczak
- Abstract summary: We use various machine learning models with features derived from the position-specific scoring matrix (PSSM) to infer the origin host of viruses.
The results show that the PSSM-based model reaches an MCC of around 95% and an F1 of around 96%.
- Score: 5.067354030054702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid mutation of the influenza virus threatens public health.
Reassortment among viruses with different hosts can lead to a fatal pandemic.
However, it is difficult to detect the original host of the virus during or
after an outbreak as influenza viruses can circulate between different species.
Therefore, early and rapid detection of the viral host would help reduce the
further spread of the virus. We use various machine learning models with
features derived from the position-specific scoring matrix (PSSM) and features
learned from word embedding and word encoding to infer the origin host of
viruses. The results show that the PSSM-based model reaches an MCC of around
95% and an F1 of around 96%. The model with word embedding reaches an MCC of
around 96% and an F1 of around 97%.
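The two metrics quoted in the abstract can be illustrated with a minimal sketch. It covers the binary case only, which is an assumption made for brevity; how the paper aggregates scores over multiple host classes is not stated here. The function names and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    with 0.0 returned when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not results from the paper).
    tp, tn, fp, fn = 90, 85, 10, 15
    print(f"F1  = {f1_score(tp, fp, fn):.3f}")
    print(f"MCC = {mcc(tp, tn, fp, fn):.3f}")
```

MCC uses all four cells of the confusion matrix and ranges from -1 to 1, whereas F1 ignores true negatives and ranges from 0 to 1, so the two are commonly reported together.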
Related papers
- Opponent Shaping for Antibody Development [49.26728828005039]
Anti-viral therapies are typically designed to target only the current strains of a virus.
Therapy-induced selective pressures act on viruses to drive the emergence of mutated strains, against which initial therapies have reduced efficacy.
We build on a computational model of binding between antibodies and viral antigens to implement a genetic simulation of viral evolutionary escape.
arXiv Detail & Related papers (2024-09-16T14:56:27Z)
- Virus2Vec: Viral Sequence Classification Using Machine Learning [48.40285316053593]
We propose Virus2Vec, a feature-vector representation for viral sequences that enables machine learning models to identify viral hosts.
We empirically evaluate Virus2Vec on real-world spike sequences of Coronaviridae and rabies virus sequence data to predict the host.
Our results demonstrate that Virus2Vec outperforms the predictive accuracies of baseline and state-of-the-art methods.
arXiv Detail & Related papers (2023-04-24T08:17:16Z)
- Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification [60.49594822215981]
This paper presents a classification model for detecting COVID-19 vaccination related search queries.
We propose a novel approach of considering dense features as memory tokens that the model can attend to.
We show that this new modeling approach enables a significant improvement to the Vaccine Search Insights (VSI) task.
arXiv Detail & Related papers (2022-12-16T13:57:41Z)
- Efficient Cavity Searching for Gene Network of Influenza A Virus [8.690486131601075]
High order structures (cavities and cliques) of the gene network of influenza A virus reveal tight associations among viruses during evolution.
We propose a model named HyperSearch based on deep learning to search cavities in a computable complex network for influenza virus genetics.
arXiv Detail & Related papers (2022-11-05T16:24:55Z)
- Dive into Machine Learning Algorithms for Influenza Virus Host Prediction with Hemagglutinin Sequences [4.289396744209968]
Influenza viruses mutate rapidly and can pose a threat to public health, especially to those in vulnerable groups.
Recently, there has been increasing interest in using machine learning algorithms to provide fast and accurate predictions for viral sequences.
In this study, real testing data sets and a variety of evaluation metrics were used to evaluate machine learning algorithms at different taxonomic levels.
arXiv Detail & Related papers (2022-07-28T00:54:54Z)
- Accurate Virus Identification with Interpretable Raman Signatures by Machine Learning [12.184128048998906]
We present a machine learning approach for analyzing Raman spectra of human and avian viruses.
A Convolutional Neural Network (CNN) classifier specifically designed for spectral data achieves very high accuracy for a variety of virus type or subtype identification tasks.
arXiv Detail & Related papers (2022-06-05T22:31:14Z)
- Anti-virus Autobots: Predicting More Infectious Virus Variants for Pandemic Prevention through Deep Learning [0.0]
More infectious virus variants can arise from rapid mutations in their proteins.
These variants can evade one's immune system and infect vaccinated individuals, lowering vaccine efficacy.
This project proposes Optimus PPIme - a deep learning approach to predict future, more infectious variants from an existing virus.
arXiv Detail & Related papers (2022-05-30T05:04:40Z)
- PWM2Vec: An Efficient Embedding Approach for Viral Host Specification from Coronavirus Spike Sequences [0.7340017786387767]
We study the different hosts which can be potential carriers and transmitters of deadly viruses to humans.
In coronaviruses, the surface (S) protein, or spike protein, is an important part of determining host specificity.
We propose a feature embedding based on the well-known position-weight matrix (PWM), which we call PWM2Vec, and use it to generate feature vectors from the spike protein sequences of coronaviruses.
arXiv Detail & Related papers (2022-01-06T23:25:54Z)
- A k-mer Based Approach for SARS-CoV-2 Variant Identification [55.78588835407174]
We show that preserving the order of the amino acids helps the underlying classifiers to achieve better performance.
We also show the importance of the different amino acids that play a key role in identifying variants and how they coincide with those reported by the USA's Centers for Disease Control and Prevention (CDC).
arXiv Detail & Related papers (2021-08-07T15:08:15Z)
- Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment [90.12602012910465]
We train on Italy's early COVID-19 outbreak through Twitter and transfer to several other countries.
Our experiments show strong results with up to 0.85 Spearman correlation in cross-country predictions.
arXiv Detail & Related papers (2020-06-05T02:04:25Z)
- Viral Pneumonia Screening on Chest X-ray Images Using Confidence-Aware Anomaly Detection [86.81773672627406]
Clusters of viral pneumonia during a short period of time may be a harbinger of an outbreak or pandemic, like SARS, MERS, and recent COVID-19.
Rapid and accurate detection of viral pneumonia using chest X-ray can be significantly useful in large-scale screening and epidemic prevention.
Viral pneumonia often has diverse causes and exhibits notably different visual appearances on X-ray images.
arXiv Detail & Related papers (2020-03-27T11:32:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.