BiLSTM-VHP: BiLSTM-Powered Network for Viral Host Prediction
- URL: http://arxiv.org/abs/2509.11345v1
- Date: Sun, 14 Sep 2025 16:42:11 GMT
- Title: BiLSTM-VHP: BiLSTM-Powered Network for Viral Host Prediction
- Authors: Azher Ahmed Efat, Farzana Islam, Annajiat Alim Rasel, Munima Haque
- Abstract summary: Recent outbreaks of SARS-CoV-2, Monkeypox and swine flu viruses have shown how these viruses can disrupt human life and cause death. Fast and accurate predictions of the host from which the virus spreads can help prevent these diseases from spreading. This work presents BiLSTM-VHP, a lightweight bidirectional long short-term memory (LSTM)-based architecture that can predict the host from the nucleotide sequence of orthohantavirus, rabies lyssavirus, and rotavirus A.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recorded history shows the long coexistence of humans and animals, suggesting it began much earlier. Despite some beneficial interdependence, many animals carry viral diseases that can spread to humans. These diseases are known as zoonotic diseases. Recent outbreaks of SARS-CoV-2, Monkeypox and swine flu viruses have shown how these viruses can disrupt human life and cause death. Fast and accurate predictions of the host from which the virus spreads can help prevent these diseases from spreading. This work presents BiLSTM-VHP, a lightweight bidirectional long short-term memory (LSTM)-based architecture that can predict the host from the nucleotide sequence of orthohantavirus, rabies lyssavirus, and rotavirus A with high accuracy. The proposed model works with nucleotide sequences of 400 bases in length and achieved a prediction accuracy of 89.62% for orthohantavirus, 96.58% for rotavirus A, and 77.22% for rabies lyssavirus, outperforming previous studies. Moreover, the performance of the model is assessed using the confusion matrix, F1 score, precision, recall, and micro-average AUC. In addition, we introduce three curated datasets of orthohantavirus, rotavirus A, and rabies lyssavirus containing 8,575, 95,197, and 22,052 nucleotide sequences divided into 9, 12, and 29 host classes, respectively. The code and datasets are available at https://doi.org/10.17605/OSF.IO/ANFKR
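The abstract lists the confusion matrix, precision, recall, and F1 score among the evaluation measures. As a rough illustration of how those quantities relate, the sketch below derives per-class precision, recall, and F1, plus overall accuracy, from a confusion matrix. It is not taken from the authors' code: the function names and the 3-class `toy_confusion` matrix are hypothetical, and the numbers are made up for the example rather than drawn from the paper.

```python
from typing import Dict, List


def per_class_metrics(confusion: List[List[int]]) -> List[Dict[str, float]]:
    """Return precision, recall, and F1 for every class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j (rows = truth, columns = prediction).
    """
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # true k, predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp   # another class, predicted as k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


def accuracy(confusion: List[List[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


# Hypothetical 3-host example; these counts are illustrative only.
toy_confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
toy_metrics = per_class_metrics(toy_confusion)
toy_accuracy = accuracy(toy_confusion)
```

In a single-label multi-class setting like host prediction, micro-averaged precision and recall both reduce to the overall accuracy, which is one reason per-class figures are usually reported alongside it when the classes are imbalanced.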
Related papers
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Virus2Vec: Viral Sequence Classification Using Machine Learning [48.40285316053593]
We propose Virus2Vec, a feature-vector representation for viral sequences that enables machine learning models to identify viral hosts.
We empirically evaluate Virus2Vec on real-world spike sequences of Coronaviridae and rabies virus sequence data to predict the host.
Our results demonstrate that Virus2Vec outperforms the predictive accuracies of baseline and state-of-the-art methods.
arXiv Detail & Related papers (2023-04-24T08:17:16Z)
- Dive into Machine Learning Algorithms for Influenza Virus Host Prediction with Hemagglutinin Sequences [4.289396744209968]
Influenza viruses mutate rapidly and can pose a threat to public health, especially to those in vulnerable groups.
Recently, there has been increasing interest in using machine learning algorithms to provide fast and accurate predictions for viral sequences.
In this study, real testing data sets and a variety of evaluation metrics were used to evaluate machine learning algorithms at different taxonomic levels.
arXiv Detail & Related papers (2022-07-28T00:54:54Z)
- Benchmarking Machine Learning Robustness in Covid-19 Genome Sequence Classification [109.81283748940696]
We introduce several ways to perturb SARS-CoV-2 genome sequences to mimic the error profiles of common sequencing platforms such as Illumina and PacBio.
We show that, for specific embedding methods, some simulation-based approaches are more robust (and accurate) than others against certain adversarial attacks on the input sequences.
arXiv Detail & Related papers (2022-07-18T19:16:56Z)
- Accurate Virus Identification with Interpretable Raman Signatures by Machine Learning [12.184128048998906]
We present a machine learning approach for analyzing Raman spectra of human and avian viruses.
A Convolutional Neural Network (CNN) classifier specifically designed for spectral data achieves very high accuracy for a variety of virus type or subtype identification tasks.
arXiv Detail & Related papers (2022-06-05T22:31:14Z)
- Predicting Influenza A Viral Host Using PSSM and Word Embeddings [5.067354030054702]
We use various machine learning models with features derived from the position-specific scoring matrix (PSSM) to infer the origin host of viruses.
The results show that the PSSM-based model reaches an MCC of around 95% and an F1 of around 96%.
arXiv Detail & Related papers (2022-01-04T14:05:49Z)
- Towards Interpreting Zoonotic Potential of Betacoronavirus Sequences With Attention [17.406451433347527]
We apply an attention-enhanced long short-term memory (LSTM) deep neural net classifier to a highly conserved viral protein target to predict zoonotic potential across betacoronaviruses.
Analysis and visualization of attention at the sequence and structure-level features indicate possible association between important protein-protein interactions governing viral replication in zoonotic betacoronaviruses and zoonotic transmission.
arXiv Detail & Related papers (2021-08-18T10:11:11Z)
- A k-mer Based Approach for SARS-CoV-2 Variant Identification [55.78588835407174]
We show that preserving the order of the amino acids helps the underlying classifiers to achieve better performance.
We also show the importance of the different amino acids that play a key role in identifying variants and how they coincide with those reported by the USA's Centers for Disease Control and Prevention (CDC).
arXiv Detail & Related papers (2021-08-07T15:08:15Z)
- CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic.
The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands.
We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z)
- Viral Pneumonia Screening on Chest X-ray Images Using Confidence-Aware Anomaly Detection [86.81773672627406]
Clusters of viral pneumonia during a short period of time may be a harbinger of an outbreak or pandemic, like SARS, MERS, and recent COVID-19.
Rapid and accurate detection of viral pneumonia using chest X-ray can be significantly useful in large-scale screening and epidemic prevention.
Viral pneumonia often has diverse causes and exhibits notably different visual appearances on X-ray images.
arXiv Detail & Related papers (2020-03-27T11:32:18Z)