Machine Learning-Based Analysis of Ebola Virus' Impact on Gene
Expression in Nonhuman Primates
- URL: http://arxiv.org/abs/2401.08738v2
- Date: Mon, 22 Jan 2024 14:17:27 GMT
- Title: Machine Learning-Based Analysis of Ebola Virus' Impact on Gene
Expression in Nonhuman Primates
- Authors: Mostafa Rezapour, Muhammad Khalid Khan Niazi, Hao Lu, Aarthi
Narayanan, Metin Nafi Gurcan
- Abstract summary: This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a machine learning-based approach, for analyzing gene expression data obtained from nonhuman primates (NHPs) infected with Ebola virus (EBOV).
We utilize a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs, deploying the SMAS system for nuanced host-pathogen interaction analysis.
A key finding of our research is the identification of IFI6 and IFI27 as critical biomarkers, demonstrating exceptional predictive performance with 100% accuracy and Area Under the Curve (AUC) metrics in classifying various stages of Ebola infection.
- Score: 3.842863644161241
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study introduces the Supervised Magnitude-Altitude Scoring (SMAS)
methodology, a machine learning-based approach, for analyzing gene expression
data obtained from nonhuman primates (NHPs) infected with Ebola virus (EBOV).
We utilize a comprehensive dataset of NanoString gene expression profiles from
Ebola-infected NHPs, deploying the SMAS system for nuanced host-pathogen
interaction analysis. SMAS effectively combines gene selection based on
statistical significance and expression changes, employing linear classifiers
such as logistic regression to accurately differentiate between RT-qPCR
positive and negative NHP samples. A key finding of our research is the
identification of IFI6 and IFI27 as critical biomarkers, demonstrating
exceptional predictive performance with 100% accuracy and Area Under the Curve
(AUC) metrics in classifying various stages of Ebola infection. Alongside IFI6
and IFI27, genes including MX1, OAS1, and ISG15 were significantly
upregulated, highlighting their essential roles in the immune response to EBOV.
Our results underscore the efficacy of the SMAS method in revealing complex
genetic interactions and response mechanisms during EBOV infection. This
research provides valuable insights into EBOV pathogenesis and aids in
developing more precise diagnostic tools and therapeutic strategies to address
EBOV infection in particular and viral infection in general.
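
The select-then-classify idea described in the abstract can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation. The scoring formula |log2 FC|^M * (-log10 p_BH)^A, the exponents M and A, the normal approximation used for p-values, all function names, and the synthetic genes GENE_A to GENE_C are assumptions introduced here. The abstract states only that SMAS combines statistical significance with expression change and then applies a linear classifier such as logistic regression.

```python
import math
import random
from statistics import NormalDist, mean, variance


def approx_p_value(a, b):
    """Two-sided p-value for a difference in means.

    Assumption: normal approximation to Welch's t statistic (keeps the
    sketch stdlib-only); it is anti-conservative for small samples.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 1.0
    z = abs(mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(z))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def score_genes(expr, labels, M=1.0, A=1.0):
    """Score each gene by magnitude (fold change) and altitude (significance).

    expr maps gene name -> list of log2-scale expression values, one per
    sample; labels holds 1 for positive samples and 0 for negative ones.
    M and A are hypothetical exponents weighting the two components.
    """
    genes = list(expr)
    pos = {g: [v for v, y in zip(expr[g], labels) if y == 1] for g in genes}
    neg = {g: [v for v, y in zip(expr[g], labels) if y == 0] for g in genes}
    p_adj = benjamini_hochberg([approx_p_value(pos[g], neg[g]) for g in genes])
    scores = {}
    for g, p in zip(genes, p_adj):
        log2_fc = mean(pos[g]) - mean(neg[g])    # difference of log2 means
        altitude = -math.log10(max(p, 1e-300))   # clamp to avoid log10(0)
        scores[g] = abs(log2_fc) ** M * altitude ** A
    return scores


def fit_logistic(x, y, lr=0.1, epochs=2000):
    """Fit a one-feature logistic regression by batch gradient descent."""
    mu, sd = mean(x), math.sqrt(variance(x))
    z = [(v - mu) / sd for v in x]               # standardize the feature
    w = b = 0.0
    for _ in range(epochs):
        preds = [1.0 / (1.0 + math.exp(-(w * zi + b))) for zi in z]
        w -= lr * mean((p - yi) * zi for p, yi, zi in zip(preds, y, z))
        b -= lr * mean(p - yi for p, yi in zip(preds, y))
    return lambda v: 1.0 / (1.0 + math.exp(-(w * (v - mu) / sd + b)))


# Synthetic data only: 10 negative and 10 positive samples, three made-up genes.
random.seed(0)
labels = [0] * 10 + [1] * 10
shifts = {"GENE_A": 4.0, "GENE_B": 0.5, "GENE_C": 0.0}
expr = {g: [random.gauss(5.0 + s * y, 0.5) for y in labels]
        for g, s in shifts.items()}

scores = score_genes(expr, labels)
top_gene = max(scores, key=scores.get)           # highest-scoring gene
predict = fit_logistic(expr[top_gene], labels)
accuracy = mean(int((predict(v) >= 0.5) == bool(y))
                for v, y in zip(expr[top_gene], labels))
print(top_gene, accuracy)
```

On real data one would typically replace the normal approximation with an exact test and report accuracy on held-out samples rather than on the training set, as is done here for brevity.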
Related papers
- CNN-LSTM Hybrid Model for AI-Driven Prediction of COVID-19 Severity from Spike Sequences and Clinical Data [0.0]
We developed a CNN-LSTM hybrid model to predict COVID-19 severity using spike protein sequences and clinical data. The model achieved an F1 score of 82.92%, ROC-AUC of 0.9084, precision of 83.56%, and recall of 82.85%.
arXiv Detail & Related papers (2025-05-29T16:20:54Z) - HR-VILAGE-3K3M: A Human Respiratory Viral Immunization Longitudinal Gene Expression Dataset for Systems Immunity [8.64940622146001]
Human Respiratory Viral Immunization LongitudinAl Gene Expression (HR-VILAGE-3K3M) repository integrates 14,136 RNA-seq profiles from 3,178 subjects across 66 studies encompassing over 2.56 million cells. HR-VILAGE-3K3M is the largest longitudinal transcriptomic resource for human respiratory viral immunization.
arXiv Detail & Related papers (2025-05-19T19:37:49Z) - Neuromorphic Spiking Neural Network Based Classification of COVID-19 Spike Sequences [4.497217246897902]
We propose a neural network-based (NN) mechanism to perform an efficient analysis of the SARS-CoV-2 data.
In this paper, we introduce a pipeline that first converts the spike protein sequences into a fixed-length numerical representation and then uses Neuromorphic Spiking Neural Network to classify those sequences.
arXiv Detail & Related papers (2024-12-19T10:26:31Z) - Assessing Concordance between RNA-Seq and NanoString Technologies in Ebola-Infected Nonhuman Primates Using Machine Learning [0.0]
We compare RNA sequencing (RNA-Seq) and NanoString technologies for gene expression analysis in non-human primates infected with Ebola virus (EBOV).
A machine learning approach, using the Supervised Magnitude-Altitude Scoring (SMAS) method trained on NanoString data, identified OAS1 as a key marker for distinguishing RT-qPCR positive from negative samples.
OAS1 also achieved 100% accuracy in differentiating infected from uninfected samples using logistic regression, demonstrating its robustness across platforms.
arXiv Detail & Related papers (2024-10-30T20:21:20Z) - Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z) - Evaluating unsupervised disentangled representation learning for genomic
discovery and disease risk prediction [0.0]
We consider multiple unsupervised learning methods for learning disentangled representations, namely autoencoders, VAE, beta-VAE, and FactorVAE.
We observed improvements in the number of genome-wide significant loci, heritability, and performance of polygenic risk scores for asthma and chronic obstructive pulmonary disease by using FactorVAE or beta-VAE.
arXiv Detail & Related papers (2023-07-17T23:28:59Z) - Machine Learning Methods for Cancer Classification Using Gene Expression
Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z) - Scalable Pathogen Detection from Next Generation DNA Sequencing with
Deep Learning [3.8175773487333857]
We propose MG2Vec, a deep learning-based solution that uses the transformer network as its backbone.
We show that the proposed approach can help detect pathogens from uncurated, real-world clinical samples.
We provide a comprehensive evaluation of a novel representation learning framework for metagenome-based disease diagnostics with deep learning.
arXiv Detail & Related papers (2022-11-30T00:13:59Z) - Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based
Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - COVIDx-US -- An open-access benchmark dataset of ultrasound imaging data
for AI-driven COVID-19 analytics [116.6248556979572]
COVIDx-US is an open-access benchmark dataset of COVID-19 related ultrasound imaging data.
It consists of 93 lung ultrasound videos and 10,774 processed images of patients infected with SARS-CoV-2 pneumonia, non-SARS-CoV-2 pneumonia, as well as healthy control cases.
arXiv Detail & Related papers (2021-03-18T03:31:33Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z) - Genome Sequence Classification for Animal Diagnostics with Graph
Representations and Deep Neural Networks [4.339839287869652]
Bovine Respiratory Disease Complex (BRDC) is a complex respiratory disease in cattle with multiple etiologies, including bacterial and viral.
Current animal disease diagnostics is based on traditional tests such as bacterial culture, serology, and Polymerase Chain Reaction (PCR) tests.
We show that networks-based machine learning approaches can detect pathogen signature with up to 89.7% accuracy.
arXiv Detail & Related papers (2020-07-24T22:30:18Z) - EPGAT: Gene Essentiality Prediction With Graph Attention Networks [1.1602089225841632]
We propose EPGAT, an approach for essentiality prediction based on Graph Attention Networks (GATs).
Our model directly learns patterns of gene essentiality from PPI networks, integrating additional evidence from multiomics data encoded as node attributes.
We benchmarked EPGAT for four organisms, including humans, accurately predicting gene essentiality with AUC score ranging from 0.78 to 0.97.
arXiv Detail & Related papers (2020-07-19T13:47:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.