Pan-infection Foundation Framework Enables Multiple Pathogen Prediction
- URL: http://arxiv.org/abs/2501.01462v1
- Date: Tue, 31 Dec 2024 14:34:53 GMT
- Title: Pan-infection Foundation Framework Enables Multiple Pathogen Prediction
- Authors: Lingrui Zhang, Haonan Wu, Nana Jin, Chenqing Zheng, Jize Xie, Qitai Cai, Jun Wang, Qin Cao, Xubin Zheng, Jiankun Wang, Lixin Cheng
- Abstract summary: Here, we curate the largest infection host-response transcriptome data, including 11,247 samples across 89 blood transcriptome datasets from 13 countries and 21 platforms.
We build a diagnostic model for pathogen prediction starting from a pan-infection model as foundation (AUC = 0.97) based on the pan-infection dataset.
We utilize knowledge distillation to efficiently transfer the insights from this "teacher" model to four lightweight pathogen "student" models, i.e., staphylococcal infection (AUC = 0.99), streptococcal infection (AUC = 0.94), HIV infection (AUC = 0.93), and RSV infection (AUC = 0.94).
- Score: 6.4302271133357145
- Abstract: Host-response-based diagnostics can improve the accuracy of diagnosing bacterial and viral infections, thereby reducing inappropriate antibiotic prescriptions. However, existing cohorts, with limited sample sizes and coarse infection types, are unable to support the exploration of an accurate and generalizable diagnostic model. Here, we curate the largest infection host-response transcriptome data, including 11,247 samples across 89 blood transcriptome datasets from 13 countries and 21 platforms. We build a diagnostic model for pathogen prediction starting from a pan-infection model as foundation (AUC = 0.97) based on the pan-infection dataset. Then, we utilize knowledge distillation to efficiently transfer the insights from this "teacher" model to four lightweight pathogen "student" models, i.e., staphylococcal infection (AUC = 0.99), streptococcal infection (AUC = 0.94), HIV infection (AUC = 0.93), and RSV infection (AUC = 0.94), as well as a sepsis "student" model (AUC = 0.99). The proposed knowledge distillation framework not only facilitates the diagnosis of pathogens using pan-infection data, but also enables an across-disease study from pan-infection to sepsis. Moreover, the framework enables a highly lightweight design of diagnostic models, which is expected to be adaptively deployed in clinical settings.
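The abstract does not state which distillation objective is used. As a rough illustration of teacher-to-student knowledge distillation in general, the sketch below uses the widely used temperature-softened formulation (a KL term against the teacher's soft predictions plus a cross-entropy term against the true labels). The function names, the temperature, the weight `alpha`, and the toy logits are all assumptions made for illustration, not details from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits / temperature, computed stably."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max to avoid overflow
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss (assumed form, not the paper's):
    alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    eps = 1e-12  # guards against log(0)
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # Soft-target term: how far the student is from the teacher's distribution.
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student_soft + eps)),
        axis=1,
    ).mean()
    # Hard-target term: ordinary cross-entropy against the true labels.
    p_student = softmax(student_logits)
    ce = -np.log(p_student[np.arange(len(labels)), labels] + eps).mean()
    return alpha * temperature**2 * kl + (1 - alpha) * ce

# Toy example: two samples, two classes (e.g. infected vs. not infected).
teacher_logits = np.array([[2.0, -1.0], [0.5, 1.5]])
student_logits = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 1])
print(distillation_loss(student_logits, teacher_logits, labels))
```

In this formulation a small student model is trained to minimize the loss, so it learns both from the labels and from the teacher's full output distribution; the `temperature**2` factor keeps the soft-target term on a comparable scale as the temperature changes.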
Related papers
- TBBC: Predict True Bacteraemia in Blood Cultures via Deep Learning [0.0]
Bacteraemia, a bloodstream infection with high morbidity and mortality rates, poses significant diagnostic challenges.
This thesis aims to identify optimal machine learning techniques for predicting bacteraemia and develop a predictive model using data from St. Antonius Hospital's emergency department.
arXiv Detail & Related papers (2024-10-25T05:25:01Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Advancing Diagnostic Precision: Leveraging Machine Learning Techniques for Accurate Detection of Covid-19, Pneumonia, and Tuberculosis in Chest X-Ray Images [0.0]
Lung diseases such as COVID-19, tuberculosis (TB), and pneumonia continue to be serious global health concerns.
Paramedics and scientists are working intensively to create a reliable and precise approach for early-stage COVID-19 diagnosis.
arXiv Detail & Related papers (2023-10-09T18:38:49Z)
- Differentiating Viral and Bacterial Infections: A Machine Learning Model Based on Routine Blood Test Values [0.0]
The "Virus vs. Bacteria" model paves the way for advanced diagnostic tools, leveraging machine learning to optimize infection management.
The model achieved an accuracy of 82.2 %, a sensitivity of 79.7 %, a specificity of 84.5 %, a Brier score of 0.129, and an area under the ROC curve (AUC) of 0.905, outperforming a CRP-based decision rule.
arXiv Detail & Related papers (2023-05-13T09:18:51Z)
- Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z)
- Joint Application of the Target Trial Causal Framework and Machine Learning Modeling to Optimize Antibiotic Therapy: Use Case on Acute Bacterial Skin and Skin Structure Infections due to Methicillin-resistant Staphylococcus aureus [5.611469725376418]
We develop a machine learning model of mortality prediction and ITE estimation for patients with acute bacterial skin and skin structure infection (ABSSSI) due to methicillin-resistant Staphylococcus aureus (MRSA).
First, we use propensity score matching to emulate the trial and create a treatment randomized (vancomycin vs. other antibiotics) dataset.
Next, we use this data to train various machine learning methods (including boosted/LASSO logistic regression, support vector machines, and random forest) and choose the best model in terms of area under the receiver operating characteristic curve (AUC) through bootstrap validation.
arXiv Detail & Related papers (2022-07-15T13:08:15Z)
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
- Classification supporting COVID-19 diagnostics based on patient survey data [82.41449972618423]
Logistic regression and XGBoost classifiers that allow for effective screening of patients for COVID-19 were generated.
The obtained classification models provided the basis for the DECODE service (decode.polsl.pl), which can serve as support in screening patients with COVID-19 disease.
This data set consists of more than 3,000 examples and is based on questionnaires collected at a hospital in Poland.
arXiv Detail & Related papers (2020-11-24T17:44:01Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms the best of various state-of-the-art baselines by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation [95.57532063232198]
Coronavirus disease 2019 (COVID-19) has caused a pandemic in over 200 countries.
To control the infection, identifying and separating the infected people is the most crucial step.
This paper develops a novel Joint Classification and Segmentation (JCS) system to perform real-time and explainable COVID-19 chest CT diagnosis.
arXiv Detail & Related papers (2020-04-15T12:30:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.