Tuberculosis Screening from Cough Audio: Baseline Models, Clinical Variables, and Uncertainty Quantification
- URL: http://arxiv.org/abs/2601.07969v1
- Date: Mon, 12 Jan 2026 20:04:05 GMT
- Title: Tuberculosis Screening from Cough Audio: Baseline Models, Clinical Variables, and Uncertainty Quantification
- Authors: George P. Kafentzis, Efstratios Selisios,
- Abstract summary: We propose a standardized framework for automatic tuberculosis (TB) detection from cough audio and routinely collected clinical data using machine learning. Existing studies vary substantially in datasets, cohort definitions, feature representations, model families, validation protocols, and reported metrics. We establish a strong, well-documented baseline for TB prediction using cough recordings and accompanying clinical metadata from a recently compiled dataset from several countries.
- Score: 0.6015898117103067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a standardized framework for automatic tuberculosis (TB) detection from cough audio and routinely collected clinical data using machine learning. While TB screening from audio has attracted growing interest, progress is difficult to measure because existing studies vary substantially in datasets, cohort definitions, feature representations, model families, validation protocols, and reported metrics. Consequently, reported gains are often not directly comparable, and it remains unclear whether improvements stem from modeling advances or from differences in data and evaluation. We address this gap by establishing a strong, well-documented baseline for TB prediction using cough recordings and accompanying clinical metadata from a recently compiled dataset from several countries. Our pipeline is reproducible end-to-end, covering feature extraction, multimodal fusion, cougher-independent evaluation, and uncertainty quantification, and it reports a consistent suite of clinically relevant metrics to enable fair comparison. We further quantify performance for cough audio-only and fused (audio + clinical metadata) models, and release the full experimental protocol to facilitate benchmarking. This baseline is intended to serve as a common reference point and to reduce methodological variance that currently holds back progress in the field.
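The abstract names cougher-independent evaluation as part of the pipeline but does not spell out the mechanics here. A minimal sketch of the general idea, assuming each recording carries a cougher identifier (the function name `cougher_independent_folds`, the toy IDs, and the fold count are all hypothetical and not taken from the paper), could look like this: recordings are grouped by cougher, and whole coughers are assigned to folds so that no person contributes to both the training and the test side of a split.

```python
import random
from collections import defaultdict


def cougher_independent_folds(cougher_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no cougher appears on both sides.

    cougher_ids: one identifier per recording (hypothetical input format).
    """
    # Group recording indices by the person who produced them.
    by_cougher = defaultdict(list)
    for idx, cid in enumerate(cougher_ids):
        by_cougher[cid].append(idx)

    # Shuffle coughers (not recordings) reproducibly.
    coughers = sorted(by_cougher)
    random.Random(seed).shuffle(coughers)

    for k in range(n_folds):
        # Every n_folds-th cougher, starting at offset k, goes to the test side.
        test_coughers = set(coughers[k::n_folds])
        test_idx = [i for c in coughers if c in test_coughers for i in by_cougher[c]]
        train_idx = [i for c in coughers if c not in test_coughers for i in by_cougher[c]]
        yield sorted(train_idx), sorted(test_idx)


# Toy example: ten recordings from six hypothetical coughers.
cougher_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p5", "p6"]
folds = list(cougher_independent_folds(cougher_ids, n_folds=3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Splitting at the recording level instead would let several coughs from one person land on both sides of a split, which tends to inflate reported performance; grouping by cougher avoids that leakage. The paper's actual protocol may differ in fold count, stratification, or site handling.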
Related papers
- Suppressing Prior-Comparison Hallucinations in Radiology Report Generation via Semantically Decoupled Latent Steering [94.37535002230504]
We develop a training-free, inference-time control framework termed Semantically Decoupled Latent Steering.
Our approach constructs a semantic-free intervention vector via large language model (LLM)-driven semantic decomposition.
We show that our approach significantly reduces the probability of historical hallucinations.
arXiv Detail & Related papers (2026-02-27T04:49:01Z) - Towards Trustworthy Breast Tumor Segmentation in Ultrasound using Monte Carlo Dropout and Deep Ensembles for Epistemic Uncertainty Estimation [0.01066386648660129]
We evaluate the use of a modified Residual U-Net for breast ultrasound segmentation.
We identify and correct for data duplication in the BUSI dataset.
Epistemic uncertainty is quantified using Monte Carlo dropout, deep ensembles, and their combination.
arXiv Detail & Related papers (2025-08-25T08:06:07Z) - MedPatch: Confidence-Guided Multi-Stage Fusion for Multimodal Clinical Data [0.46040036610482665]
Real-world medical data is heterogeneous in nature, limited in size, and sparse due to missing modalities.
Inspired by clinical prediction tasks, we introduce MedPatch, which seamlessly integrates multiple modalities via confidence-guided patching.
We evaluate MedPatch using real-world data consisting of clinical time-series data, chest X-ray images, radiology reports, and discharge notes extracted from the MIMIC-IV, MIMIC-CXR, and MIMIC-Notes datasets.
arXiv Detail & Related papers (2025-08-07T12:46:26Z) - Differential-UMamba: Rethinking Tumor Segmentation Under Limited Data Scenarios [3.1231963031043786]
We introduce Diff-UMamba, a novel architecture that combines the UNet framework with the mamba mechanism to model long-range dependencies.
At the heart of Diff-UMamba is a noise reduction module, which employs a signal differencing strategy to suppress noisy or irrelevant activations.
The architecture achieves improved segmentation accuracy and robustness, particularly in low-data settings.
arXiv Detail & Related papers (2025-07-24T08:23:11Z) - Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature [1.7779568951268254]
We introduce a novel methodology for voice pathology detection using the publicly available Saarbrücken Voice Database.
We evaluate six machine learning (ML) algorithms: support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost.
Our approach achieves 85.61%, 84.69%, and 85.22% unweighted average recall (UAR) for females, males, and combined results, respectively.
arXiv Detail & Related papers (2024-10-14T14:17:52Z) - Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments [67.80453452949303]
Estimating the conditional average treatment effect (CATE) from observational data is relevant for many applications such as personalized medicine.
Here, we focus on the widespread setting where the observational data come from multiple environments.
We propose different model-agnostic learners (so-called meta-learners) to estimate the bounds that can be used in combination with arbitrary machine learning models.
arXiv Detail & Related papers (2024-06-04T16:31:43Z) - Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data [3.0113849517062303]
This study assesses deep learning models for audio classification in a clinical setting with the constraint of small datasets.
We analyze CNNs, including DenseNet and ConvNeXt, alongside transformer models like ViT, SWIN, and AST.
Our method highlights the benefits of pre-training on large datasets before fine-tuning on specific clinical data.
arXiv Detail & Related papers (2024-02-07T16:41:11Z) - Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z) - LifeLonger: A Benchmark for Continual Disease Classification [59.13735398630546]
We introduce LifeLonger, a benchmark for continual disease classification on the MedMNIST collection.
Task and class incremental learning of diseases address the issue of classifying new samples without re-training the models from scratch.
Cross-domain incremental learning addresses the issue of dealing with datasets originating from different institutions while retaining the previously obtained knowledge.
arXiv Detail & Related papers (2022-04-12T12:25:05Z) - Harmonization with Flow-based Causal Inference [12.739380441313022]
This paper presents a normalizing-flow-based method to perform counterfactual inference upon a structural causal model (SCM) to harmonize medical data.
We evaluate on multiple, large, real-world medical datasets to observe that this method leads to better cross-domain generalization compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2021-06-12T19:57:35Z) - Quantification of pulmonary involvement in COVID-19 pneumonia by means of a cascade of two U-nets: training and assessment on multiple datasets using different annotation criteria [83.83783947027392]
This study aims at exploiting Artificial intelligence (AI) for the identification, segmentation and quantification of COVID-19 pulmonary lesions.
We developed an automated analysis pipeline, the LungQuant system, based on a cascade of two U-nets.
The accuracy of the LungQuant system in predicting the CT-Severity Score (CT-SS) has also been evaluated.
arXiv Detail & Related papers (2021-05-06T10:21:28Z) - Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging prediction consistency for a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z) - Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making addressed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.