Automatic Cough Analysis for Non-Small Cell Lung Cancer Detection
- URL: http://arxiv.org/abs/2507.19174v1
- Date: Fri, 25 Jul 2025 11:30:22 GMT
- Title: Automatic Cough Analysis for Non-Small Cell Lung Cancer Detection
- Authors: Chiara Giangregorio, Cristina Maria Licciardello, Vanja Miskovic, Leonardo Provenzano, Alessandra Laura Giulia Pedrocchi, Andra Diana Dumitrascu, Arsela Prelaj, Marina Chiara Garassino, Emilia Ambrosini, Simona Ferrante,
- Abstract summary: Early detection of non-small cell lung cancer (NSCLC) is critical for improving patient outcomes.<n>We explore the use of automatic cough analysis as a pre-screening tool for distinguishing between NSCLC patients and healthy controls.<n>Recordings were analyzed using machine learning techniques, such as support vector machine (SVM) and XGBoost.
- Score: 33.37223681850477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Early detection of non-small cell lung cancer (NSCLC) is critical for improving patient outcomes, and novel approaches are needed to facilitate early diagnosis. In this study, we explore the use of automatic cough analysis as a pre-screening tool for distinguishing between NSCLC patients and healthy controls. Cough audio recordings were prospectively acquired from a total of 227 subjects, divided into NSCLC patients and healthy controls. The recordings were analyzed using machine learning techniques, such as support vector machine (SVM) and XGBoost, as well as deep learning approaches, specifically convolutional neural networks (CNN) and transfer learning with VGG16. To enhance the interpretability of the machine learning model, we utilized Shapley Additive Explanations (SHAP). The fairness of the models across demographic groups was assessed by comparing the performance of the best model across different age groups (less than or equal to 58y and higher than 58y) and gender using the equalized odds difference on the test set. The results demonstrate that CNN achieves the best performance, with an accuracy of 0.83 on the test set. Nevertheless, SVM achieves slightly lower performances (accuracy of 0.76 in validation and 0.78 in the test set), making it suitable in contexts with low computational power. The use of SHAP for SVM interpretation further enhances model transparency, making it more trustworthy for clinical applications. Fairness analysis shows slightly higher disparity across age (0.15) than gender (0.09) on the test set. Therefore, to strengthen our findings' reliability, a larger, more diverse, and unbiased dataset is needed -- particularly including individuals at risk of NSCLC and those in early disease stages.
Related papers
- Predicting Length of Stay in Neurological ICU Patients Using Classical Machine Learning and Neural Network Models: A Benchmark Study on MIMIC-IV [49.1574468325115]
This study explores multiple ML approaches for predicting LOS in ICU specifically for the patients with neurological diseases based on the MIMIC-IV dataset.<n>The evaluated models include classic ML algorithms (K-Nearest Neighbors, Random Forest, XGBoost and CatBoost) and Neural Networks (LSTM, BERT and Temporal Fusion Transformer)
arXiv Detail & Related papers (2025-05-23T14:06:42Z) - Artificial Intelligence-Driven Prognostic Classification of COVID-19 Using Chest X-rays: A Deep Learning Approach [0.0]
This study presents a high-accuracy deep learning model for classifying COVID-19 severity (Mild, Moderate, and Severe) using Chest X-ray images.<n>Our model achieved an average accuracy of 97%, with specificity of 99%, sensitivity of 87%, and an F1-score of 93.11%.<n>These results demonstrate the model's potential for real-world clinical applications.
arXiv Detail & Related papers (2025-03-17T15:27:21Z) - Urinary Tract Infection Detection in Digital Remote Monitoring: Strategies for Managing Participant-Specific Prediction Complexity [43.108040967674185]
Urinary tract infections (UTIs) are a significant health concern, particularly for people living with dementia (PLWD)<n>This study builds on previous work that utilised machine learning (ML) to detect UTIs in PLWD.
arXiv Detail & Related papers (2025-02-18T12:01:55Z) - Optimizing Mortality Prediction for ICU Heart Failure Patients: Leveraging XGBoost and Advanced Machine Learning with the MIMIC-III Database [1.5186937600119894]
Heart failure affects millions of people worldwide, significantly reducing quality of life and leading to high mortality rates.
Despite extensive research, the relationship between heart failure and mortality rates among ICU patients is not fully understood.
This study analyzed data from 1,177 patients over 18 years old from the MIMIC-III database, identified using ICD-9 codes.
arXiv Detail & Related papers (2024-09-03T07:57:08Z) - Enhanced Prediction of Ventilator-Associated Pneumonia in Patients with Traumatic Brain Injury Using Advanced Machine Learning Techniques [0.0]
Ventilator-associated pneumonia (VAP) in traumatic brain injury (TBI) patients poses a significant mortality risk.
Timely detection and prognostication of VAP in TBI patients are crucial to improve patient outcomes and alleviate the strain on healthcare resources.
We implemented six machine learning models using the MIMIC-III database.
arXiv Detail & Related papers (2024-08-02T09:44:18Z) - Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z) - Symptom-based Machine Learning Models for the Early Detection of
COVID-19: A Narrative Review [0.0]
Machine learning models can analyze large datasets, incorporating patient-reported symptoms, clinical data, and medical imaging.
In this paper, we provide an overview of the landscape of symptoms-only machine learning models for predicting COVID-19, including their performance and limitations.
The review will also examine the performance of symptom-based models when compared to image-based models.
arXiv Detail & Related papers (2023-12-08T01:41:42Z) - Deep-Learning Tool for Early Identifying Non-Traumatic Intracranial
Hemorrhage Etiology based on CT Scan [40.51754649947294]
The deep learning model was developed with 1868 eligible NCCT scans with non-traumatic ICH collected between January 2011 and April 2018.
The model's diagnostic performance was compared with clinicians's performance.
The clinicians achieve significant improvements in the sensitivity, specificity, and accuracy of diagnoses of certain hemorrhage etiologies with proposed system augmentation.
arXiv Detail & Related papers (2023-02-02T08:45:17Z) - Diagnosis of Covid-19 Via Patient Breath Data Using Artificial
Intelligence [0.0]
This study aims to develop a point-of-care testing (POCT) system that can detect COVID-19 by detecting volatile organic compounds (VOCs) in a patient's exhaled breath.
294 breath samples were collected from 142 patients at Istanbul Medipol Mega Hospital between December 2020 and March 2021.
The Gradient Boosting algorithm provides 95% recall when predicting COVID-19 positive patients and 96% accuracy when predicting COVID-19 negative patients.
arXiv Detail & Related papers (2023-01-24T22:00:00Z) - On the explainability of hospitalization prediction on a large COVID-19
patient dataset [45.82374977939355]
We develop various AI models to predict hospitalization on a large (over 110$k$) cohort of COVID-19 positive-tested US patients.
Despite high data unbalance, the models reach average precision 0.96-0.98 (0.75-0.85), recall 0.96-0.98 (0.74-0.85), and $F_score 0.97-0.98 (0.79-0.83) on the non-hospitalized (or hospitalized) class.
arXiv Detail & Related papers (2021-10-28T10:23:38Z) - CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors
and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic.
The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands.
We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.