Pulmonologists-Level lung cancer detection based on standard blood test
results and smoking status using an explainable machine learning approach
- URL: http://arxiv.org/abs/2402.09596v1
- Date: Wed, 14 Feb 2024 22:00:57 GMT
- Title: Pulmonologists-Level lung cancer detection based on standard blood test
results and smoking status using an explainable machine learning approach
- Authors: Ricco Noel Hansen Flyckt, Louise Sjodsholm, Margrethe Høstgaard
Bang Henriksen, Claus Lohman Brasen, Ali Ebrahimi, Ole Hilberg, Torben
Frøstrup Hansen, Uffe Kock Wiil, Lars Henrik Jensen, Abdolrahman Peimankar
- Abstract summary: Lung cancer (LC) remains the primary cause of cancer-related mortality, largely due to late-stage diagnoses.
In recent years, machine learning has demonstrated considerable potential in healthcare by facilitating the detection of various diseases.
We developed an ML model based on dynamic ensemble selection (DES) for LC detection.
- Score: 2.545682175108217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lung cancer (LC) remains the primary cause of cancer-related mortality,
largely due to late-stage diagnoses. Effective strategies for early detection
are therefore of paramount importance. In recent years, machine learning (ML)
has demonstrated considerable potential in healthcare by facilitating the
detection of various diseases. In this retrospective development and validation
study, we developed an ML model based on dynamic ensemble selection (DES) for
LC detection. The model leverages standard blood sample analysis and smoking
history data from a large population at risk in Denmark. The study includes all
patients examined on suspicion of LC in the Region of Southern Denmark from
2009 to 2018. We validated and compared the predictions by the DES model with
diagnoses provided by five pulmonologists. Among the 38,944 patients, 9,940 had
complete data, of which 2,505 (25%) had LC. The DES model achieved an area
under the ROC curve of 0.77 ± 0.01, sensitivity of 76.2% ± 2.4%,
specificity of 63.8% ± 2.3%, positive predictive value of 41.6% ± 1.2%,
and F1-score of 53.8% ± 1.1%. The DES model outperformed
all five pulmonologists, achieving a sensitivity 9\% higher than their average.
The model identified smoking status, age, total calcium levels, neutrophil
count, and lactate dehydrogenase as the most important factors for the
detection of LC. The results highlight the successful application of the ML
approach in detecting LC, surpassing pulmonologists' performance. Incorporating
clinical and laboratory data in future risk assessment models can improve
decision-making and facilitate timely referrals.
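
The metrics reported in the abstract are tied together by standard identities, so they can be sanity-checked against each other. The sketch below is not the authors' evaluation code; it assumes the reported means can be plugged directly into the textbook formulas, which is only approximately true because the abstract reports averages with a spread. The function names are hypothetical and chosen for illustration.

```python
# Consistency check of the abstract's reported metrics (illustrative sketch only).
# Assumption: the reported mean values can be substituted into the standard
# identities; the paper averages over repeated evaluations, so small
# discrepancies are expected.

def ppv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def f1_from_ppv_sensitivity(ppv: float, sensitivity: float) -> float:
    """F1-score as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2.0 * ppv * sensitivity / (ppv + sensitivity)


# Values taken from the abstract.
prevalence = 2505 / 9940   # LC cases among patients with complete data
sensitivity = 0.762
specificity = 0.638
reported_ppv = 0.416
reported_f1 = 0.538

implied_ppv = ppv_from_rates(sensitivity, specificity, prevalence)
implied_f1 = f1_from_ppv_sensitivity(reported_ppv, sensitivity)

print(f"prevalence:  {prevalence:.3f}")
print(f"implied PPV: {implied_ppv:.3f} (reported {reported_ppv:.3f})")
print(f"implied F1:  {implied_f1:.3f} (reported {reported_f1:.3f})")
```

Under these assumptions the implied PPV and F1-score land within about a tenth of a percentage point of the reported values, which suggests the figures in the abstract are mutually consistent at the stated prevalence of roughly 25%.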
Related papers
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Detection of subclinical atherosclerosis by image-based deep learning on chest x-ray [86.38767955626179]
A deep-learning algorithm to predict the coronary artery calcium (CAC) score was developed on 460 chest x-rays.
The diagnostic accuracy of the AICAC model assessed by the area under the curve (AUC) was the primary outcome.
arXiv Detail & Related papers (2024-03-27T16:56:14Z)
- Performance of externally validated machine learning models based on histopathology images for the diagnosis, classification, prognosis, or treatment outcome prediction in female breast cancer: A systematic review [0.5792122879054292]
externally validated machine learning models for diagnosis, classification, prognosis, or treatment outcome prediction in female breast cancer.
Three studies externally validated ML models for diagnosis, 4 for classification, 2 for prognosis, and 1 for both classification and prognosis.
Most studies used Convolutional Neural Networks and one used logistic regression algorithms.
arXiv Detail & Related papers (2023-12-09T18:27:56Z)
- Development and external validation of a lung cancer risk estimation tool using gradient-boosting [3.200615329024819]
Lung cancer is a significant cause of mortality worldwide, emphasizing the importance of early detection for improved survival rates.
We propose a machine learning (ML) tool trained on data from the PLCO Cancer Screening Trial and validated on the NLST.
The developed ML tool provides a freely available web application for estimating the likelihood of developing lung cancer within five years.
arXiv Detail & Related papers (2023-08-23T15:25:17Z)
- Artificial intelligence based prediction on lung cancer risk factors using deep learning [0.0]
Capturing and defining symptoms at an early stage is one of the most difficult phases for patients.
We developed a model that can detect lung cancer with a remarkably high level of accuracy using the deep learning approach.
We found that our model achieved an accuracy of 94% and a minimum loss of 0.1%.
arXiv Detail & Related papers (2023-04-11T08:57:15Z)
- A new methodology to predict the oncotype scores based on clinico-pathological data with similar tumor profiles [0.0]
The Oncotype DX (ODX) test is a commercially available molecular test for breast cancer.
The aim of this study is to propose a novel methodology to assist physicians in their decision-making.
arXiv Detail & Related papers (2023-03-13T10:08:13Z)
- Penalized Deep Partially Linear Cox Models with Application to CT Scans of Lung Cancer Patients [42.09584755334577]
Lung cancer is a leading cause of cancer mortality globally, highlighting the importance of understanding its mortality risks to design effective therapies.
The National Lung Screening Trial (NLST) employed computed tomography texture analysis to quantify the mortality risks of lung cancer patients.
We propose a novel Penalized Deep Partially Linear Cox Model (Penalized DPLC), which incorporates the SCAD penalty to select important texture features and employs a deep neural network to estimate the nonparametric component of the model.
arXiv Detail & Related papers (2023-03-09T15:38:16Z)
- Machine Learning-based Lung and Colon Cancer Detection using Deep Feature Extraction and Ensemble Learning [0.9786690381850355]
We introduce a hybrid ensemble feature extraction model to efficiently identify lung and colon cancer.
It integrates deep feature extraction and ensemble learning with high-performance filtering for cancer image datasets.
Our model can detect lung, colon, and (lung and colon) cancer with accuracy rates of 99.05%, 100%, and 99.30%, respectively.
arXiv Detail & Related papers (2022-06-02T15:14:41Z)
- Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning [57.00601760750389]
We present a severity score prediction model for COVID-19 pneumonia for frontal chest X-ray images.
Such a tool can gauge severity of COVID-19 lung infections that can be used for escalation or de-escalation of care.
arXiv Detail & Related papers (2020-05-24T23:13:16Z)
- Joint Prediction and Time Estimation of COVID-19 Developing Severe Symptoms using Chest CT Scan [49.209225484926634]
We propose a joint classification and regression method to determine whether a patient will develop severe symptoms at a later time.
To do this, the proposed method takes into account 1) the weight for each sample to reduce the outliers' influence and explore the problem of imbalance classification.
Our proposed method yields an accuracy of 76.97% for predicting severe cases, a correlation coefficient of 0.524, and a difference of 0.55 days for the conversion time.
arXiv Detail & Related papers (2020-05-07T12:16:37Z)
- Automated Quantification of CT Patterns Associated with COVID-19 from Chest CT [48.785596536318884]
The proposed method takes as input a non-contrasted chest CT and segments the lesions, lungs, and lobes in three dimensions.
The method outputs two combined measures of the severity of lung and lobe involvement, quantifying both the extent of COVID-19 abnormalities and presence of high opacities.
Evaluation of the algorithm is reported on CTs of 200 participants (100 COVID-19 confirmed patients and 100 healthy controls) from institutions from Canada, Europe and the United States.
arXiv Detail & Related papers (2020-04-02T21:49:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.