Gradient Boosting Decision Trees on Medical Diagnosis over Tabular Data
- URL: http://arxiv.org/abs/2410.03705v2
- Date: Fri, 11 Oct 2024 15:30:36 GMT
- Title: Gradient Boosting Decision Trees on Medical Diagnosis over Tabular Data
- Authors: A. Yarkın Yıldız, Asli Kalayci,
- Abstract summary: Ensemble methods offer a powerful alternative in terms of providing successful medical decision-making processes.
In this study, we investigated the benefits of ensemble methods, especially the Gradient Boosting Decision Tree (GBDT) algorithms in medical classification tasks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical diagnosis is a crucial task in the medical field, in terms of providing accurate classification and respective treatments. Having near-precise decisions based on correct diagnosis can affect a patient's life itself, and may extremely result in a catastrophe if not classified correctly. Several traditional machine learning (ML), such as support vector machines (SVMs) and logistic regression, and state-of-the-art tabular deep learning (DL) methods, including TabNet and TabTransformer, have been proposed and used over tabular medical datasets. Additionally, due to the superior performances, lower computational costs, and easier optimization over different tasks, ensemble methods have been used in the field more recently. They offer a powerful alternative in terms of providing successful medical decision-making processes in several diagnosis tasks. In this study, we investigated the benefits of ensemble methods, especially the Gradient Boosting Decision Tree (GBDT) algorithms in medical classification tasks over tabular data, focusing on XGBoost, CatBoost, and LightGBM. The experiments demonstrate that GBDT methods outperform traditional ML and deep neural network architectures and have the highest average rank over several benchmark tabular medical diagnosis datasets. Furthermore, they require much less computational power compared to DL models, creating the optimal methodology in terms of high performance and lower complexity.
Related papers
- MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models [49.765466293296186]
Recent progress in Medical Large Vision-Language Models (Med-LVLMs) has opened up new possibilities for interactive diagnostic tools.
Med-LVLMs often suffer from factual hallucination, which can lead to incorrect diagnoses.
We propose a versatile multimodal RAG system, MMed-RAG, designed to enhance the factuality of Med-LVLMs.
arXiv Detail & Related papers (2024-10-16T23:03:27Z) - XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z) - Data Augmentation-Based Unsupervised Domain Adaptation In Medical
Imaging [0.709016563801433]
We propose an unsupervised method for robust domain adaptation in brain MRI segmentation by leveraging MRI-specific augmentation techniques.
The results show that our proposed approach achieves high accuracy, exhibits broad applicability, and showcases remarkable robustness against domain shift in various tasks.
arXiv Detail & Related papers (2023-08-08T17:00:11Z) - M-VAAL: Multimodal Variational Adversarial Active Learning for
Downstream Medical Image Analysis Tasks [16.85572580186212]
Active learning attempts to minimize the need for large annotated samples by actively sampling the most informative examples for annotation.
We propose a Multimodal Variational Adversarial Active Learning (M-VAAL) method that uses auxiliary information from additional modalities to enhance the active sampling.
arXiv Detail & Related papers (2023-06-21T16:40:37Z) - Interpretable Medical Diagnostics with Structured Data Extraction by
Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z) - MMLN: Leveraging Domain Knowledge for Multimodal Diagnosis [10.133715767542386]
We propose a knowledge-driven and data-driven framework for lung disease diagnosis.
We formulate diagnosis rules according to authoritative clinical medicine guidelines and learn the weights of rules from text data.
A multimodal fusion consisting of text and image data is designed to infer the marginal probability of lung disease.
arXiv Detail & Related papers (2022-02-09T04:12:30Z) - Cross-Modality Deep Feature Learning for Brain Tumor Segmentation [158.8192041981564]
This paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data.
The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale.
Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance.
arXiv Detail & Related papers (2022-01-07T07:46:01Z) - Multilevel Stochastic Optimization for Imputation in Massive Medical Data Records [6.711824170437793]
We apply a recently developed multi-level computational optimization approach to the problem of imputation in massive medical records.
Results show that the multi-level method significantly outperforms current approaches and is numerically robust.
arXiv Detail & Related papers (2021-10-19T01:14:08Z) - Machine Learning Approaches for Inferring Liver Diseases and Detecting
Blood Donors from Medical Diagnosis [0.0]
In this paper, multiple imputation by chained equations was applied to deal with missing data.
To reveal significant findings, data visualizations were implemented.
Our proposed ML-method showed better accuracy score (e.g. 98.23% for SVM)
arXiv Detail & Related papers (2021-04-25T04:10:19Z) - Learning Binary Semantic Embedding for Histology Image Classification
and Retrieval [56.34863511025423]
We propose a novel method for Learning Binary Semantic Embedding (LBSE)
Based on the efficient and effective embedding, classification and retrieval are performed to provide interpretable computer-assisted diagnosis for histology images.
Experiments conducted on three benchmark datasets validate the superiority of LBSE under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:36:44Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.