Biomarker based Cancer Classification using an Ensemble with Pre-trained Models
- URL: http://arxiv.org/abs/2406.10087v1
- Date: Fri, 14 Jun 2024 14:43:59 GMT
- Title: Biomarker based Cancer Classification using an Ensemble with Pre-trained Models
- Authors: Chongmin Lee, Jihie Kim
- Abstract summary: We leverage a meta-trained Hyperfast model for classifying cancer, accomplishing the highest AUC of 0.9929.
We also propose a novel ensemble model combining a pre-trained Hyperfast model, XGBoost, and LightGBM for multi-class classification tasks, achieving an incremental increase in accuracy (0.9464).
- Score: 2.2436844508175224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Certain cancer types, notably pancreatic cancer, are difficult to detect at an early stage, which underscores the importance of discovering the causal relationship between biomarkers and cancer in order to identify cancer efficiently. By allowing specific biomarkers to be detected and monitored non-invasively, liquid biopsies enhance the precision and efficacy of medical interventions and support the move towards personalized healthcare. Several machine learning algorithms, such as Random Forest and SVM, are used for classification, but they are inefficient because they require hyperparameter tuning. We leverage a meta-trained Hyperfast model for classifying cancer, achieving the highest AUC of 0.9929 while remaining robust, especially on highly imbalanced datasets, compared to other ML algorithms in several binary classification tasks (e.g. breast invasive carcinoma; BRCA vs. non-BRCA). We also propose a novel ensemble model combining a pre-trained Hyperfast model, XGBoost, and LightGBM for multi-class classification tasks, achieving an incremental increase in accuracy (0.9464) while using only 500 PCA features, in contrast to previous studies that used more than 2,000 features for similar results.
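The abstract does not say how the outputs of the three models are combined. One common option is soft voting, where per-class probabilities are averaged and the most probable class is chosen. The sketch below illustrates that idea only. It is an assumption, not the authors' method: the function name `soft_vote` and the probability values are hypothetical stand-ins for what Hyperfast, XGBoost, and LightGBM might output.

```python
from typing import List, Optional, Sequence, Tuple

Probs = Sequence[Sequence[float]]  # shape: (n_samples, n_classes)


def soft_vote(
    model_probs: Sequence[Probs],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[List[List[float]], List[int]]:
    """Weighted average of per-class probabilities across models.

    Returns the averaged probabilities and the arg-max class per sample.
    """
    n_models = len(model_probs)
    if n_models == 0:
        raise ValueError("need at least one model")
    if weights is None:
        weights = [1.0] * n_models  # plain (unweighted) averaging
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")

    total = float(sum(weights))
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])

    averaged: List[List[float]] = []
    for i in range(n_samples):
        row = [
            sum(w * probs[i][c] for w, probs in zip(weights, model_probs)) / total
            for c in range(n_classes)
        ]
        averaged.append(row)

    labels = [max(range(n_classes), key=row.__getitem__) for row in averaged]
    return averaged, labels


# Hypothetical class probabilities for 2 samples and 3 cancer classes.
hyperfast_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
xgboost_probs = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
lightgbm_probs = [[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]]

averaged, labels = soft_vote([hyperfast_probs, xgboost_probs, lightgbm_probs])
print(labels)  # [0, 2]
```

For the second sample, the first model alone would pick class 1, but the averaged probabilities favour class 2. This shows how an ensemble can overrule a single model.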
Related papers
- Multi-modal Medical Image Fusion For Non-Small Cell Lung Cancer Classification [7.002657345547741]
Non-small cell lung cancer (NSCLC) is a predominant cause of cancer mortality worldwide.
In this paper, we introduce an innovative integration of multi-modal data, synthesizing fused medical imaging (CT and PET scans) with clinical health records and genomic data.
Our research surpasses existing approaches, as evidenced by a substantial enhancement in NSCLC detection and classification precision.
arXiv Detail & Related papers (2024-09-27T12:59:29Z) - Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports [68.39938936308023]
We propose a novel text-guided learning method to achieve highly accurate cancer detection results.
Our approach can leverage clinical knowledge via a large-scale pre-trained VLM to enhance generalization ability.
arXiv Detail & Related papers (2024-05-23T07:03:38Z) - Metastatic Breast Cancer Prognostication Through Multimodal Integration
of Dimensionality Reduction Algorithms and Classification Algorithms [0.0]
The study focuses on the detection of metastatic cancer using machine learning (ML).
The highest accuracy of 71.14% was produced by the ML pipeline comprising PCA, the genetic algorithm, and the k-nearest neighbors algorithm.
arXiv Detail & Related papers (2023-09-19T05:12:02Z) - Cancer-Net BCa-S: Breast Cancer Grade Prediction using Volumetric Deep
Radiomic Features from Synthetic Correlated Diffusion Imaging [82.74877848011798]
The prevalence of breast cancer continues to grow, affecting about 300,000 females in the United States in 2023.
The gold-standard Scarff-Bloom-Richardson (SBR) grade has been shown to consistently indicate a patient's response to chemotherapy.
In this paper, we study the efficacy of deep learning for breast cancer grading based on synthetic correlated diffusion imaging (CDI$^s$).
arXiv Detail & Related papers (2023-04-12T15:08:34Z) - Regression-based Deep-Learning predicts molecular biomarkers from
pathology slides [40.24757332810004]
We developed and evaluated a new self-supervised attention-based weakly supervised regression method that predicts continuous biomarkers directly from images.
Using regression significantly enhances the accuracy of biomarker prediction, while also improving the interpretability of the results over classification.
Our open-source regression approach offers a promising alternative for continuous biomarker analysis in computational pathology.
arXiv Detail & Related papers (2023-04-11T11:43:51Z) - A Combined PCA-MLP Network for Early Breast Cancer Detection [0.0]
We have studied different machine learning algorithms to detect whether a patient is likely to face breast cancer or not.
Our 4 layers-PCA network has obtained the best accuracy of 100% with a mean of 90.48% on the BCCD dataset.
arXiv Detail & Related papers (2022-06-18T06:17:40Z) - Multi-Scale Hybrid Vision Transformer for Learning Gastric Histology:
AI-Based Decision Support System for Gastric Cancer Treatment [50.89811515036067]
Gastric endoscopic screening is an effective way to decide on appropriate gastric cancer (GC) treatment at an early stage, reducing the GC-associated mortality rate.
We propose a practical AI system that enables five subclassifications of GC pathology, which can be directly matched to general GC treatment guidance.
arXiv Detail & Related papers (2022-02-17T08:33:52Z) - EMT-NET: Efficient multitask network for computer-aided diagnosis of
breast cancer [58.720142291102135]
We propose an efficient and lightweight learning architecture to classify and segment breast tumors simultaneously.
We incorporate a segmentation task into a tumor classification network, which makes the backbone network learn representations focused on tumor regions.
The accuracy, sensitivity, and specificity of tumor classification are 88.6%, 94.1%, and 85.3%, respectively.
arXiv Detail & Related papers (2022-01-13T05:24:40Z) - Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based
Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operating characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z) - A Novel Self-Learning Framework for Bladder Cancer Grading Using
Histopathological Images [1.244681179922733]
We present a self-learning framework to grade bladder cancer from histological images stained via chemical techniques.
We propose a novel Deep Convolutional Embedded Attention Clustering (DCEAC) which allows classifying histological patches into different levels of the disease.
arXiv Detail & Related papers (2021-06-25T11:04:04Z) - Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high-dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.