Deep learning generates custom-made logistic regression models for
explaining how breast cancer subtypes are classified
- URL: http://arxiv.org/abs/2001.06988v2
- Date: Tue, 19 Jul 2022 03:12:56 GMT
- Authors: Takuma Shibahara, Chisa Wada, Yasuho Yamashita, Kazuhiro Fujita,
Masamichi Sato, Junichi Kuwata, Atsushi Okamoto, and Yoshimasa Ono
- Abstract summary: We developed an explainable deep learning model called a point-wise linear (PWL) model that generates a custom-made logistic regression for each patient.
We trained the PWL model with RNA-seq data to predict PAM50 intrinsic subtypes and applied it to 41 of the 50 PAM50 genes through the subtype prediction task.
- Score: 0.2529563359433233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiating the intrinsic subtypes of breast cancer is crucial for
deciding the best treatment strategy. Deep learning can predict the subtypes
from genetic information more accurately than conventional statistical methods,
but to date, deep learning has not been directly utilized to examine which
genes are associated with which subtypes. To clarify the mechanisms embedded in
the intrinsic subtypes, we developed an explainable deep learning model called
a point-wise linear (PWL) model that generates a custom-made logistic
regression for each patient. Logistic regression, which is familiar to both
physicians and medical informatics researchers, allows us to analyze the
importance of the feature variables, and the PWL model harnesses these
practical abilities of logistic regression. In this study, we show that
analyzing breast cancer subtypes is clinically beneficial for patients and one
of the best ways to validate the capability of the PWL model. First, we trained
the PWL model with RNA-seq data to predict PAM50 intrinsic subtypes and applied
it to 41 of the 50 PAM50 genes through the subtype prediction task. Second, we
developed a deep enrichment analysis method to reveal the relationships between
the PAM50 subtypes and the copy numbers of breast cancer. Our findings showed
that the PWL model utilized genes relevant to the cell cycle-related pathways.
These preliminary successes in breast cancer subtype analysis demonstrate the
potential of our analysis strategy to clarify the mechanisms underlying breast
cancer and improve overall clinical outcomes.
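The idea of a "custom-made logistic regression for each patient" can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the untrained random parameters, and the names `sample_weights` and `predict` are all hypothetical, and a tiny NumPy network stands in for the deep model. It only shows the general pattern in which a network maps each sample to its own weight matrix, and the prediction is then a softmax (multinomial logistic) regression using those sample-specific weights, so each gene's contribution can be read off per patient.

```python
import numpy as np

rng = np.random.default_rng(0)

N_GENES = 6     # hypothetical number of input genes
N_CLASSES = 5   # hypothetical number of subtypes
HIDDEN = 8      # hypothetical hidden-layer width

# Untrained, randomly initialised parameters of the weight-generating network.
W1 = rng.normal(scale=0.5, size=(N_GENES, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.5, size=(HIDDEN, N_CLASSES * N_GENES))
b2 = np.zeros(N_CLASSES * N_GENES)


def sample_weights(x):
    """Map one expression vector x to its own (N_CLASSES, N_GENES) weight matrix."""
    h = np.tanh(x @ W1 + b1)
    return (h @ W2 + b2).reshape(N_CLASSES, N_GENES)


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(z - z.max())
    return e / e.sum()


def predict(x):
    """Return class probabilities and per-gene contributions for one sample."""
    w = sample_weights(x)            # weights depend on the sample itself
    contributions = w * x            # shape (N_CLASSES, N_GENES)
    logits = contributions.sum(axis=1)
    return softmax(logits), contributions


# One hypothetical patient with random expression values.
x = rng.normal(size=N_GENES)
probs, contributions = predict(x)
top_class = int(np.argmax(probs))

# Genes ranked by absolute contribution to the predicted class for this patient.
ranked_genes = np.argsort(-np.abs(contributions[top_class]))
```

Because the weights are a function of the input, two patients generally receive different regression coefficients, which is what distinguishes this pattern from a single global logistic regression.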
Related papers
- Deep learning-based classification of breast cancer molecular subtypes from H&E whole-slide images [0.0]
We investigated whether H&E-stained whole slide images could be leveraged to predict breast cancer molecular subtypes.
We used 1,433 WSIs of breast cancer in a two-step pipeline that first classifies tiles as tumor or non-tumor, so that only the tumor regions are used for molecular subtyping.
The pipeline was tested on 221 hold-out WSIs, achieving an overall macro F1 score of 0.95 for tumor detection and 0.73 for molecular subtyping.
(2024-08-30T13:57:33Z)
- Predicting Lung Cancer Patient Prognosis with Large Language Models [20.97970447748789]
Large language models (LLMs) have gained attention for their ability to process and generate text based on extensive learned knowledge.
We evaluate the potential of GPT-4o mini and GPT-3.5 in predicting the prognosis of lung cancer patients.
(2024-08-15T06:36:27Z)
- Predictive Modeling for Breast Cancer Classification in the Context of Bangladeshi Patients: A Supervised Machine Learning Approach with Explainable AI [0.0]
We evaluate and compare the classification accuracy, precision, recall, and F1 scores of five different machine learning methods.
XGBoost achieved the best accuracy, at 97%.
(2024-04-06T17:23:21Z)
- Histopathologic Cancer Detection [0.0]
This work uses the PatchCamelyon benchmark dataset to train a multi-layer perceptron and a convolutional model, and evaluates their performance in terms of precision, recall, F1 score, accuracy, and AUC.
The paper also introduces ResNet50 and InceptionNet models with data augmentation, where ResNet50 is able to beat the state-of-the-art model.
(2023-11-13T19:51:46Z)
- PACS: Prediction and analysis of cancer subtypes from multi-omics data based on a multi-head attention mechanism model [2.275409158519155]
We propose a supervised multi-head attention mechanism model (SMA) to classify cancer subtypes successfully.
The attention mechanism and feature sharing module of the SMA model can successfully learn the global and local feature information of multi-omics data.
The SMA model achieves the highest accuracy, macro F1, and weighted F1, and accurately classifies cancer subtypes in simulated, single-cell, and cancer multi-omics datasets.
(2023-08-21T03:54:21Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
(2023-01-28T15:03:03Z)
- Benchmarking Machine Learning Robustness in Covid-19 Genome Sequence Classification [109.81283748940696]
We introduce several ways to perturb SARS-CoV-2 genome sequences to mimic the error profiles of common sequencing platforms such as Illumina and PacBio.
We show that, for specific embedding methods, some simulation-based approaches are more robust (and accurate) than others against certain adversarial attacks on the input sequences.
(2022-07-18T19:16:56Z)
- rfPhen2Gen: A machine learning based association study of brain imaging phenotypes to genotypes [71.1144397510333]
We trained machine learning models to predict SNPs using 56 brain imaging QTs.
SNPs within the known Alzheimer disease (AD) risk gene APOE had the lowest RMSE for lasso and random forest.
Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
(2022-03-31T20:15:22Z)
- Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
(2021-10-27T19:28:36Z)
- A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict the CC diagnosis of an HRM study from raw multi-swallow data.
(2021-06-25T20:09:23Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
(2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.