A robust and lightweight deep attention multiple instance learning
algorithm for predicting genetic alterations
- URL: http://arxiv.org/abs/2206.00455v1
- Date: Tue, 31 May 2022 15:45:29 GMT
- Title: A robust and lightweight deep attention multiple instance learning
algorithm for predicting genetic alterations
- Authors: Bangwei Guo, Xingyu Li, Miaomiao Yang, Hong Zhang, Xu Steven Xu
- Abstract summary: We propose a novel Attention-based Multiple Instance Mutation Learning (AMIML) model for predicting gene mutations.
AMIML comprises successive 1-D convolutional layers, a decoder, and a residual weight connection to facilitate further integration of a lightweight attention mechanism.
AMIML demonstrated excellent robustness, not only outperforming all five baseline algorithms in the vast majority of the tested genes, but also providing near-best performance for the other seven genes.
- Score: 4.674211520843232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep-learning models based on whole-slide digital pathology images (WSIs)
have become increasingly popular for predicting molecular biomarkers. Instance-based
models have been the mainstream strategy for predicting genetic alterations
using WSIs, although bag-based models and self-attention mechanism-based
algorithms have been proposed for other digital pathology applications. In this
paper, we propose a novel Attention-based Multiple Instance Mutation Learning
(AMIML) model for predicting gene mutations. AMIML comprises successive
1-D convolutional layers, a decoder, and a residual weight connection to
facilitate further integration of a lightweight attention mechanism to detect
the most predictive image patches. Using data for 24 clinically relevant genes
from four cancer cohorts in The Cancer Genome Atlas (TCGA) studies (UCEC, BRCA,
GBM and KIRC), we compared AMIML with one popular instance-based model and four
recently published bag-based models (e.g., CHOWDER, HE2RNA, etc.). AMIML
demonstrated excellent robustness, not only outperforming all five baseline
algorithms in the vast majority of the tested genes (17 out of 24), but also
providing near-best performance for the other seven genes. Conversely, the
performance of the published baseline algorithms varied across different
cancers/genes. In addition, compared to the published models for genetic
alterations, AMIML provided a significant improvement for predicting a wide
range of genes (e.g., KMT2C, TP53, and SETD2 for KIRC; ERBB2, BRCA1, and BRCA2
for BRCA; JAK1, POLE, and MTOR for UCEC) and produced outstanding
predictive models for other clinically relevant gene mutations that have not
been reported in the current literature. Furthermore, with the flexible and
interpretable attention-based MIL pooling mechanism, AMIML could further
zero in on and detect predictive image patches.
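The attention-based MIL pooling mentioned in the abstract can be illustrated with a minimal sketch. It is not the AMIML architecture itself (the 1-D convolutional layers, decoder, and residual weight connection are omitted); it assumes a generic tanh-attention pooling over patch feature vectors. The names `attention_mil_pool`, `top_k_patches`, `V`, and `w` are hypothetical, and the parameters would be learned in practice rather than fixed by hand.

```python
import math


def attention_mil_pool(instances, V, w):
    """Pool a bag of patch features into one vector with attention weights.

    instances: list of K patch feature vectors, each of length D
    V: L x D projection matrix (hypothetical parameter, learned in practice)
    w: length-L scoring vector (hypothetical parameter, learned in practice)
    Returns (bag_vector, attention_weights).
    """
    # Score each patch: w^T tanh(V h)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Softmax over patches (shifted by the max for numerical stability)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag representation: attention-weighted sum of patch features
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


def top_k_patches(weights, k):
    """Indices of the k patches with the largest attention weights."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:k]
```

Because the weights sum to one across the patches of a slide, ranking them with `top_k_patches` gives one simple way to surface the patches that contributed most to a bag-level prediction.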
Related papers
- Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
- Path-GPTOmic: A Balanced Multi-modal Learning Framework for Survival Outcome Prediction [14.204637932937082]
We introduce a new multi-modal "Path-GPTOmic" framework for cancer survival outcome prediction.
We regulate the embedding space of a foundation model, scGPT, initially trained on single-cell RNA-seq data.
We propose a gradient modulation mechanism tailored to the Cox partial likelihood loss for survival prediction.
arXiv Detail & Related papers (2024-03-18T00:02:48Z)
- A Hybrid Machine Learning Model for Classifying Gene Mutations in Cancer using LSTM, BiLSTM, CNN, GRU, and GloVe [0.0]
We introduce a novel hybrid ensemble model that synergistically combines LSTM, BiLSTM, CNN, GRU, and GloVe embeddings for the classification of gene mutations in cancer.
Our approach achieved a training accuracy of 80.6%, precision of 81.6%, recall of 80.6%, and an F1 score of 83.1%, alongside a significantly reduced Mean Squared Error (MSE) of 2.596.
arXiv Detail & Related papers (2023-07-24T21:01:46Z)
- Cancer-inspired Genomics Mapper Model for the Generation of Synthetic
DNA Sequences with Desired Genomics Signatures [0.0]
Cancer-inspired genomics mapper model (CGMM) combines genetic algorithm (GA) and deep learning (DL) methods.
We demonstrate that CGMM can generate synthetic genomes of selected phenotypes such as ancestry and cancer.
arXiv Detail & Related papers (2023-05-01T07:16:40Z)
- Unsupervised ensemble-based phenotyping helps enhance the
discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles.
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
- Benchmarking Machine Learning Robustness in Covid-19 Genome Sequence
Classification [109.81283748940696]
We introduce several ways to perturb SARS-CoV-2 genome sequences to mimic the error profiles of common sequencing platforms such as Illumina and PacBio.
We show that, for specific embedding methods, some simulation-based approaches are more robust (and accurate) than others against certain adversarial attacks on the input sequences.
arXiv Detail & Related papers (2022-07-18T19:16:56Z)
- Incorporating intratumoral heterogeneity into weakly-supervised deep
learning models via variance pooling [5.606290756924216]
Supervised learning tasks such as cancer survival prediction from gigapixel whole slide images (WSIs) are a critical challenge in computational pathology.
We develop a novel variance pooling architecture that enables a MIL model to incorporate intratumoral heterogeneity into its predictions.
An empirical study with 4,479 gigapixel WSIs from the Cancer Genome Atlas shows that adding variance pooling onto MIL frameworks improves survival prediction performance for five cancer types.
arXiv Detail & Related papers (2022-06-17T16:35:35Z)
- rfPhen2Gen: A machine learning based association study of brain imaging
phenotypes to genotypes [71.1144397510333]
We trained machine learning models to predict SNPs using 56 brain imaging QTs.
SNPs within the known Alzheimer disease (AD) risk gene APOE had the lowest RMSE for lasso and random forest.
Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
arXiv Detail & Related papers (2022-03-31T20:15:22Z)
- Optimize Deep Learning Models for Prediction of Gene Mutations Using
Unsupervised Clustering [6.494144125433731]
Deep learning has become the mainstream methodological choice for analyzing and interpreting whole-slide digital pathology images.
In this paper, we propose an unsupervised clustering-based multiple-instance learning method and apply it to develop deep-learning models for prediction of gene mutations using WSIs from three cancer types.
We show that unsupervised clustering of image patches can help identify predictive patches, exclude patches lacking predictive information, and therefore improve prediction of gene mutations in all three cancer types.
arXiv Detail & Related papers (2022-03-31T11:48:21Z)
- A multi-stage machine learning model on diagnosis of esophageal
manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict the CC diagnosis of an HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model that can extract common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)