XOmiVAE: an interpretable deep learning model for cancer classification
using high-dimensional omics data
- URL: http://arxiv.org/abs/2105.12807v1
- Date: Wed, 26 May 2021 19:55:12 GMT
- Title: XOmiVAE: an interpretable deep learning model for cancer classification
using high-dimensional omics data
- Authors: Eloise Withnell, Xiaoyu Zhang, Kai Sun, Yike Guo
- Abstract summary: We present XOmiVAE, a novel interpretable deep learning model for cancer classification using high-dimensional omics data.
XOmiVAE is able to obtain contribution values of each gene and latent dimension for a specific prediction.
XOmiVAE can also explain both the supervised classification and the unsupervised clustering results of the deep learning network.
- Score: 17.697184123548503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning based approaches have shown promise in modeling omics
data. However, one current limitation compared to statistical and traditional
machine learning approaches is the lack of explainability, which not only
reduces reliability but also limits the potential for acquiring novel knowledge
by unpicking the "black-box" models. Here we present XOmiVAE, a novel
interpretable deep learning model for cancer classification using
high-dimensional omics data. XOmiVAE is able to obtain contribution values of
each gene and latent dimension for a specific prediction, and the correlation
between genes and the latent dimensions. XOmiVAE can also
explain both the supervised classification and the unsupervised clustering
results from the deep learning network. To the best of our knowledge, XOmiVAE
is one of the first activation-based deep learning interpretation methods to
explain novel clusters generated by variational autoencoders. The results
generated by XOmiVAE were validated by both the biomedical knowledge and the
performance of downstream tasks. XOmiVAE explanations of deep learning based
cancer classification and clustering aligned with current domain knowledge
including biological annotation and literature, which shows great potential for
novel biomedical knowledge discovery from deep learning models. The top genes
and dimensions selected by XOmiVAE showed a significant influence on the
performance of cancer classification. Additionally, we offer important steps to
consider when
interpreting deep learning models for tumour classification. For instance, we
demonstrate the importance of choosing background samples that make biological
sense, and the limitations of connection-weight-based methods for explaining latent
dimensions.
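The core attribution idea described above (per-gene and per-latent-dimension contribution values measured against a background) can be sketched as follows. This is a minimal illustrative stand-in, not the authors' implementation: XOmiVAE uses a trained variational autoencoder with a classification head, whereas here linear layers with random weights are assumed so that the baseline-relative decomposition is exact and easy to verify.

```python
# Minimal NumPy sketch of baseline-relative attribution for a classifier
# over gene expression data. All weights, shapes, and names here are
# illustrative assumptions, not XOmiVAE's actual architecture.
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_latent, n_classes = 50, 8, 3

# Hypothetical trained weights: linear "encoder" and linear classifier head.
W_enc = rng.normal(size=(n_genes, n_latent))    # genes -> latent dimensions
W_clf = rng.normal(size=(n_latent, n_classes))  # latent -> class scores

def class_scores(x):
    """Class scores for an expression profile x of shape (n_genes,)."""
    return x @ W_enc @ W_clf

# Background: mean profile of reference samples. The abstract stresses that
# this choice must make biological sense (e.g. normal-tissue samples).
background = rng.normal(size=(100, n_genes)).mean(axis=0)

def gene_contributions(x, target_class):
    """Per-gene contribution to `target_class`, relative to the background.

    For a linear model, (x - baseline) * effective_weight is exact:
    the contributions sum to f(x) - f(background)."""
    w_eff = (W_enc @ W_clf)[:, target_class]  # effective gene -> class weight
    return (x - background) * w_eff

def latent_contributions(x, target_class):
    """Per-latent-dimension contribution to `target_class`."""
    z, z0 = x @ W_enc, background @ W_enc
    return (z - z0) * W_clf[:, target_class]

x = rng.normal(size=n_genes)
c = gene_contributions(x, target_class=1)
# Completeness check: contributions account for the full score difference.
assert np.isclose(c.sum(), class_scores(x)[1] - class_scores(background)[1])
top_genes = np.argsort(-np.abs(c))[:10]  # ten most influential genes
```

For the nonlinear VAE used in the actual paper this exact linear decomposition does not hold, which is why activation-based attribution methods (such as Deep SHAP style estimators) are needed there; the sketch only illustrates why the background profile directly shapes every contribution value.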
Related papers
- Seeing Unseen: Discover Novel Biomedical Concepts via
Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- A Comparative Analysis of Gene Expression Profiling by Statistical and
Machine Learning Approaches [1.8954222800767324]
We discuss the biological and the methodological limitations of machine learning models to classify cancer samples.
Gene rankings are obtained from explainability methods adapted to these models.
We observe that the information learned by black-box neural networks is related to the notion of differential expression.
arXiv Detail & Related papers (2024-02-01T18:17:36Z)
- EvalAttAI: A Holistic Approach to Evaluating Attribution Maps in Robust
and Non-Robust Models [0.3425341633647624]
This paper focuses on evaluating methods of attribution mapping to find whether robust neural networks are more explainable.
We propose a new explainability faithfulness metric (called EvalAttAI) that addresses the limitations of prior metrics.
arXiv Detail & Related papers (2023-03-15T18:33:22Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression
Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Continual Learning with Bayesian Model based on a Fixed Pre-trained
Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity
Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
- An Investigation of Interpretability Techniques for Deep Learning in
Predictive Process Analytics [2.162419921663162]
This paper explores interpretability techniques for two of the most successful learning algorithms in medical decision-making literature: deep neural networks and random forests.
We learn models that try to predict the type of cancer of the patient, given their set of medical activity records.
We see certain distinct features used for predictions that provide useful insights about the type of cancer, along with features that do not generalize well.
arXiv Detail & Related papers (2020-02-21T09:14:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.