Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges
- URL: http://arxiv.org/abs/2403.20260v3
- Date: Fri, 19 Jul 2024 14:28:52 GMT
- Title: Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges
- Authors: Shreyasi Pathak, Jörg Schlötterer, Jeroen Veltman, Jeroen Geerdink, Maurice van Keulen, Christin Seifert,
- Abstract summary: We show the use of PEF-C in the context of breast cancer prediction using mammography.
Existing works on prototype-based models on breast cancer prediction using mammography have focused on improving the classification performance of prototype-based models.
We are the first to evaluate the quality of the mammography prototypes systematically using our PEF-C.
- Score: 2.2194851850560715
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models have achieved high performance in medical applications, however, their adoption in clinical practice is hindered due to their black-box nature. Self-explainable models, like prototype-based models, can be especially beneficial as they are interpretable by design. However, if the learnt prototypes are of low quality then the prototype-based models are as good as black-box. Having high quality prototypes is a pre-requisite for a truly interpretable model. In this work, we propose a prototype evaluation framework for coherence (PEF-C) for quantitatively evaluating the quality of the prototypes based on domain knowledge. We show the use of PEF-C in the context of breast cancer prediction using mammography. Existing works on prototype-based models on breast cancer prediction using mammography have focused on improving the classification performance of prototype-based models compared to black-box models and have evaluated prototype quality through anecdotal evidence. We are the first to go beyond anecdotal evidence and evaluate the quality of the mammography prototypes systematically using our PEF-C. Specifically, we apply three state-of-the-art prototype-based models, ProtoPNet, BRAIxProtoPNet++ and PIP-Net on mammography images for breast cancer prediction and evaluate these models w.r.t. i) classification performance, and ii) quality of the prototypes, on three public datasets. Our results show that prototype-based models are competitive with black-box models in terms of classification performance, and achieve a higher score in detecting ROIs. However, the quality of the prototypes are not yet sufficient and can be improved in aspects of relevance, purity and learning a variety of prototypes. We call the XAI community to systematically evaluate the quality of the prototypes to check their true usability in high stake decisions and improve such models further.
Related papers
- Interpretable Image Classification with Adaptive Prototype-based Vision Transformers [37.62530032165594]
We present ProtoViT, a method for interpretable image classification combining deep learning and case-based reasoning.
Our model integrates Vision Transformer (ViT) backbones into prototype based models, while offering spatially deformed prototypes.
Our experiments show that our model can generally achieve higher performance than the existing prototype based models.
arXiv Detail & Related papers (2024-10-28T04:33:28Z) - This Looks Better than That: Better Interpretable Models with ProtoPNeXt [14.28283868577614]
Prototypical-part models are a popular interpretable alternative to black-box deep learning models for computer vision.
We create a new framework for integrating components of prototypical-part models -- ProtoPNeXt.
arXiv Detail & Related papers (2024-06-20T18:54:27Z) - Opinion-Unaware Blind Image Quality Assessment using Multi-Scale Deep Feature Statistics [54.08757792080732]
We propose integrating deep features from pre-trained visual models with a statistical analysis model to achieve opinion-unaware BIQA (OU-BIQA)
Our proposed model exhibits superior consistency with human visual perception compared to state-of-the-art BIQA models.
arXiv Detail & Related papers (2024-05-29T06:09:34Z) - Mixture of Gaussian-distributed Prototypes with Generative Modelling for Interpretable and Trustworthy Image Recognition [15.685927265270085]
We present a new generative paradigm to learn prototype distributions, termed as Mixture of Gaussian-distributed Prototypes (MGProto)
MGProto achieves state-of-the-art image recognition and OoD detection performances, while providing encouraging interpretability results.
arXiv Detail & Related papers (2023-11-30T11:01:37Z) - An evaluation of GPT models for phenotype concept recognition [0.4715973318447338]
We examine the performance of the latest Generative Pre-trained Transformer (GPT) models for clinical phenotyping and phenotype annotation.
Our results show that, with an appropriate setup, these models can achieve state of the art performance.
arXiv Detail & Related papers (2023-09-29T12:06:55Z) - A Comprehensive Evaluation and Analysis Study for Chinese Spelling Check [53.152011258252315]
We show that using phonetic and graphic information reasonably is effective for Chinese Spelling Check.
Models are sensitive to the error distribution of the test set, which reflects the shortcomings of models.
The commonly used benchmark, SIGHAN, can not reliably evaluate models' performance.
arXiv Detail & Related papers (2023-07-25T17:02:38Z) - Representer Point Selection for Explaining Regularized High-dimensional
Models [105.75758452952357]
We introduce a class of sample-based explanations we term high-dimensional representers.
Our workhorse is a novel representer theorem for general regularized high-dimensional models.
We study the empirical performance of our proposed methods on three real-world binary classification datasets and two recommender system datasets.
arXiv Detail & Related papers (2023-05-31T16:23:58Z) - Investigating Ensemble Methods for Model Robustness Improvement of Text
Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate there is no single model that works best for all the cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z) - Knowledge Distillation to Ensemble Global and Interpretable
Prototype-Based Mammogram Classification Models [20.16068689434846]
We propose BRAIxProtoPNet++, which adds interpretability to a global model by ensembling it with a prototype-based model.
We show that BRAIxProtoPNet++ has higher classification accuracy than SOTA global and prototype-based models.
arXiv Detail & Related papers (2022-09-26T05:04:15Z) - A multi-stage machine learning model on diagnosis of esophageal
manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z) - How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating
and Auditing Generative Models [95.8037674226622]
We introduce a 3-dimensional evaluation metric that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion.
Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity.
arXiv Detail & Related papers (2021-02-17T18:25:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.