Methodological Explainability Evaluation of an Interpretable Deep Learning Model for Post-Hepatectomy Liver Failure Prediction Incorporating Counterfactual Explanations and Layerwise Relevance Propagation: A Prospective In Silico Trial
- URL: http://arxiv.org/abs/2408.03771v1
- Date: Wed, 7 Aug 2024 13:47:32 GMT
- Title: Methodological Explainability Evaluation of an Interpretable Deep Learning Model for Post-Hepatectomy Liver Failure Prediction Incorporating Counterfactual Explanations and Layerwise Relevance Propagation: A Prospective In Silico Trial
- Authors: Xian Zhong, Zohaib Salahuddin, Yi Chen, Henry C Woodruff, Haiyi Long, Jianyun Peng, Nuwan Udawatte, Roberto Casale, Ayoub Mokhtari, Xiaoer Zhang, Jiayao Huang, Qingyu Wu, Li Tan, Lili Chen, Dongming Li, Xiaoyan Xie, Manxia Lin, Philippe Lambin
- Abstract summary: We developed a variational autoencoder-multilayer perceptron (VAE-MLP) model for preoperative PHLF prediction.
This model integrated counterfactuals and layerwise relevance propagation (LRP) to provide insights into its decision-making mechanism.
Results from the three-track in silico clinical trial showed that clinicians' prediction accuracy and confidence increased when AI explanations were provided.
- Score: 13.171582596404313
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Artificial intelligence (AI)-based decision support systems have demonstrated value in predicting post-hepatectomy liver failure (PHLF) in hepatocellular carcinoma (HCC). However, they often lack transparency, and the impact of model explanations on clinicians' decisions has not been thoroughly evaluated. Building on prior research, we developed a variational autoencoder-multilayer perceptron (VAE-MLP) model for preoperative PHLF prediction. This model integrated counterfactuals and layerwise relevance propagation (LRP) to provide insights into its decision-making mechanism. Additionally, we proposed a methodological framework for evaluating the explainability of AI systems. This framework includes qualitative and quantitative assessments of explanations against recognized biomarkers, usability evaluations, and an in silico clinical trial. Our evaluations demonstrated that the model's explanation correlated with established biomarkers and exhibited high usability at both the case and system levels. Furthermore, results from the three-track in silico clinical trial showed that clinicians' prediction accuracy and confidence increased when AI explanations were provided.
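The abstract names the architecture (a variational autoencoder whose latent code feeds an MLP classifier) and its counterfactual explanations, but gives no implementation details. Below is a minimal PyTorch sketch of how such a VAE-MLP with latent-space counterfactual search might look; the layer sizes, names (`VAEMLP`, `counterfactual`, `LATENT_DIM`), and the gradient-based search are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch only: layer sizes and the counterfactual search are
# assumptions, not the implementation described in the paper.
import torch
import torch.nn as nn

LATENT_DIM = 8  # assumed latent size

class VAEMLP(nn.Module):
    """VAE encoder/decoder with an MLP classifier on the latent code."""
    def __init__(self, n_features: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU())
        self.fc_mu = nn.Linear(32, LATENT_DIM)
        self.fc_logvar = nn.Linear(32, LATENT_DIM)
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM, 32), nn.ReLU(), nn.Linear(32, n_features))
        self.classifier = nn.Sequential(
            nn.Linear(LATENT_DIM, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.decoder(z), mu, logvar, self.classifier(z)

def counterfactual(model, x, steps=200, lr=0.05, lam=0.1):
    """Search the latent space for a nearby input whose prediction flips."""
    model.eval()
    with torch.no_grad():
        z0 = model.fc_mu(model.encoder(x))
        target = (torch.sigmoid(model.classifier(z0)) < 0.5).float()  # opposite class
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    bce = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        # flip the classifier output while staying close to the original latent code
        loss = bce(model.classifier(z), target) + lam * (z - z0).pow(2).sum()
        loss.backward()
        opt.step()
    return model.decoder(z).detach()  # decoded counterfactual in feature space
```

Decoding the optimized latent code back through the VAE keeps the counterfactual on the learned data manifold, which is the usual motivation for searching in latent rather than input space.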
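Layerwise relevance propagation is likewise only named, not specified. The sketch below is a generic epsilon-rule LRP for a stack of Linear/ReLU layers, one standard way the technique is implemented; how the authors applied it to their network is an assumption here.

```python
# Generic epsilon-rule LRP for a sequential Linear/ReLU network.
# This illustrates the technique named in the abstract, not the authors' code.
import torch
import torch.nn as nn

@torch.no_grad()
def lrp_epsilon(layers, x, eps=1e-6):
    """Propagate relevance from the output score back to the input features."""
    activations = [x]
    for layer in layers:                 # forward pass, caching layer inputs
        activations.append(layer(activations[-1]))
    R = activations[-1].clone()          # relevance starts at the output
    for layer, a in zip(reversed(layers), reversed(activations[:-1])):
        if isinstance(layer, nn.Linear):
            z = a @ layer.weight.T + layer.bias   # pre-activations, shape (B, out)
            z = z + eps * torch.sign(z)           # epsilon stabilizer
            s = R / z
            R = a * (s @ layer.weight)            # R_i = a_i * sum_j w_ji * s_j
        # ReLU layers pass relevance through unchanged (standard convention)
    return R                                      # per-feature relevance, shape of x

net = [nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1)]
relevance = lrp_epsilon(net, torch.randn(4, 10))
```

As is conventional for the epsilon rule, the bias enters the denominator but receives no relevance of its own, so the returned scores attribute the prediction entirely to the input features.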
Related papers
- Concept-based Explainable Malignancy Scoring on Pulmonary Nodules in CT Images [2.2120851074630177]
An interpretable model based on generalized additive models and concept-based learning is proposed.
The model detects a set of clinically significant attributes in addition to the final regression score and learns the association between the lung nodule attributes and a final diagnosis decision.
arXiv Detail & Related papers (2024-05-24T13:36:44Z)
- Simulation-based Inference for Cardiovascular Models [57.92535897767929]
We use simulation-based inference to solve the inverse problem of mapping waveforms back to plausible physiological parameters.
We perform an in-silico uncertainty analysis of five biomarkers of clinical interest.
We study the gap between in-vivo and in-silico data using the MIMIC-III waveform database.
arXiv Detail & Related papers (2023-07-26T02:34:57Z)
- Evaluation of Popular XAI Applied to Clinical Prediction Models: Can They be Trusted? [2.0089256058364358]
The absence of transparency and explainability hinders the clinical adoption of machine learning (ML) algorithms.
This study evaluates two popular XAI methods used for explaining predictive models in the healthcare context.
arXiv Detail & Related papers (2023-06-21T02:29:30Z)
- Explainable AI for Malnutrition Risk Prediction from m-Health and Clinical Data [3.093890460224435]
This paper presents a novel AI framework for early and explainable malnutrition risk detection based on heterogeneous m-health data.
We performed an extensive model evaluation including both subject-independent and personalised predictions.
We also investigated several benchmark XAI methods to extract global model explanations.
arXiv Detail & Related papers (2023-05-31T08:07:35Z)
- Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients' clinical state.
We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability.
Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Assessing the communication gap between AI models and healthcare professionals: explainability, utility and trust in AI-driven clinical decision-making [1.7809957179929814]
This paper contributes a pragmatic evaluation framework for explainable machine learning (ML) models for clinical decision support.
The study revealed a more nuanced role for ML explanation models, when these are pragmatically embedded in the clinical context.
arXiv Detail & Related papers (2022-04-11T11:59:04Z)
- What Do You See in this Patient? Behavioral Testing of Clinical NLP Models [69.09570726777817]
We introduce an extendable testing framework that evaluates the behavior of clinical outcome models regarding changes of the input.
We show that model behavior varies drastically even when fine-tuned on the same data and that allegedly best-performing models have not always learned the most medically plausible patterns.
arXiv Detail & Related papers (2021-11-30T15:52:04Z)
- Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff [0.0]
We explore the current art of explainability and interpretability within a case study in clinical text classification.
We demonstrate various visualization techniques for fully interpretable methods as well as model-agnostic post hoc attributions.
We introduce a framework through which practitioners and researchers can assess the frontier between a model's predictive performance and the quality of its available explanations.
arXiv Detail & Related papers (2021-07-12T19:07:24Z)
- A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence model to automatically predict CC diagnosis from raw multi-swallow HRM data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnosis at discharge, procedures performed, in-hospital mortality, and length of stay are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)