Breaking the Paradox of Explainable Deep Learning
- URL: http://arxiv.org/abs/2305.13072v1
- Date: Mon, 22 May 2023 14:41:17 GMT
- Title: Breaking the Paradox of Explainable Deep Learning
- Authors: Arlind Kadra, Sebastian Pineda Arango, Josif Grabocka
- Abstract summary: We propose a novel method that trains deep hypernetworks to generate explainable linear models.
Our models retain the accuracy of black-box deep networks while offering free lunch explainability by design.
- Score: 13.320917259299652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Learning has achieved tremendous results by pushing the frontier of
automation in diverse domains. Unfortunately, current neural network
architectures are not explainable by design. In this paper, we propose a novel
method that trains deep hypernetworks to generate explainable linear models.
Our models retain the accuracy of black-box deep networks while offering free
lunch explainability by design. Specifically, our explainable approach requires
the same runtime and memory resources as black-box deep models, ensuring
practical feasibility. Through extensive experiments, we demonstrate that our
explainable deep networks are as accurate as state-of-the-art classifiers on
tabular data. On the other hand, we showcase the interpretability of our method
on a recent benchmark by empirically comparing prediction explainers. The
experimental results reveal that our models are not only as accurate as their
black-box deep-learning counterparts but also as interpretable as
state-of-the-art explanation techniques.
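The core idea described in the abstract, a hypernetwork that emits the coefficients of a per-instance linear model, can be sketched as follows. This is a minimal illustration with made-up layer sizes and random parameters, not the authors' implementation; the generated weights double as feature attributions for each input.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 4, 16  # input features, hidden width (illustrative sizes)

# Hypernetwork parameters: a one-hidden-layer MLP that emits the
# coefficients (w, b) of a per-instance linear model.
W1 = rng.normal(0, 0.1, (h, d))
b1 = np.zeros(h)
W2 = rng.normal(0, 0.1, (d + 1, h))  # outputs d weights + 1 bias
b2 = np.zeros(d + 1)

def hyper_linear_predict(x):
    """Generate a linear model for input x, then apply it to x."""
    hidden = np.tanh(W1 @ x + b1)
    coeffs = W2 @ hidden + b2
    w, b = coeffs[:d], coeffs[d]
    y = w @ x + b   # prediction from the generated linear model
    return y, w     # w is the per-instance feature attribution

x = rng.normal(size=d)
y, attribution = hyper_linear_predict(x)
```

Because the final prediction is literally `w @ x + b`, the explanation is exact by construction rather than a post-hoc approximation, which is the "explainability by design" the abstract refers to.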
Related papers
- Counterfactual Explanations for Deep Learning-Based Traffic Forecasting [42.31238891397725]
This study aims to leverage an Explainable AI approach, counterfactual explanations, to enhance the explainability and usability of deep learning-based traffic forecasting models.
The study first implements a deep learning model to predict traffic speed based on historical traffic data and contextual variables.
Counterfactual explanations are then used to illuminate how alterations in these input variables affect predicted outcomes.
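The counterfactual idea above can be illustrated with a toy search: perturb input features minimally until the model's prediction crosses a target. Everything below (the score function, threshold, and step size) is a hypothetical stand-in for a trained forecaster, not the paper's method.

```python
import numpy as np

def score(x):
    # Toy stand-in for a trained traffic forecaster's output.
    return x[0] + 2.0 * x[1]

def counterfactual(x, threshold=1.0, step=0.05, max_iter=500):
    """Greedily nudge one feature at a time until the predicted
    score crosses the threshold, keeping the total change small."""
    x_cf = x.copy()
    for _ in range(max_iter):
        if score(x_cf) > threshold:
            return x_cf
        # Pick the feature whose small increase raises the score most.
        gains = [score(x_cf + step * e) for e in np.eye(len(x_cf))]
        x_cf[int(np.argmax(gains))] += step
    return None

x_cf = counterfactual(np.zeros(2))
```

The returned perturbation shows which input variables most cheaply change the outcome, which is the usability benefit the study targets.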
arXiv Detail & Related papers (2024-05-01T11:26:31Z) - Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
arXiv Detail & Related papers (2023-06-14T23:24:01Z) - Adversarial Attacks on the Interpretation of Neuron Activation Maximization [70.5472799454224]
Activation-maximization approaches are used to interpret and analyze trained deep-learning models.
In this work, we consider the concept of an adversary manipulating a model for the purpose of deceiving the interpretation.
arXiv Detail & Related papers (2023-06-12T19:54:33Z) - Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Neurashed: A Phenomenological Model for Imitating Deep Learning Training [17.006003864727408]
We argue that a future deep learning theory should inherit three characteristics: a hierarchically structured network architecture, parameters iteratively optimized using gradient-based methods, and information from the data that evolves compressively.
We integrate these characteristics into a graphical model called neurashed, which effectively explains some common empirical patterns in deep learning.
arXiv Detail & Related papers (2021-12-17T19:51:26Z) - Deriving Explanation of Deep Visual Saliency Models [6.808418311272862]
We develop a technique to derive explainable saliency models from their corresponding deep neural architecture based saliency models.
We consider two state-of-the-art deep saliency models, namely UNISAL and MSI-Net for our interpretation.
We also build our own deep saliency model named cross-concatenated multi-scale residual block based network (CMRNet) for saliency prediction.
arXiv Detail & Related papers (2021-09-08T12:22:32Z) - Leveraging Sparse Linear Layers for Debuggable Deep Networks [86.94586860037049]
We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks.
The resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks.
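The technique summarized above, fitting an L1-regularized linear readout over frozen deep features, can be sketched with plain ISTA (iterative soft-thresholding). The feature matrix here is synthetic; in practice its columns would hold penultimate-layer activations of the network being debugged.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_readout(F, y, lam=0.1, lr=0.1, iters=500):
    """ISTA: gradient step on the squared error, then soft-threshold,
    yielding a sparse linear model over the feature columns of F."""
    w = np.zeros(F.shape[1])
    for _ in range(iters):
        grad = F.T @ (F @ w - y) / len(y)
        w = soft_threshold(w - lr * grad, lr * lam)
    return w

rng = np.random.default_rng(0)
F = rng.normal(size=(200, 10))   # stand-in for learned deep features
w_true = np.zeros(10)
w_true[0], w_true[5] = 3.0, -2.0
y = F @ w_true
w = sparse_readout(F, y)         # only a few coefficients survive
```

Because most coefficients are driven exactly to zero, each prediction is explained by a handful of named features, which is what makes the resulting network easier to debug.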
arXiv Detail & Related papers (2021-05-11T08:15:25Z) - Model extraction from counterfactual explanations [68.8204255655161]
We show how an adversary can leverage the information provided by counterfactual explanations to build high-fidelity and high-accuracy model extraction attacks.
Our attack enables the adversary to build a faithful copy of a target model by accessing its counterfactual explanations.
arXiv Detail & Related papers (2020-09-03T19:02:55Z) - Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction [11.427019313283997]
We propose a novel formulation of interpretable deep neural networks for the attribution task.
Using masked weights, hidden features can be deeply attributed, split into several input-restricted sub-networks and trained as a boosted mixture of experts.
arXiv Detail & Related papers (2020-08-26T06:46:49Z) - Cracking the Black Box: Distilling Deep Sports Analytics [17.35421731343764]
This paper addresses the trade-off between Accuracy and Transparency for deep learning applied to sports analytics.
We build a simple and transparent model that mimics the output of the original deep learning model and represents the learned knowledge in an explicit interpretable way.
arXiv Detail & Related papers (2020-06-04T01:49:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.