Towards Fully Interpretable Deep Neural Networks: Are We There Yet?
- URL: http://arxiv.org/abs/2106.13164v1
- Date: Thu, 24 Jun 2021 16:37:34 GMT
- Title: Towards Fully Interpretable Deep Neural Networks: Are We There Yet?
- Authors: Sandareka Wickramanayake, Wynne Hsu, Mong Li Lee
- Abstract summary: Deep Neural Networks (DNNs) behave as black boxes, hindering user trust in Artificial Intelligence (AI) systems.
This paper provides a review of existing methods to develop DNNs with intrinsic interpretability.
- Score: 17.88784870849724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite their remarkable performance, Deep Neural Networks (DNNs) behave as
black boxes, hindering user trust in Artificial Intelligence (AI) systems.
Research on opening black-box DNNs can be broadly categorized into post-hoc
methods and inherently interpretable DNNs. While many surveys have been
conducted on post-hoc interpretation methods, little effort is devoted to
inherently interpretable DNNs. This paper provides a review of existing methods
to develop DNNs with intrinsic interpretability, with a focus on Convolutional
Neural Networks (CNNs). The aim is to understand the current progress towards
fully interpretable DNNs that can cater to different interpretation
requirements. Finally, we identify gaps in current work and suggest potential
research directions.
Related papers
- Extracting Explanations, Justification, and Uncertainty from Black-Box
Deep Neural Networks [0.0]
We propose a novel Bayesian approach to extract explanations, justifications, and uncertainty estimates from Deep Neural Networks.
Our approach is efficient both in terms of memory and computation, and can be applied to any black box DNN without any retraining.
arXiv Detail & Related papers (2024-03-13T16:06:26Z)
- On Neural Networks as Infinite Tree-Structured Probabilistic Graphical Models [47.91322568623835]
We propose an innovative solution by constructing infinite tree-structured PGMs that correspond exactly to neural networks.
Our research reveals that DNNs, during forward propagation, indeed perform approximations of PGM inference that are precise in this alternative PGM structure.
arXiv Detail & Related papers (2023-05-27T21:32:28Z)
- Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient [11.0542573074431]
Spiking Neural Networks (SNNs) are recognized as candidates for the next generation of neural networks due to their bio-plausibility and energy efficiency.
Recently, researchers have demonstrated that SNNs are able to achieve nearly state-of-the-art performance in image recognition tasks using surrogate gradient training.
arXiv Detail & Related papers (2023-04-25T19:08:29Z)
- Trustworthy Graph Neural Networks: Aspects, Methods and Trends [115.84291569988748]
Graph neural networks (GNNs) have emerged as competent graph learning methods for diverse real-world scenarios.
Performance-oriented GNNs have exhibited potential adverse effects like vulnerability to adversarial attacks.
To avoid these unintentional harms, it is necessary to build competent GNNs characterised by trustworthiness.
arXiv Detail & Related papers (2022-05-16T02:21:09Z)
- A Comprehensive Survey on Trustworthy Graph Neural Networks: Privacy, Robustness, Fairness, and Explainability [59.80140875337769]
Graph Neural Networks (GNNs) have made rapid developments in the recent years.
GNNs can leak private information, are vulnerable to adversarial attacks, and can inherit and magnify societal bias from training data.
This paper gives a comprehensive survey of GNNs in the computational aspects of privacy, robustness, fairness, and explainability.
arXiv Detail & Related papers (2022-04-18T21:41:07Z)
- Robustness of Bayesian Neural Networks to White-Box Adversarial Attacks [55.531896312724555]
Bayesian Neural Networks (BNNs) are robust and adept at handling adversarial attacks by incorporating randomness.
We create our BNN model, called BNN-DenseNet, by fusing Bayesian inference (i.e., variational Bayes) into the DenseNet architecture.
An adversarially-trained BNN outperforms its non-Bayesian, adversarially-trained counterpart in most experiments.
arXiv Detail & Related papers (2021-11-16T16:14:44Z)
- Explaining Bayesian Neural Networks [11.296451806040796]
XAI aims to make advanced learning machines such as Deep Neural Networks (DNNs) more transparent in decision making.
BNNs so far have a limited form of transparency (model transparency) already built-in through their prior weight distribution.
In this work, we bring together these two perspectives of transparency into a holistic explanation framework for explaining BNNs.
arXiv Detail & Related papers (2021-08-23T18:09:41Z)
- Deep Neural Networks Are Congestion Games: From Loss Landscape to Wardrop Equilibrium and Beyond [12.622643370707328]
We argue that our work provides a promising novel tool for analyzing deep neural networks (DNNs).
We show how one can benefit from classic, readily available results on the latter when analyzing the former.
arXiv Detail & Related papers (2020-10-21T14:11:40Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
- Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.