An Adversarial Approach for Explaining the Predictions of Deep Neural
Networks
- URL: http://arxiv.org/abs/2005.10284v4
- Date: Mon, 28 Sep 2020 16:17:36 GMT
- Title: An Adversarial Approach for Explaining the Predictions of Deep Neural
Networks
- Authors: Arash Rahnama and Andrew Tseng
- Abstract summary: We present a novel algorithm for explaining the predictions of a deep neural network (DNN) using adversarial machine learning.
Our approach identifies the relative importance of input features in relation to the predictions based on the behavior of an adversarial attack on the DNN.
Our analysis enables us to produce consistent and efficient explanations.
- Score: 9.645196221785694
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models have been successfully applied to a wide range of
applications including computer vision, natural language processing, and speech
recognition. Successful implementations of these models, however, usually rely
on deep neural networks (DNNs), which are treated as opaque black-box systems
due to their complexity and intricate internal mechanisms. In this work, we
present a novel algorithm for explaining the
predictions of a DNN using adversarial machine learning. Our approach
identifies the relative importance of input features in relation to the
predictions based on the behavior of an adversarial attack on the DNN. Our
algorithm has the advantage of being fast, consistent, and easy to implement
and interpret. We present a detailed analysis showing that, given a DNN and a
task, the behavior of an adversarial attack stays consistent across input test
data points, which proves the generality of our approach. Our analysis
enables us to produce consistent and efficient explanations. We illustrate the
effectiveness of our approach by conducting experiments using a variety of
DNNs, tasks, and datasets. Finally, we compare our work with other well-known
techniques in the current literature.
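The abstract leaves the algorithmic details to the paper itself. As a rough illustration of the family the method belongs to (attributing feature importance from the behavior of an adversarial attack), here is a minimal, hypothetical PyTorch sketch built on an FGSM-style gradient; `adversarial_saliency` and the use of the gradient sign are illustrative assumptions, not the authors' algorithm.

```python
import torch
import torch.nn.functional as F

def adversarial_saliency(model, x, y, epsilon=0.01):
    """Hypothetical sketch: score input features by the gradient an
    FGSM-style attack would exploit. Not the paper's exact algorithm."""
    model.eval()
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # The attack perturbs each feature along the sign of its gradient;
    # features with the largest gradient magnitude move the prediction
    # most, so use that magnitude as a relative importance score.
    importance = x.grad.abs()
    x_adv = (x + epsilon * x.grad.sign()).detach()
    return importance, x_adv
```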
Related papers
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom holds that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
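The observation that predictions drift toward a constant value suggests a simple risk signal. A minimal sketch, assuming classifier logits and using the uniform distribution as a stand-in for the constant (the paper ties the constant to the optimal solution under the training loss):

```python
import torch
import torch.nn.functional as F

def distance_to_constant(logits):
    """Heuristic OOD signal: KL divergence from the predictive
    distribution to a fixed constant distribution (uniform here as a
    stand-in). Small values suggest the input may be far OOD."""
    probs = F.softmax(logits, dim=-1)
    constant = torch.full_like(probs, 1.0 / probs.shape[-1])
    # KL(probs || constant): shrinks as predictions collapse toward
    # the constant, which the paper observes on increasingly OOD data.
    return F.kl_div(constant.log(), probs, reduction="none").sum(-1)
```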
- Adversarial Attacks on the Interpretation of Neuron Activation Maximization [70.5472799454224]
Activation-maximization approaches are used to interpret and analyze trained deep-learning models.
In this work, we consider the concept of an adversary manipulating a model for the purpose of deceiving the interpretation.
arXiv Detail & Related papers (2023-06-12T19:54:33Z)
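Activation maximization itself, the interpretation under attack here, is easy to sketch: synthesize an input that maximally excites a chosen unit by gradient ascent. A generic PyTorch sketch, where `unit_activation` is a user-supplied callable (an assumption for illustration):

```python
import torch

def activation_maximization(model, unit_activation, steps=200, lr=0.05,
                            shape=(1, 3, 224, 224)):
    """Generic activation maximization: gradient ascent on the input
    to maximize the scalar returned by unit_activation(model, x)."""
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Ascend by minimizing the negated activation.
        (-unit_activation(model, x)).backward()
        opt.step()
    return x.detach()
```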
- Towards Better Explanations for Object Detection [0.0]
This paper proposes D-CLOSE, a method for explaining the decisions of any object detection model.
Tests on the MS-COCO dataset with the YOLOX model show that D-CLOSE outperforms D-RISE.
arXiv Detail & Related papers (2023-06-05T09:52:05Z)
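D-CLOSE's exact procedure is specific to the paper, but the perturbation-based family it shares with D-RISE can be sketched: mask the image at random and accumulate the masks weighted by the detector's score on each masked image. A rough sketch, where `score_fn` (a scalar score for the target detection) is an assumed callable:

```python
import torch
import torch.nn.functional as F

def masking_saliency(score_fn, image, n_masks=1000, cells=8, p=0.5):
    """RISE-style black-box saliency: accumulate random smooth masks,
    each weighted by the detector's score on the masked image."""
    _, h, w = image.shape
    saliency = torch.zeros(h, w)
    for _ in range(n_masks):
        # A coarse random grid upsampled bilinearly yields smooth masks.
        grid = (torch.rand(1, 1, cells, cells) < p).float()
        mask = F.interpolate(grid, size=(h, w), mode="bilinear",
                             align_corners=False)[0, 0]
        saliency += score_fn(image * mask) * mask
    return saliency / n_masks
```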
- DeepSeer: Interactive RNN Explanation and Debugging via State Abstraction [10.110976560799612]
Recurrent Neural Networks (RNNs) have been widely used in Natural Language Processing (NLP) tasks.
DeepSeer is an interactive system that provides both global and local explanations of RNN behavior.
arXiv Detail & Related papers (2023-03-02T21:08:17Z)
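The state-abstraction idea behind tools like DeepSeer can be sketched independently of the tool: collect hidden states while the RNN reads a corpus, cluster them into a few abstract states, and read transitions off consecutive cluster labels. A sketch with scikit-learn (the clustering choice is an assumption, not DeepSeer's implementation):

```python
import numpy as np
from sklearn.cluster import KMeans

def abstract_state_machine(hidden_states, n_states=10):
    """hidden_states: (n_timesteps, hidden_dim) array of RNN hidden
    vectors. Returns per-step abstract-state labels and the transition
    counts between consecutive states."""
    labels = KMeans(n_clusters=n_states, n_init=10).fit_predict(hidden_states)
    transitions = np.zeros((n_states, n_states), dtype=int)
    for a, b in zip(labels[:-1], labels[1:]):
        transitions[a, b] += 1  # edges of the interpretable state machine
    return labels, transitions
```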
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
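The interval bound propagation baseline the paper compares against is straightforward for feedforward layers: push an elementwise input interval through each affine-plus-ReLU layer in center-radius form. A minimal NumPy sketch:

```python
import numpy as np

def ibp_affine_relu(W, b, lower, upper):
    """One interval-bound-propagation step through y = relu(W @ x + b):
    given elementwise bounds lower <= x <= upper, return sound
    elementwise bounds on y."""
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius   # |W| propagates the radius soundly
    return (np.maximum(out_center - out_radius, 0.0),
            np.maximum(out_center + out_radius, 0.0))
```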
- Robustification of Online Graph Exploration Methods [59.50307752165016]
We study a learning-augmented variant of the classical, notoriously hard online graph exploration problem.
We propose an algorithm that naturally integrates predictions into the well-known Nearest Neighbor (NN) algorithm.
arXiv Detail & Related papers (2021-12-10T10:02:31Z)
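One plausible shape of such an integration (not the paper's algorithm): run Nearest Neighbor, but follow a predicted next vertex whenever its edge is within a trust factor of the greedy choice, so bad predictions cost at most that factor per step. A simplified sketch on a complete weighted graph; `predicted_next` and `trust` are illustrative assumptions:

```python
def nn_with_predictions(dist, start, predicted_next, trust=1.5):
    """Simplified learning-augmented Nearest Neighbor tour.
    dist: symmetric distance matrix; predicted_next: dict mapping a
    vertex to a suggested successor. Follow the suggestion only when
    it is at most `trust` times as far as the greedy choice."""
    n = len(dist)
    visited, tour, cur = {start}, [start], start
    while len(visited) < n:
        unvisited = [v for v in range(n) if v not in visited]
        greedy = min(unvisited, key=lambda v: dist[cur][v])
        hint = predicted_next.get(cur)
        if hint in unvisited and dist[cur][hint] <= trust * dist[cur][greedy]:
            cur = hint
        else:
            cur = greedy
        visited.add(cur)
        tour.append(cur)
    return tour + [start]  # close the tour by returning to the start
```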
- i-Algebra: Towards Interactive Interpretability of Deep Neural Networks [41.13047686374529]
We present i-Algebra, a first-of-its-kind interactive framework for interpreting deep neural networks (DNNs).
At its core is a library of atomic, composable operators, which explain model behaviors at varying input granularity, during different inference stages, and from distinct interpretation perspectives.
We conduct user studies in a set of representative analysis tasks, including inspecting adversarial inputs, resolving model inconsistency, and cleansing contaminated data, all demonstrating its promising usability.
arXiv Detail & Related papers (2021-01-22T19:22:57Z)
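The operator-composition idea can be gestured at with plain function composition: each operator maps an (artifact, model) pair to a new artifact, and chaining them yields a compound interpretation query. A toy sketch only; the actual operator algebra and its semantics are the paper's contribution:

```python
def compose(*operators):
    """Chain interpretation operators left to right; each operator is a
    callable taking (artifact, model) and returning a new artifact."""
    def composed(x, model):
        for op in operators:
            x = op(x, model)
        return x
    return composed

# Hypothetical usage: restrict attention to an input window, then
# compute a saliency map over it.
# explanation = compose(window(10, 20), saliency)(x, model)
```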
- Distillation of Weighted Automata from Recurrent Neural Networks using a Spectral Approach [0.0]
This paper is an attempt to bridge the gap between deep learning and grammatical inference.
It provides an algorithm to extract a formal language from any recurrent neural network trained for language modelling.
arXiv Detail & Related papers (2020-09-28T07:04:15Z)
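A common spectral recipe for this kind of extraction: fill a Hankel matrix with the network's sequence scores over chosen prefixes and suffixes, factor it with a truncated SVD, and recover one transition operator per symbol. A generic NumPy sketch of that recipe (the paper's construction differs in details):

```python
import numpy as np

def spectral_transition(hankel, hankel_sigma, rank):
    """hankel[i, j]       ~ f(prefix_i + suffix_j)
    hankel_sigma[i, j]    ~ f(prefix_i + sigma + suffix_j), both estimated
    by querying the RNN. Returns the rank-truncated weighted-automaton
    transition matrix A_sigma, using H ~ P @ S and A_sigma = P^+ H_sigma S^+."""
    U, s, Vt = np.linalg.svd(hankel, full_matrices=False)
    P = U[:, :rank] * s[:rank]            # forward factor
    S = Vt[:rank, :]                      # backward factor
    return np.linalg.pinv(P) @ hankel_sigma @ np.linalg.pinv(S)
```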
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
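The cheapest way to exploit a known symmetry, short of an equivariant architecture, is to average predictions over the transformation group. A sketch for 90-degree rotation invariance in PyTorch:

```python
import torch

def rotation_symmetrized(model, x):
    """Average predictions over all four 90-degree rotations of the
    input, making the composite predictor exactly invariant to them."""
    outputs = [model(torch.rot90(x, k, dims=(-2, -1))) for k in range(4)]
    return torch.stack(outputs).mean(dim=0)
```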
- Bayesian Neural Networks: An Introduction and Survey [22.018605089162204]
This article introduces Bayesian Neural Networks (BNNs) and the seminal research regarding their implementation.
Different approximate inference methods are compared, and used to highlight where future research can improve on current methods.
arXiv Detail & Related papers (2020-06-22T06:30:15Z)
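Among the approximate inference methods such surveys compare, Monte Carlo dropout is the easiest to sketch: leave dropout active at test time and read the predictive mean and variance off repeated stochastic forward passes:

```python
import torch

def mc_dropout_predict(model, x, n_samples=50):
    """Approximate Bayesian prediction via MC dropout: keep dropout
    layers stochastic and aggregate repeated forward passes."""
    model.train()  # train mode keeps dropout active
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.var(dim=0)  # mean and uncertainty
```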
- Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective to design of future DeepSNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
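The kernel in the title is simple to state: each presynaptic spike at time t_j contributes a postsynaptic potential that grows linearly after t_j, so the membrane potential is piecewise linear in time and differentiable for backpropagation. A minimal NumPy sketch of that potential (threshold handling and the full training loop are omitted):

```python
import numpy as np

def rel_psp_potential(t, spike_times, weights):
    """Membrane potential under a rectified-linear PSP kernel:
    V(t) = sum_j w_j * max(0, t - t_j). The neuron fires at the first
    time t where V(t) crosses a threshold."""
    return float(np.sum(weights * np.maximum(0.0, t - spike_times)))
```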
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.