i-Algebra: Towards Interactive Interpretability of Deep Neural Networks
- URL: http://arxiv.org/abs/2101.09301v1
- Date: Fri, 22 Jan 2021 19:22:57 GMT
- Title: i-Algebra: Towards Interactive Interpretability of Deep Neural Networks
- Authors: Xinyang Zhang, Ren Pang, Shouling Ji, Fenglong Ma, Ting Wang
- Abstract summary: We present i-Algebra, a first-of-its-kind interactive framework for interpreting deep neural networks (DNNs).
At its core is a library of atomic, composable operators, which explain model behaviors at varying input granularity, during different inference stages, and from distinct interpretation perspectives.
We conduct user studies in a set of representative analysis tasks, including inspecting adversarial inputs, resolving model inconsistency, and cleansing contaminated data, all demonstrating its promising usability.
- Score: 41.13047686374529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Providing explanations for deep neural networks (DNNs) is essential for their
use in domains wherein the interpretability of decisions is a critical
prerequisite. Despite the plethora of work on interpreting DNNs, most existing
solutions offer interpretability in an ad hoc, one-shot, and static manner,
without accounting for the perception, understanding, or response of end-users,
resulting in their poor usability in practice. In this paper, we argue that DNN
interpretability should be implemented as the interactions between users and
models. We present i-Algebra, a first-of-its-kind interactive framework for
interpreting DNNs. At its core is a library of atomic, composable operators,
which explain model behaviors at varying input granularity, during different
inference stages, and from distinct interpretation perspectives. Leveraging a
declarative query language, users are enabled to build various analysis tools
(e.g., "drill-down", "comparative", "what-if" analysis) via flexibly composing
such operators. We prototype i-Algebra and conduct user studies in a set of
representative analysis tasks, including inspecting adversarial inputs,
resolving model inconsistency, and cleansing contaminated data, all
demonstrating its promising usability.
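The paper's operator library and query language are not reproduced in this abstract, so the following is only a rough Python sketch of the general idea of composable interpretation operators (attribution restricted to a user-selected region, comparison across models); every name here (`saliency_op`, `window_op`, `compare_op`) is hypothetical rather than i-Algebra's actual API.

```python
# Hypothetical sketch of composable interpretation operators in the spirit of
# i-Algebra; names and API are illustrative, not the paper's actual ones.
import torch
import torch.nn as nn

def saliency_op(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Atomic operator: gradient-based attribution over the full input."""
    x = x.clone().requires_grad_(True)
    model(x).max(dim=1).values.sum().backward()
    return x.grad.abs()

def window_op(mask: torch.Tensor):
    """Operator factory: restrict attribution to a user-selected input region
    (e.g., an image patch), supporting drill-down analysis."""
    def op(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
        return saliency_op(model, x) * mask
    return op

def compare_op(model_a: nn.Module, model_b: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Comparative analysis: difference between two models' attributions."""
    return saliency_op(model_a, x) - saliency_op(model_b, x)

# Usage: drill down into the top-left quadrant of a toy 3x32x32 input.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.randn(1, 3, 32, 32)
mask = torch.zeros_like(x)
mask[..., :16, :16] = 1.0
region_attr = window_op(mask)(model, x)   # attribution restricted to the patch
```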
Related papers
- Neural Networks Decoded: Targeted and Robust Analysis of Neural Network Decisions via Causal Explanations and Reasoning [9.947555560412397]
We introduce TRACER, a novel method grounded in causal inference theory to estimate the causal dynamics underpinning DNN decisions.
Our approach systematically intervenes on input features to observe how specific changes propagate through the network, affecting internal activations and final outputs.
TRACER further enhances explainability by generating counterfactuals that reveal possible model biases and offer contrastive explanations for misclassifications.
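The summary above does not detail TRACER's causal estimator; the sketch below only illustrates the underlying notion of intervening on individual input features and measuring how the change propagates to the output (the names and the baseline-substitution scheme are assumptions).

```python
# Rough sketch of input-feature intervention (not TRACER's actual estimator):
# replace one feature with a baseline value and measure the output change.
import numpy as np

def intervention_effects(predict, x, baseline):
    """predict: callable mapping a feature vector to a class-probability vector."""
    base_probs = predict(x)
    effects = np.zeros(len(x))
    for i in range(len(x)):
        x_do = x.copy()
        x_do[i] = baseline[i]                       # do(x_i := baseline_i)
        effects[i] = np.abs(predict(x_do) - base_probs).sum()
    return effects  # larger value -> the feature's change propagates more strongly
```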
arXiv Detail & Related papers (2024-10-07T20:44:53Z)
- InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts [31.738009841932374]
Interpretability for neural networks involves a trade-off between three key requirements.
We present InterpretCC, a family of interpretable-by-design neural networks that guarantee human-centric interpretability.
arXiv Detail & Related papers (2024-02-05T11:55:50Z)
- Adversarial Attacks on the Interpretation of Neuron Activation Maximization [70.5472799454224]
Activation-maximization approaches are used to interpret and analyze trained deep-learning models.
In this work, we consider the concept of an adversary manipulating a model for the purpose of deceiving the interpretation.
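For context on the technique being attacked, a bare-bones activation-maximization loop performs gradient ascent on the input to maximize one unit's activation; the sketch below shows that standard procedure only, not the paper's adversarial manipulation of it.

```python
# Plain activation maximization: gradient ascent on the input to maximize one
# output unit's activation (the interpretation technique, not the attack on it).
import torch
import torch.nn as nn

def activation_maximization(model: nn.Module, unit: int, shape=(1, 3, 32, 32),
                            steps: int = 200, lr: float = 0.1) -> torch.Tensor:
    x = torch.zeros(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -model(x)[0, unit]   # negate so that minimizing maximizes the unit
        loss.backward()
        opt.step()
    return x.detach()
```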
arXiv Detail & Related papers (2023-06-12T19:54:33Z)
- Hybrid CNN-Interpreter: Interpret local and global contexts for CNN-based Models [9.148791330175191]
Convolutional neural network (CNN) models have achieved substantial performance improvements across various domains.
However, their lack of interpretability remains a major barrier to assurance and regulation, and hence to the acceptance and deployment of AI-assisted applications.
We propose a novel hybrid CNN-interpreter with two components: an original forward propagation mechanism that examines layer-specific prediction results for local interpretability, and a new global interpretability measure that captures feature correlation and filter importance effects.
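The abstract does not spell out the forward propagation mechanism; one common way to examine layer-specific behavior is to register forward hooks on intermediate layers and summarize per-filter activations, which is the assumption behind this illustrative sketch.

```python
# Hypothetical layer-specific inspection via forward hooks; the actual hybrid
# CNN-interpreter mechanism is not detailed in the abstract above.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)

activations = {}
def save(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for idx, layer in enumerate(model):
    if isinstance(layer, nn.Conv2d):
        layer.register_forward_hook(save(f"conv{idx}"))

_ = model(torch.randn(1, 3, 32, 32))
for name, act in activations.items():
    # Crude per-filter importance proxy: mean absolute activation of each filter map.
    filter_importance = act.abs().mean(dim=(0, 2, 3))
    print(name, filter_importance)
```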
arXiv Detail & Related papers (2022-10-31T22:59:33Z)
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
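The abstract does not state how the teacher network is maintained; self-ensembling is commonly implemented as an exponential-moving-average (EMA) teacher, and the snippet below sketches only that assumed update rule.

```python
# Assumed EMA-style self-ensembling update (the exact scheme is not stated in
# the abstract): teacher weights track a moving average of the student's.
import torch
import torch.nn as nn

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, alpha: float = 0.999):
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)
```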
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
- Combining Discrete Choice Models and Neural Networks through Embeddings: Formulation, Interpretability and Performance [10.57079240576682]
This study proposes a novel approach that combines theory-driven and data-driven choice models using Artificial Neural Networks (ANNs).
In particular, we use continuous vector representations, called embeddings, for encoding categorical or discrete explanatory variables.
Our models deliver state-of-the-art predictive performance, outperforming existing ANN-based models while drastically reducing the number of required network parameters.
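As a minimal illustration of the embedding idea (continuous vectors standing in for categorical explanatory variables, feeding a small choice network), the architecture below is an assumption for exposition, not the paper's specification.

```python
# Minimal sketch: embed a categorical explanatory variable and combine it with
# continuous features in a small choice model (illustrative architecture only).
import torch
import torch.nn as nn

class EmbeddingChoiceModel(nn.Module):
    def __init__(self, n_categories, emb_dim, n_continuous, n_alternatives):
        super().__init__()
        self.embed = nn.Embedding(n_categories, emb_dim)
        self.utility = nn.Linear(emb_dim + n_continuous, n_alternatives)

    def forward(self, cat_ids, cont):
        z = torch.cat([self.embed(cat_ids), cont], dim=-1)
        return torch.softmax(self.utility(z), dim=-1)   # choice probabilities

model = EmbeddingChoiceModel(n_categories=5, emb_dim=3, n_continuous=2, n_alternatives=4)
probs = model(torch.tensor([1, 4]), torch.randn(2, 2))  # batch of two observations
```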
arXiv Detail & Related papers (2021-09-24T15:55:31Z)
- Interpretable Deep Learning: Interpretations, Interpretability, Trustworthiness, and Beyond [49.93153180169685]
We introduce and clarify two basic concepts, interpretations and interpretability, that are often confused.
We elaborate on the design of several recent interpretation algorithms from different perspectives by proposing a new taxonomy.
We summarize existing work on evaluating models' interpretability using "trustworthy" interpretation algorithms.
arXiv Detail & Related papers (2021-03-19T08:40:30Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
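A compact sketch of the general recipe: index the training set's hidden representations and retrieve the nearest training neighbors of a test representation as candidate "responsible" examples; how representations are extracted from the model is left abstract here.

```python
# Sketch: retrieve the training examples nearest to a test example in
# representation space (e.g., a model's final hidden layer) as the candidates
# most responsible for its prediction.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nearest_training_examples(train_reps, test_rep, k=5):
    index = NearestNeighbors(n_neighbors=k, metric="cosine").fit(train_reps)
    distances, indices = index.kneighbors(test_rep.reshape(1, -1))
    return indices[0], distances[0]

train_reps = np.random.randn(1000, 128)          # stand-in hidden representations
idx, dist = nearest_training_examples(train_reps, np.random.randn(128))
```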
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
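A toy sketch of differentiable edge masking: learn a sigmoid gate per edge so that the masked graph still reproduces the original output while a sparsity penalty pushes gates toward zero, and treat edges whose gates collapse as unnecessary. The one-line message-passing layer below is a stand-in, not the paper's GNN or objective.

```python
# Toy differentiable edge masking: learn per-edge gates that keep the model's
# output close to the original while an L1-style penalty prunes edges.
import torch

def gnn_layer(adj, feats, weight):
    # One round of message passing over a weighted adjacency matrix (toy stand-in).
    return torch.relu(adj @ feats @ weight)

n, d = 6, 4
adj = (torch.rand(n, n) > 0.5).float()
feats, weight = torch.randn(n, d), torch.randn(d, d)
original = gnn_layer(adj, feats, weight).detach()

edge_logits = torch.zeros(n, n, requires_grad=True)
opt = torch.optim.Adam([edge_logits], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    mask = torch.sigmoid(edge_logits)
    loss = ((gnn_layer(adj * mask, feats, weight) - original) ** 2).mean() + 0.01 * mask.mean()
    loss.backward()
    opt.step()

kept = (torch.sigmoid(edge_logits) > 0.5) & adj.bool()   # edges judged necessary
```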
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
- An Adversarial Approach for Explaining the Predictions of Deep Neural Networks [9.645196221785694]
We present a novel algorithm for explaining the predictions of a deep neural network (DNN) using adversarial machine learning.
Our approach identifies the relative importance of input features in relation to the predictions based on the behavior of an adversarial attack on the DNN.
Our analysis enables us to produce consistent and efficient explanations.
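As a rough approximation of the idea (not the paper's algorithm), one can read feature importance off the input gradient of the loss, i.e., the direction an FGSM-style adversarial attack would perturb most strongly.

```python
# Rough sketch: use the input gradient of the loss (the direction an FGSM-style
# attack would perturb) as a per-feature importance signal. Illustrative only.
import torch
import torch.nn as nn

def adversarial_importance(model, x, label):
    x = x.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), label)
    loss.backward()
    return x.grad.abs()   # features the attack would perturb most

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
scores = adversarial_importance(model, torch.randn(1, 1, 28, 28), torch.tensor([3]))
```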
arXiv Detail & Related papers (2020-05-20T18:06:53Z)