Obtaining Faithful Interpretations from Compositional Neural Networks
- URL: http://arxiv.org/abs/2005.00724v2
- Date: Tue, 8 Sep 2020 15:52:28 GMT
- Title: Obtaining Faithful Interpretations from Compositional Neural Networks
- Authors: Sanjay Subramanian, Ben Bogin, Nitish Gupta, Tomer Wolfson, Sameer
Singh, Jonathan Berant, Matt Gardner
- Abstract summary: We evaluate the intermediate outputs of NMNs on the NLVR2 and DROP datasets.
We find that the intermediate outputs differ from the expected output, illustrating that the network structure does not provide a faithful explanation of model behaviour.
- Score: 72.41100663462191
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural module networks (NMNs) are a popular approach for modeling
compositionality: they achieve high accuracy when applied to problems in
language and vision, while reflecting the compositional structure of the
problem in the network architecture. However, prior work implicitly assumed
that the structure of the network modules, describing the abstract reasoning
process, provides a faithful explanation of the model's reasoning; that is,
that all modules perform their intended behaviour. In this work, we propose and
conduct a systematic evaluation of the intermediate outputs of NMNs on NLVR2
and DROP, two datasets which require composing multiple reasoning steps. We
find that the intermediate outputs differ from the expected output,
illustrating that the network structure does not provide a faithful explanation
of model behaviour. To remedy that, we train the model with auxiliary
supervision and propose particular choices for module architecture that yield
much better faithfulness, at a minimal cost to accuracy.
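The evaluation the abstract describes can be stated concretely: execute the module program step by step, record each module's intermediate output, compare it against the output the module is expected to produce, and optionally add an auxiliary loss on those intermediates. The sketch below illustrates this in PyTorch under stated assumptions; the module set (find, filter, count), the toy object features, the gold attention `gold_attn`, and the weight `aux_weight` are all hypothetical and are not the paper's implementation.

```python
# Minimal sketch (not the authors' code) of faithfulness evaluation for an NMN:
# run a module program, keep every intermediate output, and score each one
# against a hypothetical gold annotation of the intended module behaviour.
import torch
import torch.nn.functional as F

def find(features, query):
    """Attend over object features given a query vector -> attention in (0, 1)."""
    return torch.sigmoid(features @ query)

def filter_(features, query, attn):
    """Refine an upstream attention with a second query (module composition)."""
    return attn * torch.sigmoid(features @ query)

def count(attn):
    """Soft count: total attention mass over objects."""
    return attn.sum()

def run_program(features, q_find, q_filter):
    """Execute count(filter(find(x))) and log each intermediate output."""
    trace = {}
    trace["find"] = find(features, q_find)
    trace["filter"] = filter_(features, q_filter, trace["find"])
    trace["count"] = count(trace["filter"])
    return trace

torch.manual_seed(0)
features = torch.randn(5, 8)            # 5 objects, 8-dim features (toy data)
q_find, q_filter = torch.randn(8), torch.randn(8)
gold_attn = torch.tensor([1., 0., 1., 0., 0.])  # assumed gold module output

trace = run_program(features, q_find, q_filter)

# Faithfulness check: does the intermediate output match the expected one?
faithfulness_err = F.l1_loss(trace["find"], gold_attn)
print(f"find-module L1 error vs. gold: {faithfulness_err:.3f}")

# Auxiliary supervision (in the spirit of the abstract): a loss on the
# intermediate output pushes the module toward its intended behaviour.
task_loss = (trace["count"] - gold_attn.sum()) ** 2
aux_weight = 0.5                        # assumed hyperparameter
total_loss = task_loss + aux_weight * F.binary_cross_entropy(trace["find"], gold_attn)
```

In this toy setup, a small gap between a module's attention and the gold annotation is what "faithful" means operationally; the auxiliary cross-entropy term is one way to encourage intended module behaviour while leaving the end-task loss largely untouched.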
Related papers
- Neural Networks Decoded: Targeted and Robust Analysis of Neural Network Decisions via Causal Explanations and Reasoning [9.947555560412397]
We introduce TRACER, a novel method grounded in causal inference theory to estimate the causal dynamics underpinning DNN decisions.
Our approach systematically intervenes on input features to observe how specific changes propagate through the network, affecting internal activations and final outputs.
TRACER further enhances explainability by generating counterfactuals that reveal possible model biases and offer contrastive explanations for misclassifications.
arXiv Detail & Related papers (2024-10-07T20:44:53Z) - Jointly-Learned Exit and Inference for a Dynamic Neural Network: JEI-DNN [20.380620709345898]
Early-exit dynamic neural networks (EDNNs) allow a model to make some of its predictions from intermediate layers (i.e., to exit early).
Training an EDNN architecture is challenging as it consists of two intertwined components: the gating mechanism (GM) that controls early-exiting decisions and the intermediate inference modules (IMs) that perform inference from intermediate representations.
We propose a novel architecture that connects these two modules. This leads to significant performance improvements on classification datasets and enables better uncertainty characterization capabilities.
arXiv Detail & Related papers (2023-10-13T14:56:38Z) - DProtoNet: Decoupling the inference module and the explanation module
enables neural networks to have better accuracy and interpretability [5.333582981327497]
Previous methods modify the architecture of the neural network so that the network simulates the human reasoning process.
We propose DProtoNet (Decoupling Prototypical network), which stores the decision basis of the neural network using feature masks.
It decouples the neural network inference module from the interpretation module, and removes the specific architectural limitations of the interpretable network.
arXiv Detail & Related papers (2022-10-15T17:05:55Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - Modeling Structure with Undirected Neural Networks [20.506232306308977]
We propose undirected neural networks, a flexible framework for specifying computations that can be performed in any order.
We demonstrate the effectiveness of undirected neural architectures, both unstructured and structured, on a range of tasks.
arXiv Detail & Related papers (2022-02-08T10:06:51Z) - Latent Network Embedding via Adversarial Auto-encoders [15.656374849760734]
We propose a latent network embedding model based on adversarial graph auto-encoders.
Under this framework, the problem of discovering latent structures is formulated as inferring the latent ties from partial observations.
arXiv Detail & Related papers (2021-09-30T16:49:46Z) - Paired Examples as Indirect Supervision in Latent Decision Models [109.76417071249945]
We introduce a way to leverage paired examples that provide stronger cues for learning latent decisions.
We apply our method to improve compositional question answering using neural module networks on the DROP dataset.
arXiv Detail & Related papers (2021-04-05T03:58:30Z) - Neural Function Modules with Sparse Arguments: A Dynamic Approach to
Integrating Information across Layers [84.57980167400513]
Most prior work on feed-forward networks that combine top-down and bottom-up feedback is limited to classification problems.
Neural Function Modules (NFM) aim to introduce the same structural capability into deep learning.
The key contribution of our work is to combine attention, sparsity, and top-down and bottom-up feedback in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z) - Visual Concept Reasoning Networks [93.99840807973546]
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.
We propose to exploit this strategy and combine it with our Visual Concept Reasoning Networks (VCRNet) to enable reasoning between high-level visual concepts.
Our proposed model, VCRNet, consistently improves the performance by increasing the number of parameters by less than 1%.
arXiv Detail & Related papers (2020-08-26T20:02:40Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards dynamic architectures that are capable of simultaneously exploiting both modular and temporal structures.
We find our models to be robust to the number of available views and better at generalizing to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)