How do Decisions Emerge across Layers in Neural Models? Interpretation
with Differentiable Masking
- URL: http://arxiv.org/abs/2004.14992v3
- Date: Tue, 2 Mar 2021 10:12:19 GMT
- Title: How do Decisions Emerge across Layers in Neural Models? Interpretation
with Differentiable Masking
- Authors: Nicola De Cao, Michael Schlichtkrull, Wilker Aziz, Ivan Titov
- Abstract summary: DiffMask learns to mask-out subsets of the input while maintaining differentiability.
The decision to include or disregard an input token is made with a simple model based on intermediate hidden layers.
This lets us not only plot attribution heatmaps but also analyze how decisions are formed across network layers.
- Score: 70.92463223410225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attribution methods assess the contribution of inputs to the model
prediction. One way to do so is erasure: a subset of inputs is considered
irrelevant if it can be removed without affecting the prediction. Though
conceptually simple, erasure's objective is intractable and approximate search
remains expensive with modern deep NLP models. Erasure is also susceptible to
the hindsight bias: the fact that an input can be dropped does not mean that
the model 'knows' it can be dropped. The resulting pruning is over-aggressive
and does not reflect how the model arrives at the prediction. To deal with
these challenges, we introduce Differentiable Masking. DiffMask learns to
mask-out subsets of the input while maintaining differentiability. The decision
to include or disregard an input token is made with a simple model based on
intermediate hidden layers of the analyzed model. First, this makes the
approach efficient because we predict rather than search. Second, as with
probing classifiers, this reveals what the network 'knows' at the corresponding
layers. This lets us not only plot attribution heatmaps but also analyze how
decisions are formed across network layers. We use DiffMask to study BERT
models on sentiment classification and question answering.
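The abstract describes the mechanism only at a high level. Below is a minimal, hypothetical PyTorch sketch of that idea, not the authors' implementation: names such as `GatePredictor`, `mask_inputs`, and `diffmask_loss` are illustrative, the plain sigmoid gate stands in for the Hard Concrete-style stochastic gates used in the paper, and the simple sparsity penalty stands in for its constrained (Lagrangian) objective.

```python
# Minimal sketch (not the authors' implementation): a shallow probe over an
# intermediate hidden state predicts a soft gate per token; gated-out tokens
# are replaced by a learned baseline vector, so the masking decision stays
# differentiable and can be trained to keep the original prediction while
# masking as much of the input as possible.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatePredictor(nn.Module):
    """Shallow probe mapping a hidden state to a per-token gate in (0, 1)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.Tanh(), nn.Linear(hidden_size, 1)
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size)
        logits = self.scorer(hidden_states).squeeze(-1)  # (batch, seq_len)
        return torch.sigmoid(logits)                     # soft gates in (0, 1)


def mask_inputs(input_embeds, gates, baseline):
    # Convex combination of the real embedding and a learned baseline:
    # gate = 1 keeps the token, gate = 0 replaces it with the baseline.
    return gates.unsqueeze(-1) * input_embeds + (1 - gates.unsqueeze(-1)) * baseline


def diffmask_loss(orig_logits, masked_logits, gates, sparsity_weight=1.0):
    # Illustrative objective: keep the masked prediction close to the original
    # one (KL term) while encouraging gates to be near zero (sparsity term).
    kl = F.kl_div(
        F.log_softmax(masked_logits, dim=-1),
        F.softmax(orig_logits, dim=-1),
        reduction="batchmean",
    )
    return kl + sparsity_weight * gates.mean()
```

In this setup the analyzed model stays frozen; only the probe and the baseline vector are trained, so what can be masked reflects what the chosen hidden layer already encodes.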
Related papers
- SMOOT: Saliency Guided Mask Optimized Online Training [3.024318849346373]
Saliency-Guided Training (SGT) methods try to highlight, during training, the features that are most prominent for the model's output.
SGT makes the model's final result more interpretable by partially masking the input.
We propose a novel method to determine the optimal number of masked images based on the input, accuracy, and model loss during training.
arXiv Detail & Related papers (2023-10-01T19:41:49Z)
- Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts [119.22672589020394]
We propose COnfidence-baSed MOdel Selection (CosMoS), where model confidence can effectively guide model selection.
We evaluate CosMoS on four datasets with spurious correlations, each with multiple test sets with varying levels of data distribution shift.
arXiv Detail & Related papers (2023-06-19T18:48:15Z)
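The CosMoS entry above states only that confidence can guide model selection. As a rough, hedged illustration (not the authors' algorithm), the sketch below routes each input to whichever candidate model is most confident under maximum softmax probability; `select_by_confidence` is an invented name.

```python
# Hypothetical illustration of confidence-based model selection: for each
# input, route to whichever candidate model assigns the highest maximum
# softmax probability. This is a generic stand-in, not the CosMoS method.
import torch
import torch.nn.functional as F


@torch.no_grad()
def select_by_confidence(models, x):
    # models: list of classifiers mapping x -> logits of shape (batch, classes)
    probs = [F.softmax(m(x), dim=-1) for m in models]
    confidences = torch.stack([p.max(dim=-1).values for p in probs], dim=0)  # (n_models, batch)
    chosen = confidences.argmax(dim=0)                                        # model index per example
    stacked = torch.stack(probs, dim=0)                                       # (n_models, batch, classes)
    batch_idx = torch.arange(x.shape[0], device=chosen.device)
    return stacked[chosen, batch_idx], chosen  # selected probabilities and chosen model ids
```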
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs, which we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps [85.49020931411825]
Compression of Convolutional Neural Networks (CNNs) is crucial for deploying these models on edge devices with limited resources.
We propose to address the channel pruning problem from a novel perspective by leveraging the interpretations of a model to steer the pruning process.
We tackle this challenge by introducing a selector model that predicts real-time smooth saliency masks for pruned models.
arXiv Detail & Related papers (2022-09-07T01:12:11Z)
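For the interpretation-steered pruning entry above, here is a hedged sketch of the general idea of using saliency to guide channel pruning. The paper instead trains an amortized selector that predicts smooth saliency masks, so the gradient-times-activation proxy and the function names below are purely illustrative.

```python
# Hedged illustration of interpretation-guided channel pruning: score the
# channels of one convolutional layer with a simple gradient-times-activation
# saliency proxy and keep only the top-k channels.
import torch


def channel_saliency(conv_out: torch.Tensor, loss: torch.Tensor) -> torch.Tensor:
    # conv_out: (batch, channels, H, W), part of the graph that produced `loss`
    grads = torch.autograd.grad(loss, conv_out, retain_graph=True)[0]
    return (conv_out * grads).abs().mean(dim=(0, 2, 3))  # one score per channel


def prune_mask(saliency: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    k = max(1, int(keep_ratio * saliency.numel()))
    topk = torch.topk(saliency, k).indices
    mask = torch.zeros_like(saliency)
    mask[topk] = 1.0
    return mask  # multiply feature maps by mask[None, :, None, None] to prune
```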
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim to improve data efficiency for both classification and regression setups in deep learning.
To combine the strengths of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Thought Flow Nets: From Single Predictions to Trains of Model Thought [39.619001911390804]
When humans solve complex problems, they rarely come up with a decision right away.
Instead, they start with an intuitive decision, reflect upon it, spot mistakes, resolve contradictions, and jump between different hypotheses.
arXiv Detail & Related papers (2021-07-26T13:56:37Z)
- PointMask: Towards Interpretable and Bias-Resilient Point Cloud Processing [16.470806722781333]
PointMask is a model-agnostic interpretable information-bottleneck approach for attribution in point cloud models.
We show that coupling a PointMask layer with an arbitrary model can discern the points in the input space which contribute the most to the prediction score.
arXiv Detail & Related papers (2020-07-09T03:06:06Z)
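The PointMask entry above describes an information-bottleneck-style attribution layer for point clouds. The sketch below is a hypothetical reading of that idea: the class name `PointGate` and the plain sigmoid gating with a mean-gate sparsity penalty are simplifications, not the paper's formulation.

```python
# Hypothetical sketch of a PointMask-style layer: a small MLP scores each
# point, a sigmoid gate suppresses low-scoring points, and a sparsity penalty
# pushes most gates toward zero so that the surviving points indicate which
# inputs drive the prediction.
import torch
import torch.nn as nn


class PointGate(nn.Module):
    def __init__(self, point_dim: int = 3, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(point_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, points: torch.Tensor):
        # points: (batch, n_points, point_dim)
        gates = torch.sigmoid(self.mlp(points)).squeeze(-1)  # (batch, n_points)
        masked = points * gates.unsqueeze(-1)                # suppress low-scoring points
        sparsity = gates.mean()                              # add to the task loss as a penalty
        return masked, gates, sparsity
```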
- Differentiable Language Model Adversarial Attacks on Categorical Sequence Classifiers [0.0]
An adversarial attack paradigm explores various scenarios for the vulnerability of deep learning models.
We fine-tune a language model to serve as a generator of adversarial examples.
Our model works for diverse datasets on bank transactions, electronic health records, and NLP datasets.
arXiv Detail & Related papers (2020-06-19T11:25:36Z)
- Auditing and Debugging Deep Learning Models via Decision Boundaries: Individual-level and Group-level Analysis [0.0]
We use flip points to explain, audit, and debug deep learning models.
A flip point is any point that lies on the boundary between two output classes.
We demonstrate our methods by investigating several models trained on standard datasets used in social applications of machine learning.
arXiv Detail & Related papers (2020-01-03T01:45:36Z)
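For the flip-point entry above, the following is a minimal sketch of approximating a flip point by bisecting the segment between two inputs that the model classifies differently; the paper locates flip points more precisely via optimization, so `approximate_flip_point` is only an illustrative stand-in.

```python
# Hedged sketch of locating an approximate flip point: bisect along the
# straight line joining two differently classified inputs until the predicted
# class changes, yielding a point close to the decision boundary.
import torch


@torch.no_grad()
def approximate_flip_point(model, x_a, x_b, steps: int = 50):
    """x_a and x_b are single inputs (no batch dim) with different predictions."""
    label_a = model(x_a.unsqueeze(0)).argmax(dim=-1)
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        x_mid = (1 - mid) * x_a + mid * x_b
        if model(x_mid.unsqueeze(0)).argmax(dim=-1) == label_a:
            lo = mid  # still on x_a's side of the boundary
        else:
            hi = mid  # crossed into the other class
    return (1 - hi) * x_a + hi * x_b  # point (approximately) on the boundary
```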
This list is automatically generated from the titles and abstracts of the papers on this site.