Using Interpretation Methods for Model Enhancement
- URL: http://arxiv.org/abs/2404.02068v1
- Date: Tue, 2 Apr 2024 16:10:29 GMT
- Title: Using Interpretation Methods for Model Enhancement
- Authors: Zhuo Chen, Chengyue Jiang, Kewei Tu
- Abstract summary: We propose a framework for utilizing interpretation methods and gold rationales to enhance models.
Our framework is very general in the sense that it can incorporate various interpretation methods.
Experimental results show that our framework is effective, especially in low-resource settings.
- Score: 44.29399911722625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the age of neural natural language processing, many works attempt to derive interpretations of neural models. Intuitively, when gold rationales exist during training, one can additionally train the model to match its interpretation with the rationales. However, this intuitive idea has not been fully explored. In this paper, we propose a framework for utilizing interpretation methods and gold rationales to enhance models. Our framework is very general in the sense that it can incorporate various interpretation methods. Previously proposed gradient-based methods can be shown to be instances of our framework. We also propose two novel instances utilizing two other types of interpretation methods, erasure/replace-based and extractor-based methods, for model enhancement. We conduct comprehensive experiments on a variety of tasks. Experimental results show that our framework is effective, especially in low-resource settings, in enhancing models with various interpretation methods, and our two newly proposed methods outperform gradient-based methods in most settings. Code is available at https://github.com/Chord-Chen-30/UIMER.
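To make the core idea concrete, below is a minimal, illustrative sketch (in PyTorch) of one possible instance of such a framework: an auxiliary loss that pushes a gradient-based token saliency map toward a gold rationale mask. This is not the authors' UIMER implementation (see the linked repository); the toy model, the gradient-times-input saliency choice, and the loss weighting are assumptions made purely for illustration.

```python
# Minimal sketch (not the authors' UIMER code; see the linked repository).
# Idea: alongside the task loss, penalize disagreement between a gradient-based
# token saliency map and a gold rationale mask. All names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyClassifier(nn.Module):
    """Toy bag-of-embeddings classifier standing in for a real NLP model."""

    def __init__(self, vocab_size=1000, dim=64, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.fc = nn.Linear(dim, num_classes)

    def forward(self, token_ids):
        embedded = self.emb(token_ids)               # (batch, seq, dim)
        return self.fc(embedded.mean(dim=1)), embedded


def gradient_saliency(logits, embedded, labels):
    """Gradient-times-input saliency per token, kept in the graph so the
    alignment loss can itself be backpropagated."""
    picked = logits.gather(1, labels.unsqueeze(1)).sum()
    grads = torch.autograd.grad(picked, embedded, create_graph=True)[0]
    return (grads * embedded).sum(dim=-1).abs()      # (batch, seq)


def training_step(model, optimizer, token_ids, labels, rationale_mask, lam=0.5):
    """rationale_mask: 1.0 for gold-rationale tokens, 0.0 otherwise."""
    logits, embedded = model(token_ids)
    task_loss = F.cross_entropy(logits, labels)

    # Normalize both distributions, then penalize their mismatch.
    saliency = gradient_saliency(logits, embedded, labels)
    saliency = saliency / (saliency.sum(dim=1, keepdim=True) + 1e-8)
    target = rationale_mask / (rationale_mask.sum(dim=1, keepdim=True) + 1e-8)
    align_loss = F.mse_loss(saliency, target)

    loss = task_loss + lam * align_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyClassifier()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    ids = torch.randint(0, 1000, (4, 10))
    labels = torch.randint(0, 2, (4,))
    rationale = (torch.rand(4, 10) > 0.7).float()
    print(training_step(model, opt, ids, labels, rationale))
```

Because the saliency is itself a gradient, the alignment term introduces a second-order gradient during training, which is why `create_graph=True` is needed in the saliency computation.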
Related papers
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Explaining by Removing: A Unified Framework for Model Explanation [14.50261153230204]
Removal-based explanations are based on the principle of simulating feature removal to quantify each feature's influence.
We develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence.
This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by the explainability literature.
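As a rough illustration of the removal principle summarized above (not the paper's framework or API; the zero baseline and all names are assumptions), each feature can be scored by how much the model's output changes when that feature is replaced with a baseline value:

```python
# Illustrative-only sketch of "explaining by removing": score each feature by
# the output drop when it is removed (here, zeroed out). Names are assumptions.
import torch


def removal_influence(model, x, baseline=0.0):
    """Influence of each input feature, measured by output change on removal."""
    with torch.no_grad():
        full = model(x)                          # scalar output per example
        scores = torch.zeros_like(x)
        for j in range(x.shape[-1]):
            ablated = x.clone()
            ablated[..., j] = baseline           # "remove" feature j
            scores[..., j] = full - model(ablated)
    return scores


if __name__ == "__main__":
    torch.manual_seed(0)
    linear = torch.nn.Linear(5, 1)
    model = lambda t: linear(t).squeeze(-1)
    x = torch.randn(2, 5)
    print(removal_influence(model, x))
```

The choice of `baseline` here corresponds to the framework's first dimension (how features are removed); replacing features by marginalizing over a data distribution rather than zeroing them is another choice covered by that dimension.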
arXiv Detail & Related papers (2020-11-21T00:47:48Z)
- Feature Removal Is a Unifying Principle for Model Explanation Methods [14.50261153230204]
We examine the literature and find that many methods are based on a shared principle of explaining by removing.
We develop a framework for removal-based explanations that characterizes each method along three dimensions.
Our framework unifies 26 existing methods, including several of the most widely used approaches.
arXiv Detail & Related papers (2020-11-06T22:37:55Z)
- Mutual Modality Learning for Video Action Classification [74.83718206963579]
We show how to embed multi-modality into a single model for video action classification.
We achieve state-of-the-art results in the Something-Something-v2 benchmark.
arXiv Detail & Related papers (2020-11-04T21:20:08Z)
- Evaluating the Disentanglement of Deep Generative Models through Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z)
- There and Back Again: Revisiting Backpropagation Saliency Methods [87.40330595283969]
Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
A popular class of such methods is based on backpropagating a signal and analyzing the resulting gradient.
We propose a single framework under which several such methods can be unified.
arXiv Detail & Related papers (2020-04-06T17:58:08Z)
- LIMEADE: From AI Explanations to Advice Taking [34.581205516506614]
We introduce LIMEADE, the first framework that translates both positive and negative advice into an update to an arbitrary, underlying opaque model.
We show that our method improves accuracy compared to a rigorous baseline in image classification domains.
For the text modality, we apply our framework to a neural recommender system for scientific papers on a public website.
arXiv Detail & Related papers (2020-03-09T18:00:00Z)
- How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context [59.13515950353125]
We present a grammar-based decoding semantic parser and adapt typical context modeling methods on top of it.
We evaluate 13 context modeling methods on two large cross-domain datasets, and our best model achieves state-of-the-art performances.
arXiv Detail & Related papers (2020-02-03T11:28:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.