Interpreting and improving deep-learning models with reality checks
- URL: http://arxiv.org/abs/2108.06847v2
- Date: Thu, 19 Aug 2021 03:55:06 GMT
- Title: Interpreting and improving deep-learning models with reality checks
- Authors: Chandan Singh, Wooseok Ha, and Bin Yu
- Abstract summary: This chapter covers recent work aiming to interpret models by attributing importance to features and feature groups for a single prediction.
We show how these attributions can be used to directly improve the generalization of a neural network or to distill it into a simple model.
- Score: 13.287382944078562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent deep-learning models have achieved impressive predictive performance
by learning complex functions of many variables, often at the cost of
interpretability. This chapter covers recent work aiming to interpret models by
attributing importance to features and feature groups for a single prediction.
Importantly, the proposed attributions assign importance to interactions
between features, in addition to features in isolation. These attributions are
shown to yield insights across real-world domains, including bio-imaging,
cosmology image and natural-language processing. We then show how these
attributions can be used to directly improve the generalization of a neural
network or to distill it into a simple model. Throughout the chapter, we
emphasize the use of reality checks to scrutinize the proposed interpretation
techniques.
Related papers
- A distributional simplicity bias in the learning dynamics of transformers [50.91742043564049]
We show that transformers, trained on natural language data, also display a simplicity bias.
Specifically, they sequentially learn many-body interactions among input tokens, reaching a saturation point in the prediction error for low-degree interactions.
This approach opens up the possibilities of studying how interactions of different orders in the data affect learning, in natural language processing and beyond.
arXiv Detail & Related papers (2024-10-25T15:39:34Z) - Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z) - How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z) - SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z) - Harmonizing Feature Attributions Across Deep Learning Architectures:
Enhancing Interpretability and Consistency [2.2237337682863125]
This study examines the generalization of feature attributions across various deep learning architectures.
We aim to develop a more coherent and optimistic understanding of feature attributions.
Our findings highlight the potential for harmonized feature attribution methods to improve interpretability and foster trust in machine learning applications.
arXiv Detail & Related papers (2023-07-05T09:46:41Z) - An Empirical Investigation of Commonsense Self-Supervision with
Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z) - Structure-Regularized Attention for Deformable Object Representation [17.120035855774344]
Capturing contextual dependencies has proven useful to improve the representational power of deep neural networks.
Recent approaches that focus on modeling global context, such as self-attention and non-local operation, achieve this goal by enabling unconstrained pairwise interactions between elements.
We consider learning representations for deformable objects which can benefit from context exploitation by modeling the structural dependencies that the data intrinsically possesses.
arXiv Detail & Related papers (2021-06-12T03:10:17Z) - Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z) - Refining Neural Networks with Compositional Explanations [31.84868477264624]
We propose to refine a learned model by collecting human-provided compositional explanations on the models' failure cases.
We demonstrate the effectiveness of the proposed approach on two text classification tasks.
arXiv Detail & Related papers (2021-03-18T17:48:54Z) - A Framework to Learn with Interpretation [2.3741312212138896]
We present a novel framework to jointly learn a predictive model and its associated interpretation model.
We seek for a small-size dictionary of high level attribute functions that take as inputs the outputs of selected hidden layers.
A detailed pipeline to visualize the learnt features is also developed.
arXiv Detail & Related papers (2020-10-19T09:26:28Z) - Deducing neighborhoods of classes from a fitted model [68.8204255655161]
In this article a new kind of interpretable machine learning method is presented.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Basically, real data points (or specific points of interest) are used and the changes of the prediction after slightly raising or decreasing specific features are observed.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.