Analogies and Feature Attributions for Model Agnostic Explanation of
Similarity Learners
- URL: http://arxiv.org/abs/2202.01153v1
- Date: Wed, 2 Feb 2022 17:28:56 GMT
- Title: Analogies and Feature Attributions for Model Agnostic Explanation of
Similarity Learners
- Authors: Karthikeyan Natesan Ramamurthy, Amit Dhurandhar, Dennis Wei, Zaid Bin
Tariq
- Abstract summary: We propose a method that provides feature attributions to explain the similarity between a pair of inputs as determined by a black box similarity learner.
We also propose analogies as a new form of explanation: the goal is to identify diverse analogous pairs of examples that share the same level of similarity as the input pair.
We prove that our analogy objective function is submodular, making the search for good-quality analogies efficient.
- Score: 29.63747822793279
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-hoc explanations for black box models have been studied extensively in
classification and regression settings. However, explanations for models that
output similarity between two inputs have received comparatively less
attention. In this paper, we provide model agnostic local explanations for
similarity learners applicable to tabular and text data. We first propose a
method that provides feature attributions to explain the similarity between a
pair of inputs as determined by a black box similarity learner. We then propose
analogies as a new form of explanation in machine learning. Here the goal is to
identify diverse analogous pairs of examples that share the same level of
similarity as the input pair and provide insight into (latent) factors
underlying the model's prediction. The selection of analogies can optionally
leverage feature attributions, thus connecting the two forms of explanation
while still maintaining complementarity. We prove that our analogy objective
function is submodular, making the search for good-quality analogies efficient.
We apply the proposed approaches to explain similarities between sentences as
predicted by a state-of-the-art sentence encoder, and between patients in a
healthcare utilization application. Efficacy is measured through quantitative
evaluations, a careful user study, and examples of explanations.
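The first contribution adapts feature attribution to the pairwise setting. As a rough illustration of the general idea (a minimal sketch, not the authors' exact scheme; the masking strategy and the function name `similarity_attributions` are assumptions), one can perturb each feature of the input pair with background values and score the average drop in the black-box similarity:

```python
import numpy as np

def similarity_attributions(sim, x, x_prime, background, n_samples=64, seed=0):
    """Toy perturbation-based attribution for a black-box similarity sim(x, x').

    Replaces each feature (in both inputs) with values drawn from a background
    dataset and records the average drop in similarity. Illustrative baseline,
    not the paper's exact method.
    """
    rng = np.random.default_rng(seed)
    base = sim(x, x_prime)
    attributions = np.zeros(len(x))
    for i in range(len(x)):
        drops = []
        for _ in range(n_samples):
            b = background[rng.integers(len(background))]
            xm, xpm = x.copy(), x_prime.copy()
            xm[i], xpm[i] = b[i], b[i]      # mask feature i in both inputs
            drops.append(base - sim(xm, xpm))
        attributions[i] = float(np.mean(drops))
    return attributions  # large positive value => feature i supports the similarity

# Toy usage with cosine similarity standing in for the similarity learner:
cos = lambda a, b: float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
rng = np.random.default_rng(1)
bg = rng.normal(size=(100, 5))
print(similarity_attributions(cos, rng.normal(size=5), rng.normal(size=5), bg))
```

Any callable `sim(a, b)` would do in place of cosine similarity; the method never inspects the model's internals.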
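The submodularity result is what makes the analogy search efficient: for a monotone submodular objective, the classic greedy algorithm of Nemhauser, Wolsey, and Fisher retains a (1 - 1/e) approximation guarantee. The objective below, a similarity-matching term plus a facility-location diversity term, is an illustrative stand-in for the paper's actual formulation, not a reproduction of it:

```python
import numpy as np

def greedy_analogies(candidates, sim, query_pair, k, lam=1.0):
    """Greedily select k analogous pairs under a toy monotone submodular objective.

    candidates : list of (a, b) pairs of feature vectors
    sim        : black-box similarity, sim(a, b) -> float
    query_pair : the (x, x') pair whose similarity level we want to mirror
    """
    s_q = sim(*query_pair)
    # Modular term: reward candidates whose similarity matches the query's.
    match = np.array([np.exp(-abs(sim(a, b) - s_q)) for a, b in candidates])

    # Affinity between two candidate pairs, used by the diversity term.
    def affinity(p, q):
        u, v = np.concatenate(p), np.concatenate(q)
        return float(np.exp(-np.linalg.norm(u - v) ** 2))

    A = np.array([[affinity(p, q) for q in candidates] for p in candidates])

    def objective(S):
        # Facility-location coverage (monotone submodular) plus a
        # nonnegative modular match term; their sum stays submodular.
        return A[:, S].max(axis=1).sum() + lam * match[S].sum()

    selected, remaining = [], set(range(len(candidates)))
    for _ in range(k):
        best = max(remaining, key=lambda i: objective(selected + [i]))
        selected.append(best)
        remaining.discard(best)
    return [candidates[i] for i in selected]

# Toy usage with cosine similarity as the black box:
cos = lambda a, b: float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
rng = np.random.default_rng(0)
pool = [(rng.normal(size=4), rng.normal(size=4)) for _ in range(50)]
query = (rng.normal(size=4), rng.normal(size=4))
analogies = greedy_analogies(pool, cos, query, k=3)
```

A lazy-greedy variant would avoid re-evaluating unchanged marginal gains, but plain greedy keeps the sketch short.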
Related papers
- Rethinking Distance Metrics for Counterfactual Explainability [53.436414009687]
We investigate a framing for counterfactual generation methods that considers counterfactuals not as independent draws from a region around the reference, but as jointly sampled with the reference from the underlying data distribution.
We derive a distance metric tailored for counterfactual similarity that can be applied to a broad range of settings.
arXiv Detail & Related papers (2024-10-18T15:06:50Z)
- On the Information Content of Predictions in Word Analogy Tests [0.0]
An approach is proposed to quantify, in bits of information, the actual relevance of analogies in analogy tests.
The main component of this approach is a soft-accuracy estimator that also yields entropy estimates with compensated biases.
arXiv Detail & Related papers (2022-10-18T16:32:25Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates whether the model's prediction on the counterfactual is consistent with the expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representations into a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
- Toward Scalable and Unified Example-based Explanation and Outlier Detection [128.23117182137418]
We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that prototype-based networks that go beyond similarity kernels deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
arXiv Detail & Related papers (2020-11-11T05:58:17Z)
- Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End [17.226134854746267]
We present a method to generate feature attribution explanations from a set of counterfactual examples.
We show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency; a simplified sketch of deriving attributions from counterfactuals appears after this list.
arXiv Detail & Related papers (2020-11-10T05:41:43Z)
- Few-shot Visual Reasoning with Meta-analogical Contrastive Learning [141.2562447971]
We propose to solve a few-shot (or low-shot) visual reasoning problem, by resorting to analogical reasoning.
We extract structural relationships between elements in both domains, and enforce them to be as similar as possible with analogical learning.
We validate our method on the RAVEN dataset, on which it outperforms state-of-the-art methods, with larger gains when training data is scarce.
arXiv Detail & Related papers (2020-07-23T14:00:34Z)
- Towards Analogy-Based Explanations in Machine Learning [3.1410342959104725]
We argue that analogical reasoning is no less interesting from an interpretability and explainability point of view.
An analogy-based approach is a viable alternative to existing approaches in the realm of explainable AI and interpretable machine learning.
arXiv Detail & Related papers (2020-05-23T06:41:35Z)
- Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? [97.77183117452235]
We carry out human subject tests to isolate the effect of algorithmic explanations on model interpretability.
Clear evidence of method effectiveness is found in very few cases.
Our results provide the first reliable and comprehensive estimates of how explanations influence simulatability.
arXiv Detail & Related papers (2020-05-04T20:35:17Z)
- Building and Interpreting Deep Similarity Models [0.0]
We propose to make similarities interpretable by augmenting them with an explanation in terms of input features.
We develop BiLRP, a scalable and theoretically founded method to systematically decompose similarity scores onto pairs of input features.
arXiv Detail & Related papers (2020-03-11T17:46:55Z)
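The BiLRP entry above decomposes similarity scores onto pairs of input features. For a purely bilinear similarity the decomposition has a closed form, which this toy sketch illustrates; BiLRP itself extends the second-order view to deep embeddings via layer-wise relevance propagation, so the code below is a simplified stand-in, not the method:

```python
import numpy as np

# For s(x, x') = (W x)^T (W x') = sum_{i,j} x_i * (W^T W)_{ij} * x'_j,
# each feature pair (i, j) receives relevance R[i, j].
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))                  # embedding map: 6 features -> 4 dims
x, x_prime = rng.normal(size=6), rng.normal(size=6)

R = np.outer(x, x_prime) * (W.T @ W)         # relevance of each feature pair (i, j)

# Sanity check: the pairwise relevances sum back to the similarity score.
assert np.isclose(R.sum(), (W @ x) @ (W @ x_prime))
```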
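Likewise, for the "Towards Unifying Feature Attribution and Counterfactual Explanations" entry, a minimal version of deriving attributions from counterfactuals (an assumed scoring rule, not that paper's exact definition) scores each feature by how often it changes across counterfactual examples:

```python
import numpy as np

def attributions_from_counterfactuals(x, counterfactuals, tol=1e-8):
    """Score feature i by the fraction of counterfactual examples in which
    it differs from the original input x (a rough proxy for necessity).

    x               : original input, shape (d,)
    counterfactuals : counterfactual examples, shape (m, d), each of which
                      flips the model's prediction
    """
    changed = np.abs(counterfactuals - x) > tol   # (m, d) boolean change mask
    return changed.mean(axis=0)                   # per-feature change rate

# Toy usage: features that change in most counterfactuals score highest.
x = np.array([1.0, 0.0, 3.0])
cfs = np.array([[1.0, 1.0, 3.0],
                [1.0, 1.0, 2.5],
                [0.5, 1.0, 3.0]])
print(attributions_from_counterfactuals(x, cfs))  # -> [0.333..., 1.0, 0.333...]
```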