Cross-Model Consensus of Explanations and Beyond for Image
Classification Models: An Empirical Study
- URL: http://arxiv.org/abs/2109.00707v1
- Date: Thu, 2 Sep 2021 04:50:45 GMT
- Title: Cross-Model Consensus of Explanations and Beyond for Image
Classification Models: An Empirical Study
- Authors: Xuhong Li, Haoyi Xiong, Siyu Huang, Shilei Ji, Dejing Dou
- Abstract summary: Among different sets of features, some common features might be used by the majority of models.
We propose the cross-model consensus of explanations to capture the common features.
We conduct extensive experiments using 80+ models on 5 datasets/tasks.
- Score: 34.672716006357675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing interpretation algorithms have found that, even when deep
models make the same correct predictions on the same image, they may rely on
different sets of input features for classification. However, among these sets
of features, some common features might be used by the majority of models. In
this paper, we ask which common features are used by various models for
classification and whether models with better performance favor those common
features. For this purpose, our work uses an interpretation algorithm
to attribute the importance of features (e.g., pixels or superpixels) as
explanations, and proposes the cross-model consensus of explanations to capture
the common features. Specifically, we first prepare a set of deep models as a
committee, then deduce the explanation for every model, and obtain the
consensus of explanations across the entire committee through voting. With the
cross-model consensus of explanations, we conduct extensive experiments using
80+ models on 5 datasets/tasks. We find three interesting phenomena as follows:
(1) the consensus obtained from image classification models is aligned with the
ground truth of semantic segmentation; (2) we measure the similarity of the
explanation result of each model in the committee to the consensus (namely
consensus score), and find positive correlations between the consensus score
and model performance; and (3) the consensus score coincidentally correlates with
the interpretability.
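A minimal Python sketch of the committee-and-voting procedure described above is given below. It assumes per-model saliency maps have already been computed by some interpretation algorithm, uses plain averaging as the voting rule, and uses cosine similarity as the consensus score; these are illustrative assumptions, not necessarily the paper's exact choices.
```python
import numpy as np

def consensus_of_explanations(saliency_maps):
    """Vote across the committee by averaging per-model saliency maps.

    saliency_maps: array of shape (num_models, H, W), one attribution map
    per committee member for the same image. Plain averaging is an assumed
    voting rule; the paper's aggregation may differ.
    """
    return saliency_maps.mean(axis=0)

def consensus_score(saliency_map, consensus):
    """Similarity of one model's explanation to the consensus.

    Cosine similarity over flattened maps is an assumed similarity measure.
    """
    a, b = saliency_map.ravel(), consensus.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy usage: a committee of three "models" explaining one 4x4 image.
rng = np.random.default_rng(0)
maps = rng.random((3, 4, 4))             # stand-in for real attribution maps
cons = consensus_of_explanations(maps)
scores = [consensus_score(m, cons) for m in maps]
print(scores)                            # one consensus score per member
```
Under the paper's reported findings, committee members with higher consensus scores would tend to be the better-performing models.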
Related papers
- Training objective drives the consistency of representational similarity across datasets [19.99817888941361]
The Platonic Representation Hypothesis claims that recent foundation models are converging to a shared representation space as a function of their downstream task performance.
Here, we propose a systematic way to measure how representational similarity between models varies with the set of stimuli used to construct the representations.
We find that the objective function is the most crucial factor in determining the consistency of representational similarities across datasets.
arXiv Detail & Related papers (2024-11-08T13:35:45Z) - Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models [65.82564074712836]
We introduce DiffusionHOI, a new HOI detector shedding light on text-to-image diffusion models.
We first devise an inversion-based strategy to learn the expression of relation patterns between humans and objects in embedding space.
These learned relation embeddings then serve as textual prompts to steer diffusion models to generate images that depict specific interactions.
arXiv Detail & Related papers (2024-10-26T12:00:33Z) - Classes Are Not Equal: An Empirical Study on Image Recognition Fairness [100.36114135663836]
We experimentally demonstrate that classes are not equal and the fairness issue is prevalent for image classification models across various datasets.
Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize.
Data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification.
arXiv Detail & Related papers (2024-02-28T07:54:50Z) - Challenges to Evaluating the Generalization of Coreference Resolution Models: A Measurement Modeling Perspective [69.50044040291847]
We show how multi-dataset evaluations risk conflating different factors concerning what, precisely, is being measured.
This makes it difficult to draw more generalizable conclusions from these evaluations.
arXiv Detail & Related papers (2023-03-16T05:32:02Z) - IMACS: Image Model Attribution Comparison Summaries [16.80986701058596]
We introduce IMACS, a method that combines gradient-based model attributions with aggregation and visualization techniques.
IMACS extracts salient input features from an evaluation dataset, clusters them based on similarity, then visualizes differences in model attributions for similar input features.
We show how our technique can uncover behavioral differences caused by domain shift between two models trained on satellite images.
arXiv Detail & Related papers (2022-01-26T21:35:14Z) - Partial Order in Chaos: Consensus on Feature Attributions in the
Rashomon Set [50.67431815647126]
Post-hoc global/local feature attribution methods are being progressively employed to understand machine learning models.
We show that partial orders of local/global feature importance arise from this methodology.
We show that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
arXiv Detail & Related papers (2021-10-26T02:53:14Z) - Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z) - Towards Visually Explaining Similarity Models [29.704524987493766]
We present a method to generate gradient-based visual attention for image similarity predictors.
By relying solely on the learned feature embedding, we show that our approach can be applied to any kind of CNN-based similarity architecture.
We show that our resulting attention maps serve more than just interpretability; they can be infused into the model learning process itself with new trainable constraints.
arXiv Detail & Related papers (2020-08-13T17:47:41Z) - Explainable Image Classification with Evidence Counterfactual [0.0]
We introduce SEDC as a model-agnostic instance-level explanation method for image classification.
For a given image, SEDC searches for a small set of segments whose removal alters the classification (a minimal sketch of this idea appears after this list).
We compare SEDC(-T) with popular feature importance methods such as LRP, LIME and SHAP, and we describe how the mentioned importance ranking issues are addressed.
arXiv Detail & Related papers (2020-04-16T08:02:48Z)
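As a rough illustration of the SEDC-style search mentioned in the last entry, the sketch below greedily removes image segments (by zeroing their pixels) until the predicted class changes. The segmentation, the zeroing-out form of "removal", and the greedy selection rule are assumptions made for illustration, not the published algorithm.
```python
import numpy as np

def greedy_counterfactual_segments(image, segments, class_scores, target, max_steps=10):
    """Greedy search for a small set of segments whose removal flips the prediction.

    image:        (H, W, C) float array.
    segments:     (H, W) integer array of segment labels (e.g. superpixels).
    class_scores: callable mapping an image to a 1-D array of class scores.
    target:       index of the originally predicted class.

    Zeroing pixels as "removal" and picking the segment with the largest drop
    in the target score are illustrative assumptions, not the published method.
    """
    removed, work = [], image.copy()
    for _ in range(max_steps):
        base = class_scores(work)[target]
        best_seg, best_drop = None, -np.inf
        for s in np.unique(segments):
            if s in removed:
                continue
            trial = work.copy()
            trial[segments == s] = 0.0
            drop = base - class_scores(trial)[target]
            if drop > best_drop:
                best_seg, best_drop = s, drop
        if best_seg is None:
            break                          # every segment already removed
        removed.append(best_seg)
        work[segments == best_seg] = 0.0
        if np.argmax(class_scores(work)) != target:
            return removed                 # these removals flip the prediction
    return removed                         # no flip found within max_steps
```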