COSE: A Consistency-Sensitivity Metric for Saliency on Image Classification
- URL: http://arxiv.org/abs/2309.10989v1
- Date: Wed, 20 Sep 2023 01:06:44 GMT
- Title: COSE: A Consistency-Sensitivity Metric for Saliency on Image Classification
- Authors: Rangel Daroya, Aaron Sun, Subhransu Maji
- Abstract summary: We present a set of metrics that utilize vision priors to assess the performance of saliency methods on image classification tasks.
We show that although saliency methods are thought to be architecture-independent, most methods explain transformer-based models better than convolutional models.
- Score: 21.3855970055692
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a set of metrics that utilize vision priors to effectively assess
the performance of saliency methods on image classification tasks. To
understand behavior in deep learning models, many methods provide visual
saliency maps emphasizing image regions that most contribute to a model
prediction. However, there is limited work on analyzing the reliability of
saliency methods in explaining model decisions. We propose the metric
COnsistency-SEnsitivity (COSE) that quantifies the equivariant and invariant
properties of visual model explanations using simple data augmentations.
Through our metrics, we show that although saliency methods are thought to be
architecture-independent, most methods explain transformer-based models better
than convolutional models. In addition, GradCAM was found to
outperform other methods in terms of COSE but was shown to have limitations
such as a lack of variability on fine-grained datasets. The duality between
consistency and sensitivity allows saliency methods to be analyzed from
different angles. Ultimately, we find that it is important to balance these two
metrics for a saliency map to faithfully show model behavior.
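To make the two properties concrete, below is a minimal sketch (not the authors' implementation) of how consistency and sensitivity could be scored for a single image: consistency compares the saliency of an augmented image against the same augmentation applied to the original saliency map, and sensitivity measures how much the saliency changes under a perturbation that alters model behavior. The saliency function, the horizontal-flip augmentation, and the cosine-similarity measure are illustrative assumptions; COSE's exact similarity measure and the way the two scores are combined are defined in the paper.

```python
import numpy as np

def flip_horizontal(x):
    # Horizontal flip for an (H, W) or (H, W, C) array; a simple equivariant augmentation.
    return np.flip(x, axis=1)

def cosine_similarity(a, b, eps=1e-8):
    # Similarity between two flattened saliency maps (illustrative choice of measure).
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def consistency(image, saliency_fn, transform=flip_horizontal):
    # Equivariance check: saliency of the augmented image should match the
    # augmented saliency of the original image.
    saliency_of_augmented = saliency_fn(transform(image))
    augmented_saliency = transform(saliency_fn(image))
    return cosine_similarity(saliency_of_augmented, augmented_saliency)

def sensitivity(image, perturbed_image, saliency_fn):
    # If a perturbation changes model behavior, the explanation should change too;
    # report dissimilarity of the two saliency maps.
    return 1.0 - cosine_similarity(saliency_fn(image), saliency_fn(perturbed_image))
```

In practice, both scores would be averaged over many images and augmentations before being balanced against each other, as the abstract argues a faithful saliency method must do.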
Related papers
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models [7.360937524701675]
Transformer-based models have dominated natural language processing and other areas in the last few years due to their superior (zero-shot) performance on benchmark datasets.
In this paper, based on a new gradient descent optimization method, we are able to explore the embedding space of a commonly used vision-language model.
Using the Imagenette dataset, we show that while the model achieves over 99% zero-shot classification performance, it fails systematic evaluations completely.
arXiv Detail & Related papers (2024-02-13T14:07:49Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- A Model for Multi-View Residual Covariances based on Perspective Deformation [88.21738020902411]
We derive a model for the covariance of the visual residuals in multi-view SfM, odometry and SLAM setups.
We validate our model with synthetic and real data and integrate it into photometric and feature-based Bundle Adjustment.
arXiv Detail & Related papers (2022-02-01T21:21:56Z)
- IMACS: Image Model Attribution Comparison Summaries [16.80986701058596]
We introduce IMACS, a method that combines gradient-based model attributions with aggregation and visualization techniques.
IMACS extracts salient input features from an evaluation dataset, clusters them based on similarity, then visualizes differences in model attributions for similar input features.
We show how our technique can uncover behavioral differences caused by domain shift between two models trained on satellite images.
arXiv Detail & Related papers (2022-01-26T21:35:14Z)
- Distributional Depth-Based Estimation of Object Articulation Models [21.046351215949525]
We propose a method that efficiently learns distributions over articulation model parameters directly from depth images.
Our core contributions include a novel representation for distributions over rigid body transformations.
We introduce a novel deep learning based approach, DUST-net, that performs category-independent articulation model estimation.
arXiv Detail & Related papers (2021-08-12T17:44:51Z)
- GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods.
In this work, we demonstrate the benefit of combining the two in a latent variational model.
Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- Towards Visually Explaining Similarity Models [29.704524987493766]
We present a method to generate gradient-based visual attention for image similarity predictors.
By relying solely on the learned feature embedding, we show that our approach can be applied to any kind of CNN-based similarity architecture.
We show that our resulting attention maps serve more than just interpretability; they can be infused into the model learning process itself with new trainable constraints.
arXiv Detail & Related papers (2020-08-13T17:47:41Z)
- Evaluating the Disentanglement of Deep Generative Models through Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z)