Generating detailed saliency maps using model-agnostic methods
- URL: http://arxiv.org/abs/2209.09202v1
- Date: Sun, 4 Sep 2022 21:34:46 GMT
- Title: Generating detailed saliency maps using model-agnostic methods
- Authors: Maciej Sakowicz
- Abstract summary: We focus on a model-agnostic explainability method called RISE, elaborate on observed shortcomings of its grid-based approach, and propose two modifications.
These modifications, collectively called VRISE (Voronoi-RISE), are meant to improve the accuracy of maps generated using large occlusions and to accelerate convergence of saliency maps when sampling density is either very low or very high.
We compare accuracy of saliency maps produced by VRISE and RISE on the validation split of ILSVRC2012, using a saliency-guided content insertion/deletion metric and a localization metric based on bounding boxes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emerging field of Explainable Artificial Intelligence focuses on
researching methods of explaining the decision-making processes of complex
machine learning models. In the field of explainability for Computer Vision,
explanations are provided as saliency maps, which visualize the importance of
individual pixels of the input w.r.t. the model's prediction. In this work we
focus on a perturbation-based, model-agnostic explainability method called
RISE, elaborate on observed shortcomings of its grid-based approach and propose
two modifications: replacement of square occlusions with convex polygonal
occlusions based on cells of a Voronoi mesh and addition of an informativeness
guarantee to the occlusion mask generator. These modifications, collectively
called VRISE (Voronoi-RISE), are meant to, respectively, improve the accuracy
of maps generated using large occlusions and accelerate convergence of saliency
maps in cases where sampling density is either very low or very high. We
perform a quantitative comparison of accuracy of saliency maps produced by
VRISE and RISE on the validation split of ILSVRC2012, using a saliency-guided
content insertion/deletion metric and a localization metric based on bounding
boxes. Additionally, we explore the space of configurable occlusion pattern
parameters to better understand their influence on saliency maps produced by
RISE and VRISE. We also describe and demonstrate two effects observed over the
course of experimentation, arising from the random sampling approach of RISE:
"feature slicing" and "saliency misattribution". Our results show that convex
polygonal occlusions yield more accurate maps for coarse occlusion meshes and
multi-object images, but improvement is not guaranteed in other cases. The
informativeness guarantee is shown to increase the convergence rate without
incurring a significant computational overhead.
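The two modifications described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation. The function names (`voronoi_mask`, `informative_mask`, `saliency_map`), the parameters (`n_cells`, `p_keep`, `max_tries`), and the brute-force nearest-seed construction of the Voronoi cells are all assumptions made for illustration. In particular, the informativeness guarantee is read here as "reject masks that are fully visible or fully occluded", which is only one plausible interpretation of the abstract. Mask smoothing and random shifts are omitted.

```python
import numpy as np


def voronoi_mask(height, width, n_cells, p_keep, rng):
    """Binary occlusion mask whose regions are cells of a random Voronoi mesh."""
    # Random seed points (row, col) define the mesh (hypothetical construction).
    seeds = rng.uniform(0.0, 1.0, size=(n_cells, 2)) * np.array([height, width])
    rows, cols = np.mgrid[0:height, 0:width]
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    # A pixel belongs to the cell of its nearest seed (brute-force distances).
    d2 = ((pixels[:, None, :] - seeds[None, :, :]) ** 2).sum(axis=2)
    cell_of_pixel = d2.argmin(axis=1).reshape(height, width)
    # Each cell stays visible with probability p_keep, otherwise it is occluded.
    keep = rng.random(n_cells) < p_keep
    return keep[cell_of_pixel].astype(np.float32)


def informative_mask(height, width, n_cells, p_keep, rng, max_tries=100):
    """Assumed informativeness check: resample masks that occlude all or nothing."""
    for _ in range(max_tries):
        mask = voronoi_mask(height, width, n_cells, p_keep, rng)
        if 0.0 < mask.mean() < 1.0:
            return mask
    raise RuntimeError("could not draw an informative mask")


def saliency_map(image, score_fn, n_masks=300, n_cells=16, p_keep=0.5, seed=0):
    """RISE-style estimate: average of masks weighted by the model score."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    acc = np.zeros((h, w), dtype=np.float64)
    for _ in range(n_masks):
        mask = informative_mask(h, w, n_cells, p_keep, rng)
        masked = image * mask[..., None] if image.ndim == 3 else image * mask
        acc += score_fn(masked) * mask
    # Normalize by the expected mask value; approximate once masks are resampled.
    return acc / (n_masks * p_keep)


# Toy stand-in for a model: the "class score" is the mean brightness of the
# top-left 8x8 patch, so that patch should come out as the most salient area.
toy_image = np.ones((32, 32), dtype=np.float32)
toy_saliency = saliency_map(toy_image, lambda x: float(x[:8, :8].mean()))
```

In this toy setup the score depends only on the top-left patch, so the estimated saliency there should exceed the saliency in the opposite corner. A real use would replace `score_fn` with the class probability returned by a classifier.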
Related papers
- LGU-SLAM: Learnable Gaussian Uncertainty Matching with Deformable Correlation Sampling for Deep Visual SLAM [11.715999663401591]
A learnable 2D Gaussian uncertainty model is designed to associate matching-frame pairs.
A multi-scale deformable correlation strategy is devised to adaptively fine-tune the sampling of each direction.
Experiments on real-world and synthetic datasets are conducted to validate the effectiveness and superiority of our method.
arXiv Detail & Related papers (2024-10-30T17:20:08Z)
- Learning Gaussian Representation for Eye Fixation Prediction [54.88001757991433]
Existing eye fixation prediction methods perform the mapping from input images to the corresponding dense fixation maps generated from raw fixation points.
We introduce Gaussian Representation for eye fixation modeling.
We design our framework upon some lightweight backbones to achieve real-time fixation prediction.
arXiv Detail & Related papers (2024-03-21T20:28:22Z)
- Boosting Few-shot Fine-grained Recognition with Background Suppression and Foreground Alignment [53.401889855278704]
Few-shot fine-grained recognition (FS-FGR) aims to recognize novel fine-grained categories with the help of limited available samples.
We propose a two-stage background suppression and foreground alignment framework, which is composed of a background activation suppression (BAS) module, a foreground object alignment (FOA) module, and a local to local (L2L) similarity metric.
Experiments conducted on multiple popular fine-grained benchmarks demonstrate that our method outperforms the existing state-of-the-art by a large margin.
arXiv Detail & Related papers (2022-10-04T07:54:40Z)
- Fine-grained Classification of Solder Joints with α-skew Jensen-Shannon Divergence [0.0]
We show that solders have low feature diversity, and that the solder joint inspection can be carried out as a fine-grained image classification task.
To improve the fine-grained classification accuracy, penalizing confident model predictions by maximizing entropy was found useful in the literature.
We show that the proposed approach achieves the highest F1-score and competitive accuracy for different models in the fine-grained solder joint classification task.
arXiv Detail & Related papers (2022-09-20T17:06:51Z)
- PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting [63.84828478688975]
We propose a novel perspective-aware approach called PANet to address the perspective problem.
Based on the observation that the size of the objects varies greatly in one image due to the perspective effect, we propose the dynamic receptive fields (DRF) framework.
The framework is able to adjust the receptive field by the dilated convolution parameters according to the input image, which helps the model to extract more discriminative features for each local region.
arXiv Detail & Related papers (2021-10-31T04:43:05Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- Spatial-spectral Hyperspectral Image Classification via Multiple Random Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE).
First, the local binary pattern is adopted to extract more descriptive features on each selected band, which preserves local structures and subtle changes of a region.
Second, adaptive neighbors assignment is introduced in the construction of the anchor graph to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z)
- Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)
- Statistical Outlier Identification in Multi-robot Visual SLAM using Expectation Maximization [18.259478519717426]
This paper introduces a novel, distributed method for detecting inter-map loop closure outliers in simultaneous localization and mapping (SLAM).
The proposed algorithm does not rely on a good initialization and can handle more than two maps at a time.
arXiv Detail & Related papers (2020-02-07T06:34:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.