Related papers: Generating detailed saliency maps using model-agnostic methods

Generating detailed saliency maps using model-agnostic methods

URL: http://arxiv.org/abs/2209.09202v1
Date: Sun, 4 Sep 2022 21:34:46 GMT
Title: Generating detailed saliency maps using model-agnostic methods
Authors: Maciej Sakowicz
Abstract summary: We focus on a model-agnostic explainability method called RISE, elaborate on observed shortcomings of its grid-based approach. modifications, collectively called VRISE (Voronoi-RISE), are meant to, respectively, improve the accuracy of maps generated using large occlusions. We compare accuracy of saliency maps produced by VRISE and RISE on the validation split of ILSVRC2012, using a saliency-guided content insertion/deletion metric and a localization metric based on bounding boxes.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The emerging field of Explainable Artificial Intelligence focuses on researching methods of explaining the decision making processes of complex machine learning models. In the field of explainability for Computer Vision, explanations are provided as saliency maps, which visualize the importance of individual pixels of the input w.r.t. the model's prediction. In this work we focus on a perturbation-based, model-agnostic explainability method called RISE, elaborate on observed shortcomings of its grid-based approach and propose two modifications: replacement of square occlusions with convex polygonal occlusions based on cells of a Voronoi mesh and addition of an informativeness guarantee to the occlusion mask generator. These modifications, collectively called VRISE (Voronoi-RISE), are meant to, respectively, improve the accuracy of maps generated using large occlusions and accelerate convergence of saliency maps in cases where sampling density is either very low or very high. We perform a quantitative comparison of accuracy of saliency maps produced by VRISE and RISE on the validation split of ILSVRC2012, using a saliency-guided content insertion/deletion metric and a localization metric based on bounding boxes. Additionally, we explore the space of configurable occlusion pattern parameters to better understand their influence on saliency maps produced by RISE and VRISE. We also describe and demonstrate two effects observed over the course of experimentation, arising from the random sampling approach of RISE: "feature slicing" and "saliency misattribution". Our results show that convex polygonal occlusions yield more accurate maps for coarse occlusion meshes and multi-object images, but improvement is not guaranteed in other cases. The informativeness guarantee is shown to increase the convergence rate without incurring a significant computational overhead.

Related papers

MindfulLIME: A Stable Solution for Explanations of Machine Learning Models with Enhanced Localization Precision -- A Medical Image Case Study [0.7373617024876725]
We propose MindfulLIME, a novel algorithm that generates visual explanations using a graph-based pruning algorithm and uncertainty sampling. Our experimental evaluation, conducted on a widely recognized chest X-ray dataset, confirms MindfulLIME's stability with a 100% success rate. MindfulLIME improves the localization precision of visual explanations by reducing the distance between the generated explanations and the actual local annotations.
arXiv Detail & Related papers (2025-03-25T14:48:14Z)
Language-Informed Hyperspectral Image Synthesis for Imbalanced-Small Sample Classification via Semi-Supervised Conditional Diffusion Model [1.9746060146273674]
This paper proposes Txt2HSI-LDM(VAE), a novel language-informed hyperspectral image synthesis method. To address the high-dimensionality of hyperspectral data, a universal variational autoencoder (VAE) is designed to map the data into a low-dimensional latent space. VAE decodes HSI from latent space generated by the diffusion model with the language conditions as input.
arXiv Detail & Related papers (2025-02-27T02:35:49Z)
LGU-SLAM: Learnable Gaussian Uncertainty Matching with Deformable Correlation Sampling for Deep Visual SLAM [11.715999663401591]
Learnable 2D Gaussian uncertainty model is designed to associate matching-frame pairs. A multi-scale deformable correlation strategy is devised to adaptively fine-tune the sampling of each direction. Experiments on real-world and synthetic datasets are conducted to validate the effectiveness and superiority of our method.
arXiv Detail & Related papers (2024-10-30T17:20:08Z)
Learning Gaussian Representation for Eye Fixation Prediction [54.88001757991433]
Existing eye fixation prediction methods perform the mapping from input images to the corresponding dense fixation maps generated from raw fixation points. We introduce Gaussian Representation for eye fixation modeling. We design our framework upon some lightweight backbones to achieve real-time fixation prediction.
arXiv Detail & Related papers (2024-03-21T20:28:22Z)
Boosting Few-shot Fine-grained Recognition with Background Suppression and Foreground Alignment [53.401889855278704]
Few-shot fine-grained recognition (FS-FGR) aims to recognize novel fine-grained categories with the help of limited available samples. We propose a two-stage background suppression and foreground alignment framework, which is composed of a background activation suppression (BAS) module, a foreground object alignment (FOA) module, and a local to local (L2L) similarity metric. Experiments conducted on multiple popular fine-grained benchmarks demonstrate that our method outperforms the existing state-of-the-art by a large margin.
arXiv Detail & Related papers (2022-10-04T07:54:40Z)
Fine-grained Classification of Solder Joints with {\alpha}-skew Jensen-Shannon Divergence [0.0]
We show that solders have low feature diversity, and that the solder joint inspection can be carried out as a fine-grained image classification task. To improve the fine-grained classification accuracy, penalizing confident model predictions by maximizing entropy was found useful in the literature. We show that the proposed approach achieves the highest F1-score and competitive accuracy for different models in the finegrained solder joint classification task.
arXiv Detail & Related papers (2022-09-20T17:06:51Z)
PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting [63.84828478688975]
We propose a novel perspective-aware approach called PANet to address the perspective problem. Based on the observation that the size of the objects varies greatly in one image due to the perspective effect, we propose the dynamic receptive fields (DRF) framework. The framework is able to adjust the receptive field by the dilated convolution parameters according to the input image, which helps the model to extract more discriminative features for each local region.
arXiv Detail & Related papers (2021-10-31T04:43:05Z)
CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input. We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
Spatial-spectral Hyperspectral Image Classification via Multiple Random Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE) Firstly, the local binary pattern is adopted to extract the more descriptive features on each selected band, which preserves local structures and subtle changes of a region. Secondly, the adaptive neighbors assignment is introduced in the construction of anchor graph, to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z)
Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods. We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder. We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)
Statistical Outlier Identification in Multi-robot Visual SLAM using Expectation Maximization [18.259478519717426]
This paper introduces a novel and distributed method for detecting inter-map loop closure outliers in simultaneous localization and mapping (SLAM) The proposed algorithm does not rely on a good initialization and can handle more than two maps at a time.
arXiv Detail & Related papers (2020-02-07T06:34:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.