Gradient Frequency Modulation for Visually Explaining Video
Understanding Models
- URL: http://arxiv.org/abs/2111.01215v1
- Date: Mon, 1 Nov 2021 19:07:58 GMT
- Title: Gradient Frequency Modulation for Visually Explaining Video
Understanding Models
- Authors: Xinmiao Lin, Wentao Bao, Matthew Wright, Yu Kong
- Abstract summary: We propose Frequency-based Extremal Perturbation (F-EP) to explain a video understanding model's decisions.
We show in a range of experiments that F-EP provides explanations that more faithfully represent the model's decisions than existing state-of-the-art methods.
- Score: 39.70146574042422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many applications, it is essential to understand why a machine learning
model makes the decisions it does, but this is inhibited by the black-box
nature of state-of-the-art neural networks. Because of this, increasing
attention has been paid to explainability in deep learning, including in the
area of video understanding. Due to the temporal dimension of video data, the
main challenge of explaining a video action recognition model is to produce
spatiotemporally consistent visual explanations, which has been ignored in the
existing literature. In this paper, we propose Frequency-based Extremal
Perturbation (F-EP) to explain a video understanding model's decisions. Because
the explanations given by perturbation methods are noisy and non-smooth both
spatially and temporally, we propose to modulate the frequencies of gradient
maps from the neural network model with a Discrete Cosine Transform (DCT). We
show in a range of experiments that F-EP provides more spatiotemporally
consistent explanations that more faithfully represent the model's decisions
compared to the existing state-of-the-art methods.
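The abstract's key mechanism is modulating the frequency content of gradient maps with a Discrete Cosine Transform so that the resulting explanations are smoother in both space and time. The sketch below illustrates one plausible reading of that idea on a (frames, height, width) gradient map: transform to the DCT domain, keep only a low-frequency block of coefficients, and invert. The function name, the hard binary mask, and the `keep_ratio` parameter are illustrative assumptions and are not taken from the paper's actual F-EP implementation.

```python
# Minimal sketch, assuming a simple low-pass DCT modulation of a
# spatiotemporal gradient map; names and the masking scheme are hypothetical.
import numpy as np
from scipy.fft import dctn, idctn


def modulate_gradient_frequencies(grad_map: np.ndarray, keep_ratio: float = 0.25) -> np.ndarray:
    """Low-pass a (T, H, W) gradient map in the DCT domain.

    Retaining only the lowest-frequency coefficients along each axis yields a
    spatially and temporally smoother map, i.e. the kind of spatiotemporal
    consistency the abstract argues perturbation-based explanations need.
    """
    coeffs = dctn(grad_map, norm="ortho")           # 3D DCT over time, height, width
    mask = np.zeros_like(coeffs)
    t_keep = max(1, int(coeffs.shape[0] * keep_ratio))
    h_keep = max(1, int(coeffs.shape[1] * keep_ratio))
    w_keep = max(1, int(coeffs.shape[2] * keep_ratio))
    mask[:t_keep, :h_keep, :w_keep] = 1.0           # keep only the low-frequency block
    return idctn(coeffs * mask, norm="ortho")       # back to the spatiotemporal domain


if __name__ == "__main__":
    # Example: smooth a noisy stand-in "gradient map" for an 8-frame, 56x56 clip.
    noisy_grad = np.random.rand(8, 56, 56)
    smooth_grad = modulate_gradient_frequencies(noisy_grad, keep_ratio=0.25)
    print(noisy_grad.std(), smooth_grad.std())      # the modulated map varies less
```

In practice such a map would come from backpropagated gradients of the action-recognition model rather than random noise; the point of the sketch is only how a DCT-domain mask suppresses high-frequency, noisy components of the explanation.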
Related papers
- Model-based learning for multi-antenna multi-frequency location-to-channel mapping [6.067275317776295]
The Implicit Neural Representation literature has shown that classical neural architectures are biased towards learning low-frequency content.
This paper leverages the model-based machine learning paradigm to derive a problem-specific neural architecture from a propagation channel model.
arXiv Detail & Related papers (2024-06-17T13:09:25Z)
- CNN-based explanation ensembling for dataset, representation and explanations evaluation [1.1060425537315088]
We explore the potential of ensembling explanations generated by deep classification models using a convolutional model.
Through experimentation and analysis, we aim to investigate the implications of combining explanations to uncover more coherent and reliable patterns in the model's behavior.
arXiv Detail & Related papers (2024-04-16T08:39:29Z)
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its ability to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Diffusion Priors for Dynamic View Synthesis from Monocular Videos [59.42406064983643]
Dynamic novel view synthesis aims to capture the temporal evolution of visual content within videos.
We first finetune a pretrained RGB-D diffusion model on the video frames using a customization technique.
We distill the knowledge from the finetuned model to a 4D representation encompassing both dynamic and static Neural Radiance Fields.
arXiv Detail & Related papers (2024-01-10T23:26:41Z)
- Learning with Explanation Constraints [91.23736536228485]
We provide a learning-theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach in a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z)
- This looks more like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation [17.485732906337507]
We present a case study of the self-explaining network, ProtoPNet, in the presence of a spectrum of artifacts.
We introduce a novel method for generating more precise model-aware explanations.
In order to obtain a clean dataset, we propose to use multi-view clustering strategies for segregating the artifact images.
arXiv Detail & Related papers (2021-08-27T09:55:53Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- On the Post-hoc Explainability of Deep Echo State Networks for Time Series Forecasting, Image and Video Classification [63.716247731036745]
Echo State Networks have attracted considerable attention over time, mainly due to the simplicity and computational efficiency of their learning algorithm.
This work addresses their lack of explainability by conducting an explainability study of Echo State Networks applied to learning tasks with time series, image and video data.
Specifically, the study proposes three different techniques capable of eliciting understandable information about the knowledge grasped by these recurrent models.
arXiv Detail & Related papers (2021-02-17T08:56:33Z)
- Explaining Motion Relevance for Activity Recognition in Video Deep Learning Models [12.807049446839507]
Only a small subset of explainability techniques has been applied to interpret 3D Convolutional Neural Network models in activity recognition tasks.
We propose a selective relevance method for adapting the 2D explanation techniques to provide motion-specific explanations.
Our results show that the selective relevance method not only provides insight into the role played by motion in the model's decision -- in effect, revealing and quantifying the model's spatial bias -- but also simplifies the resulting explanations for human consumption.
arXiv Detail & Related papers (2020-03-31T15:19:04Z)