SMOOT: Saliency Guided Mask Optimized Online Training
- URL: http://arxiv.org/abs/2310.00772v2
- Date: Tue, 10 Oct 2023 21:42:59 GMT
- Title: SMOOT: Saliency Guided Mask Optimized Online Training
- Authors: Ali Karkehabadi, Houman Homayoun, Avesta Sasan
- Abstract summary: Saliency-Guided Training (SGT) methods highlight the features most relevant to a model's output during training.
SGT makes the model's final result more interpretable by partially masking the input.
We propose a novel method to determine the optimal number of masked pixels based on the input, accuracy, and model loss during training.
- Score: 3.024318849346373
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep Neural Networks are powerful tools for understanding complex patterns
and making decisions. However, their black-box nature impedes a complete
understanding of their inner workings. Saliency-Guided Training (SGT) methods
alleviate this problem by highlighting, during training, the features most
relevant to the model's output. These methods use back-propagation with
modified gradients to guide the model toward the most relevant features while
keeping the impact on prediction accuracy negligible. SGT makes the model's
final result more interpretable by partially masking the input: given the
model's output, we can infer how each segment of the input affects it. When
the input is an image, masking is applied to the input pixels. However, the
masking strategy and the number of masked pixels are hyperparameters, and
their setting directly affects the model's training. In this paper, we focus
on this issue and present our contribution. We propose a novel method to
determine the optimal number of masked pixels based on the input, accuracy,
and model loss during training. This strategy prevents information loss,
which leads to better accuracy. Moreover, by integrating the model's
performance into the masking strategy, we show that our model represents the
salient features more meaningfully. Our experimental results demonstrate a
substantial improvement in both model accuracy and the prominence of
saliency, thereby affirming the effectiveness of our proposed solution.
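Below is a minimal sketch of saliency-guided training with an adaptive mask budget, in the spirit of the method above. All specifics are assumptions: gradient-magnitude saliency, zero-fill masking, and a hypothetical rule that masks more pixels as the loss shrinks. The paper's exact strategy formula is not given in this abstract.
```python
# Sketch only: the rule tying the mask budget k to the loss is a
# hypothetical stand-in for the paper's strategy.
import torch
import torch.nn.functional as F


def saliency_guided_step(model, optimizer, images, labels,
                         max_mask_frac=0.5):
    """One training step that masks the k least-salient pixels."""
    images = images.clone().requires_grad_(True)

    # Input saliency: gradient of the loss w.r.t. each pixel.
    loss = F.cross_entropy(model(images), labels)
    grads, = torch.autograd.grad(loss, images)
    saliency = grads.abs().sum(dim=1)              # (B, H, W)

    # Hypothetical adaptive budget: a low loss (confident model) allows
    # a larger mask; a high loss shrinks it to avoid information loss.
    with torch.no_grad():
        confidence = torch.exp(-loss).item()       # in (0, 1]
        k = int(max_mask_frac * confidence * saliency[0].numel())

    masked = images.detach().clone()
    if k > 0:
        flat = saliency.flatten(1)                 # (B, H*W)
        low = flat.topk(k, dim=1, largest=False).indices
        keep = torch.ones_like(flat).scatter_(1, low, 0.0)
        masked = masked * keep.view_as(saliency).unsqueeze(1)

    # Train on the masked input so predictions rely on salient pixels.
    optimizer.zero_grad()
    masked_loss = F.cross_entropy(model(masked), labels)
    masked_loss.backward()
    optimizer.step()
    return masked_loss.item()
```
A real pipeline would likely replace the zero fill with a baseline value and smooth k across steps, but the sketch shows the core loop: compute saliency, pick a data-dependent k, mask, and fit.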
Related papers
- Bootstrap Masked Visual Modeling via Hard Patches Mining [68.74750345823674]
Masked visual modeling has attracted much attention due to its promising potential in learning generalizable representations.
We argue that it is equally important for the model to stand in the shoes of a teacher to produce challenging problems by itself.
To empower the model as a teacher, we propose Hard Patches Mining (HPM), predicting patch-wise losses and subsequently determining where to mask.
arXiv Detail & Related papers (2023-12-21T10:27:52Z)
- Hard Patches Mining for Masked Image Modeling [52.46714618641274]
Masked image modeling (MIM) has attracted much research attention due to its promising potential for learning scalable visual representations.
We propose Hard Patches Mining (HPM), a brand-new framework for MIM pre-training.
arXiv Detail & Related papers (2023-04-12T15:38:23Z)
- Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially trained mask generator with a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretrained models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z)
- Texture-Based Input Feature Selection for Action Recognition [3.9596068699962323]
We propose a novel method to identify task-irrelevant content in the input that increases domain discrepancy.
We show that our proposed model is superior to existing models for action recognition on the HMDB-51 dataset and the Penn Action dataset.
arXiv Detail & Related papers (2023-02-28T23:56:31Z)
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs that we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- Masked Autoencoding for Scalable and Generalizable Decision Making [93.84855114717062]
MaskDP is a simple and scalable self-supervised pretraining method for reinforcement learning and behavioral cloning.
We find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single- and multiple-goal reaching.
arXiv Detail & Related papers (2022-11-23T07:04:41Z)
- How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking [70.92463223410225]
DiffMask learns to mask out subsets of the input while maintaining differentiability.
The decision to include or disregard an input token is made by a simple model on top of the intermediate hidden layers.
This lets us not only plot attribution heatmaps but also analyze how decisions are formed across network layers.
arXiv Detail & Related papers (2020-04-30T17:36:14Z)
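Since the DiffMask entry closes this list, here is a minimal sketch of that general idea, under strong assumptions: a hypothetical `model` whose `encoder` attribute exposes per-token hidden states, a Gumbel-Sigmoid relaxation in place of DiffMask's hard-concrete gates, and a plain sparsity penalty in place of its constrained (Lagrangian) objective.
```python
# Sketch only: probe architecture, relaxation, and objective are
# illustrative assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskProbe(nn.Module):
    """A small probe that gates input tokens from hidden states."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, hidden, temperature=0.5):
        logits = self.scorer(hidden).squeeze(-1)          # (B, T)
        u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
        # Gumbel-Sigmoid: differentiable relaxation of Bernoulli gates.
        return torch.sigmoid(
            (logits + torch.log(u) - torch.log(1 - u)) / temperature)


def probe_loss(model, probe, embeddings, sparsity_weight=0.1):
    """Drop as many tokens as possible without changing the prediction."""
    with torch.no_grad():
        ref = F.softmax(model(embeddings), dim=-1)        # original output
        hidden = model.encoder(embeddings)                # (B, T, D), assumed
    gates = probe(hidden)                                 # (B, T) in (0, 1)
    out = model(embeddings * gates.unsqueeze(-1))         # masked forward
    faithful = F.kl_div(F.log_softmax(out, dim=-1), ref,
                        reduction="batchmean")
    return faithful + sparsity_weight * gates.mean()
```
Only the probe is trained while the model stays frozen, so the learned gates can be read as per-token attributions tied to the layer whose hidden states feed the probe.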