Gradient Hedging for Intensively Exploring Salient Interpretation beyond Neuron Activation
- URL: http://arxiv.org/abs/2205.11109v1
- Date: Mon, 23 May 2022 07:57:42 GMT
- Title: Gradient Hedging for Intensively Exploring Salient Interpretation beyond Neuron Activation
- Authors: Woo-Jeoung Nam, Seong-Whan Lee
- Abstract summary: We introduce a method for decomposing output predictions into intensive salient attributions by hedging the evidence for a decision.
We analyze the conventional approach applied to the evidence for a decision and discuss the paradox of the conservation rule.
Our method outperforms existing attribution methods, producing distinctive, intensive, and intuitive visualizations with robustness and applicability to general models.
- Score: 25.86943155064205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hedging is a strategy for reducing the potential risks in various types of investments by adopting an opposite position in a related asset. Motivated by this financial technique, we introduce a method for decomposing output predictions into intensive salient attributions by hedging the evidence for a decision. We analyze the conventional approach applied to the evidence for a decision and discuss the paradox of the conservation rule. Subsequently, we define the viewpoint of evidence as the gap between positive and negative influence among the gradient-derived initial contribution maps, and propagate the antagonistic elements to the evidence as suppressors, following a criterion on the degree of positive attribution defined by user preference. In addition, we reflect the severance or sparseness of the contributions of inactivated neurons, which are mostly irrelevant to a decision, resulting in increased robustness of the interpretation. We conduct the following assessments in a verified experimental environment: pointing game, most-relevant-first region insertion, outside-inside relevance ratio, and mean average precision on the PASCAL VOC 2007, MS COCO 2014, and ImageNet datasets. The results demonstrate that our method outperforms existing attribution methods, yielding distinctive, intensive, and intuitive visualizations with robustness and applicability to general models.
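As a rough illustration of the idea in the abstract, the hypothetical PyTorch sketch below builds a gradient-derived contribution map, splits it into positive (supporting) and negative (antagonistic) evidence, and hedges the evidence by subtracting a user-weighted share of the antagonistic part while masking out inactivated positions. The function name, the hedge_ratio knob, the gradient-times-input choice of initial map, and the activation threshold are all assumptions for illustration; this is not the authors' layer-wise propagation rule.

```python
import torch

def hedged_attribution(model, x, target_class, hedge_ratio=0.5):
    # hedge_ratio is a hypothetical user-preference knob controlling how strongly
    # antagonistic (negative) evidence suppresses supporting (positive) evidence.
    x = x.clone().requires_grad_(True)
    logits = model(x)                      # expected shape: (N, num_classes)
    score = logits[:, target_class].sum()
    model.zero_grad()
    score.backward()

    # Gradient x input as one common choice of initial contribution map.
    contrib = x.grad * x

    pos = contrib.clamp(min=0)             # evidence supporting the decision
    neg = (-contrib).clamp(min=0)          # antagonistic evidence

    # "Hedge" the evidence: subtract a user-weighted share of the negative part.
    attribution = pos - hedge_ratio * neg

    # Treat (near-)zero activations as carrying no evidence for the decision.
    attribution = attribution * (x.abs() > 1e-6).float()

    return attribution.sum(dim=1).detach() # collapse channels to a saliency map
```

In the paper itself the suppressors are propagated layer by layer through the network rather than applied as a single input-space subtraction; the sketch only mirrors the positive/negative split and the hedging intuition.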
Related papers
- Toward Understanding the Disagreement Problem in Neural Network Feature Attribution [0.8057006406834466]
Neural networks have demonstrated a remarkable ability to discern intricate patterns and relationships from raw data.
Understanding the inner workings of these black box models remains challenging, yet crucial for high-stake decisions.
Our work addresses this confusion by investigating the explanations' fundamental and distributional behavior.
arXiv Detail & Related papers (2024-04-17T12:45:59Z)
- Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement [61.048842737581865]
Adversarial fine-tuning methods aim to enhance adversarial robustness through fine-tuning the naturally pre-trained model in an adversarial training manner.
We propose a disentanglement-based approach to explicitly model and remove the latent features that cause the feature gap.
Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines.
arXiv Detail & Related papers (2024-01-26T08:38:57Z)
- Disentangled Representation for Causal Mediation Analysis [25.114619307838602]
Causal mediation analysis is a method that is often used to reveal direct and indirect effects.
Deep learning shows promise in mediation analysis, but the current methods only assume latent confounders that affect treatment, mediator and outcome simultaneously.
We propose the Disentangled Mediation Analysis Variational AutoEncoder (DMAVAE), which disentangles the representations of latent confounders into three types to accurately estimate the natural direct effect, natural indirect effect and total effect.
arXiv Detail & Related papers (2023-02-19T23:37:17Z)
- Distributionally Robust Causal Inference with Observational Data [4.8986598953553555]
We consider the estimation of average treatment effects in observational studies without the standard assumption of unconfoundedness.
We propose a new framework of robust causal inference under the general observational study setting with the possible existence of unobserved confounders.
arXiv Detail & Related papers (2022-10-15T16:02:33Z)
- Domain Adaptation with Adversarial Training on Penultimate Activations [82.9977759320565]
Enhancing model prediction confidence on unlabeled target data is an important objective in Unsupervised Domain Adaptation (UDA).
We show that this strategy is more efficient and better correlated with the objective of boosting prediction confidence than adversarial training on input images or intermediate features.
arXiv Detail & Related papers (2022-08-26T19:50:46Z)
- CausPref: Causal Preference Learning for Out-of-Distribution Recommendation [36.22965012642248]
Current recommender systems are still vulnerable to the distribution shift of users and items in realistic scenarios.
We propose to incorporate the recommendation-specific DAG learner into a novel causal preference-based recommendation framework named CausPref.
Our approach surpasses the benchmark models significantly under various types of out-of-distribution settings.
arXiv Detail & Related papers (2022-02-08T16:42:03Z)
- Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations [37.11665902583138]
We propose a new attribution method, Relative Sectional Propagation (RSP), for decomposing the output predictions of Deep Neural Networks (DNNs).
We define the hostile factor as an element that interferes with finding the attributions of the target, and propagate it in a distinguishable way to overcome the non-suppressed nature of activated neurons.
Our method makes it possible to decompose the predictions of DNNs with clearer class-discriminativeness and detailed elucidations of activation neurons compared to the conventional attribution methods.
arXiv Detail & Related papers (2020-12-07T03:11:07Z)
- Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
- Loss Bounds for Approximate Influence-Based Abstraction [81.13024471616417]
Influence-based abstraction aims to gain leverage by modeling local subproblems together with the 'influence' that the rest of the system exerts on them.
This paper investigates the performance of such approaches from a theoretical perspective.
We show that neural networks trained with cross entropy are well suited to learn approximate influence representations.
arXiv Detail & Related papers (2020-11-03T15:33:10Z)
- Face Anti-Spoofing Via Disentangled Representation Learning [90.90512800361742]
Face anti-spoofing is crucial to the security of face recognition systems.
We propose a novel perspective of face anti-spoofing that disentangles the liveness features and content features from images.
arXiv Detail & Related papers (2020-08-19T03:54:23Z)
- Learning Overlapping Representations for the Estimation of Individualized Treatment Effects [97.42686600929211]
Estimating the likely outcome of alternatives from observational data is a challenging problem.
We show that algorithms that learn domain-invariant representations of inputs are often inappropriate.
We develop a deep kernel regression algorithm and posterior regularization framework that substantially outperforms the state-of-the-art on a variety of benchmark datasets.
arXiv Detail & Related papers (2020-01-14T12:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.