A Comprehensive and Reliable Feature Attribution Method: Double-sided
Remove and Reconstruct (DoRaR)
- URL: http://arxiv.org/abs/2310.17945v1
- Date: Fri, 27 Oct 2023 07:40:45 GMT
- Authors: Dong Qin, George Amariucai, Daji Qiao, Yong Guan, Shen Fu
- Abstract summary: We introduce the Double-sided Remove and Reconstruct (DoRaR) feature attribution method, built on several improvements.
We demonstrate that the DoRaR feature attribution method can effectively bypass the above issues and can aid in training a feature selector that outperforms other state-of-the-art feature attribution methods.
- Score: 3.43406114216767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The limited transparency of the inner decision-making mechanism in deep
neural networks (DNN) and other machine learning (ML) models has hindered their
application in several domains. In order to tackle this issue, feature
attribution methods have been developed to identify the crucial features that
heavily influence decisions made by these black box models. However, many
feature attribution methods have inherent downsides. For example, one category
of feature attribution methods suffers from the artifacts problem: it feeds
out-of-distribution masked inputs directly into a classifier that was
originally trained on natural data points. Another category of feature
attribution methods finds explanations by using jointly trained feature
selectors and predictors. While avoiding the artifacts problem, this new
category suffers from the Encoding Prediction in the Explanation (EPITE)
problem, in which the predictor's decisions rely not on the features, but on
the masks that select those features. As a result, these downsides undermine
the credibility of attribution results. In this research, we introduce the
Double-sided Remove and Reconstruct (DoRaR) feature attribution method, built
on several improvements that address these issues (both problems and the
double-sided remedy are sketched after this abstract). By
conducting thorough testing on MNIST, CIFAR10 and our own synthetic dataset, we
demonstrate that the DoRaR feature attribution method can effectively bypass
the above issues and can aid in training a feature selector that outperforms
other state-of-the-art feature attribution methods. Our code is available at
https://github.com/dxq21/DoRaR.
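To make the artifacts problem concrete, here is a minimal sketch of a perturbation-style evaluation that scores masked inputs with a classifier trained only on natural data. This is not the paper's implementation; `classifier`, `masked_score`, and the MNIST-like shapes are illustrative assumptions.

```python
# Minimal sketch of the artifacts problem, assuming an MNIST-like setup.
# `classifier` stands in for a model trained only on natural images; the
# names here are illustrative, not taken from the DoRaR codebase.
import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

def masked_score(x, saliency, k, target):
    """Keep the k most salient pixels, zero the rest, and re-score.

    The mostly-zero image is out-of-distribution for `classifier`, so the
    change in score mixes true feature importance with masking artifacts.
    """
    keep = torch.zeros(x.numel())
    keep[saliency.flatten().topk(k).indices] = 1.0
    x_masked = x * keep.view_as(x)           # OOD: most pixels forced to 0
    with torch.no_grad():
        return classifier(x_masked.unsqueeze(0))[0, target].item()

x = torch.rand(1, 28, 28)          # stand-in "natural" image
saliency = torch.rand(1, 28, 28)   # stand-in attribution map
print(masked_score(x, saliency, k=50, target=3))
```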
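The EPITE problem can likewise be seen in a few lines: if a jointly trained selector places its mask at class-dependent positions, the mask's pattern alone leaks the label, and the predictor can succeed without reading any feature values. The `leaky_selector` below is a deliberately pathological, hypothetical example.

```python
# Sketch of the EPITE problem: the mask's position encodes the class, so a
# predictor can decode the label from the selection pattern alone.
import torch

def leaky_selector(label, size=784, k=10):
    """Select a block of k positions whose offset encodes the label."""
    mask = torch.zeros(size)
    mask[label * k:(label + 1) * k] = 1.0
    return mask

x = torch.ones(784)                     # feature values carry no class info
for label in range(3):
    masked = x * leaky_selector(label)  # "explanation" fed to a predictor
    decoded = int(masked.nonzero()[0]) // 10   # recover label from position
    print(label, decoded)   # always equal: the mask, not the features,
                            # determines the prediction
```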
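Finally, the double-sided remove-and-reconstruct idea suggested by the name and abstract can be sketched as follows: reconstruct a natural-looking input from the selected features and check that the prediction is preserved, then reconstruct from the complementary features and check that it is not. The `generator` network and the exact loss below are assumptions for illustration only; the actual architecture and training procedure are in the authors' repository.

```python
# Hedged sketch of a double-sided remove-and-reconstruct objective, based
# only on the idea the abstract suggests; see https://github.com/dxq21/DoRaR
# for the actual method. `generator` is an assumed reconstruction network
# that maps masked inputs back toward the natural-image manifold, which is
# what sidesteps the artifacts problem above.
import torch
import torch.nn as nn

generator = nn.Sequential(
    nn.Flatten(), nn.Linear(28 * 28, 28 * 28), nn.Unflatten(1, (1, 28, 28))
)
classifier = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

def double_sided_loss(x, mask, target):
    """Selected features should preserve the prediction after reconstruction;
    the complementary features alone should not (hence 'double-sided')."""
    ce = nn.CrossEntropyLoss()
    recon_sel = generator((x * mask).unsqueeze(0))        # selected side
    recon_com = generator((x * (1 - mask)).unsqueeze(0))  # removed side
    keep_term = ce(classifier(recon_sel), target)   # low if info preserved
    drop_term = ce(classifier(recon_com), target)   # high if info removed
    return keep_term - drop_term

x = torch.rand(1, 28, 28)
mask = (torch.rand(1, 28, 28) > 0.8).float()     # stand-in feature selector
target = torch.tensor([3])
print(double_sided_loss(x, mask, target).item())
```

Minimizing an objective of this shape would push the selector toward features that are both sufficient and necessary for the prediction, which is the intuition the method's name suggests; the paper's actual losses, selector architecture, and training schedule may differ.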
Related papers
- IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models [56.10157988449818]
This study focuses on a specific problem of domain generalization, where a model is trained on one source domain and tested on multiple target domains that are unseen during training.
We propose IMO: Invariant features Masks for Out-of-Distribution text classification, to achieve OOD generalization by learning invariant features.
arXiv Detail & Related papers (2024-04-21T02:15:59Z) - Exploring Diffusion Time-steps for Unsupervised Representation Learning [72.43246871893936]
We build a theoretical framework that connects the diffusion time-steps and the hidden attributes.
On CelebA, FFHQ, and Bedroom datasets, the learned feature significantly improves classification.
arXiv Detail & Related papers (2024-01-21T08:35:25Z) - Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z) - Shielded Representations: Protecting Sensitive Attributes Through
Iterative Gradient-Based Projection [39.16319169760823]
Iterative Gradient-Based Projection (IGBP) is a novel method for removing non-linearly encoded concepts from neural representations.
Our results demonstrate that IGBP is effective in mitigating bias through intrinsic and extrinsic evaluations.
arXiv Detail & Related papers (2023-05-17T13:26:57Z) - PARFormer: Transformer-based Multi-Task Network for Pedestrian Attribute
Recognition [23.814762073093153]
We propose a pure transformer-based multi-task PAR network named PARFormer, which includes four modules.
In the feature extraction module, we build a strong baseline for feature extraction, which achieves competitive results on several PAR benchmarks.
In the viewpoint perception module, we explore the impact of viewpoints on pedestrian attributes, and propose a multi-view contrastive loss.
In the attribute recognition module, we alleviate the negative-positive imbalance problem to generate the attribute predictions.
arXiv Detail & Related papers (2023-04-14T16:27:56Z) - Style Interleaved Learning for Generalizable Person Re-identification [69.03539634477637]
We propose a novel style interleaved learning (IL) framework for DG ReID training.
Unlike conventional learning strategies, IL incorporates two forward propagations and one backward propagation for each iteration.
We show that our model consistently outperforms state-of-the-art methods on large-scale benchmarks for DG ReID.
arXiv Detail & Related papers (2022-07-07T07:41:32Z) - Time to Focus: A Comprehensive Benchmark Using Time Series Attribution
Methods [4.9449660544238085]
The paper focuses on time series analysis and benchmarks several state-of-the-art attribution methods.
The presented experiments involve gradient-based and perturbation-based attribution methods.
The findings emphasize that the choice of the best-suited attribution method depends strongly on the desired use case.
arXiv Detail & Related papers (2022-02-08T10:06:13Z) - Can contrastive learning avoid shortcut solutions? [88.249082564465]
Implicit feature modification (IFM) is a method for altering positive and negative samples in order to guide contrastive models towards capturing a wider variety of predictive features.
IFM reduces feature suppression, and as a result improves performance on vision and medical imaging tasks.
arXiv Detail & Related papers (2021-06-21T16:22:43Z) - Do Feature Attribution Methods Correctly Attribute Features? [5.58592454173439]
Feature attribution methods are exceedingly popular in interpretable machine learning.
There is no consensus on the definition of "attribution".
We evaluate three methods: saliency maps, rationales, and attention.
arXiv Detail & Related papers (2021-04-27T20:35:30Z) - Embedded methods for feature selection in neural networks [0.0]
In black box models like neural networks, irrelevant input features negatively affect interpretability, generalizability, and training time.
I propose two integrated approaches to feature selection that can be incorporated directly into parameter learning.
I benchmark both methods against Permutation Feature Importance (PFI), a general-purpose feature ranking method, and a random baseline.
arXiv Detail & Related papers (2020-10-12T16:33:46Z) - Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)