Feature Importance Estimation with Self-Attention Networks
- URL: http://arxiv.org/abs/2002.04464v1
- Date: Tue, 11 Feb 2020 15:15:58 GMT
- Title: Feature Importance Estimation with Self-Attention Networks
- Authors: Blaž Škrlj, Sašo Džeroski, Nada Lavrač and Matej Petkovič
- Abstract summary: Black-box neural network models are widely used in industry and science, yet are hard to understand and interpret.
Recently, the attention mechanism was introduced, offering insights into the inner workings of neural language models.
This paper explores the use of the attention mechanism in neural networks for estimating feature importance, as a means of explaining models learned from propositional (tabular) data.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Black-box neural network models are widely used in industry and science, yet
are hard to understand and interpret. Recently, the attention mechanism was
introduced, offering insights into the inner workings of neural language
models. This paper explores the use of the attention mechanism in neural
networks for estimating feature importance, as a means of explaining models
learned from propositional (tabular) data. Feature importance estimates,
assessed by the proposed Self-Attention Network (SAN) architecture, are
compared with the established ReliefF, Mutual Information and Random
Forest-based estimates, which are widely used in practice for model
interpretation. For the first time we conduct scale-free comparisons of feature
importance estimates across algorithms on ten real and synthetic data sets to
study the similarities and differences of the resulting feature importance
estimates, showing that SANs identify similar high-ranked features as the other
methods. We demonstrate that SANs identify feature interactions which in some
cases yield better predictive performance than the baselines, suggesting that
attention extends beyond interactions of just a few key features and detects
larger feature subsets relevant for the considered learning task.
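A minimal sketch of the idea behind attention-based feature importance on tabular data, assuming a PyTorch environment; the layer sizes, the softmax attention applied directly to the raw features, and the averaging of attention weights into per-feature scores are illustrative assumptions, not the exact SAN architecture from the paper.

```python
import torch
import torch.nn as nn

class TabularSAN(nn.Module):
    """Illustrative self-attention classifier for tabular inputs (hypothetical layout)."""
    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        # Attention over the input features: one weight per feature, normalised with softmax.
        self.attention = nn.Sequential(nn.Linear(n_features, n_features), nn.Softmax(dim=1))
        self.classifier = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))

    def forward(self, x):
        attn = self.attention(x)               # (batch, n_features) attention weights
        return self.classifier(x * attn), attn

def attention_importances(model, loader):
    # Average the attention weights over a data set as a per-feature importance estimate.
    model.eval()
    total, count = None, 0
    with torch.no_grad():
        for xb, _ in loader:
            _, attn = model(xb)
            batch_sum = attn.sum(dim=0)
            total = batch_sum if total is None else total + batch_sum
            count += xb.shape[0]
    return (total / count).numpy()
```

The model is trained with an ordinary classification loss; the attention weights are a by-product that can be read off afterwards. Because these scores live on a different scale than ReliefF, Mutual Information or Random Forest importances, a scale-free comparison of the kind described in the abstract would rank the features under each method and compare the rankings (e.g. via rank correlation) rather than the raw values.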
Related papers
- Explainability of Highly Associated Fuzzy Churn Patterns in Binary Classification [21.38368444137596]
This study emphasizes the importance of identifying multivariate patterns and setting soft bounds for intuitive interpretation.
The main objective is to use a machine learning model and fuzzy-set theory with top-k HUIM to identify highly associated patterns of customer churn.
As a result, the study introduces an innovative approach that improves the explainability and effectiveness of customer churn prediction models.
arXiv Detail & Related papers (2024-10-21T09:44:37Z) - Statistical tuning of artificial neural network [0.0]
This study introduces methods to enhance the understanding of neural networks, focusing specifically on models with a single hidden layer.
We propose statistical tests to assess the significance of input neurons and introduce algorithms for dimensionality reduction.
This research advances the field of Explainable Artificial Intelligence by presenting robust statistical frameworks for interpreting neural networks.
arXiv Detail & Related papers (2024-09-24T19:47:03Z) - Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability [2.6708879445664584]
This paper introduces a novel approach for assessing a newly trained model's performance based on another known model.
The proposed method evaluates correlations by determining if, for each neuron in one network, there exists a neuron in the other network that produces similar output.
arXiv Detail & Related papers (2024-08-15T22:57:39Z) - Cognitive Networks and Performance Drive fMRI-Based State Classification Using DNN Models [0.0]
We employ two structurally different and complementary DNN-based models to classify individual cognitive states.
We show that despite the architectural differences, both models consistently produce a robust relationship between prediction accuracy and individual cognitive performance.
arXiv Detail & Related papers (2024-08-14T15:25:51Z) - Understanding Self-attention Mechanism via Dynamical System Perspective [58.024376086269015]
Self-attention mechanism (SAM) is widely used in various fields of artificial intelligence.
We show that the intrinsic stiffness phenomenon (SP) in the high-precision solution of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NNs).
We show that the SAM is also a stiffness-aware step size adaptor that can enhance the model's representational ability to measure intrinsic SP.
arXiv Detail & Related papers (2023-08-19T08:17:41Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z) - Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z) - Variational Structured Attention Networks for Deep Visual Representation
Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and the interaction of the attentions within a probabilistic representation learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z) - Feature Interaction based Neural Network for Click-Through Rate
Prediction [5.095988654970358]
We propose a Feature Interaction based Neural Network (FINN) which is able to model feature interaction via a 3-dimensional relation tensor.
We show that our deep FINN model outperforms other state-of-the-art deep models such as PNN and DeepFM.
The results also indicate that our models can effectively learn feature interactions and achieve better performance on real-world datasets.
arXiv Detail & Related papers (2020-06-07T03:53:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.