Feature Importance Estimation with Self-Attention Networks
- URL: http://arxiv.org/abs/2002.04464v1
- Date: Tue, 11 Feb 2020 15:15:58 GMT
- Title: Feature Importance Estimation with Self-Attention Networks
- Authors: Blaž Škrlj, Sašo Džeroski, Nada Lavrač and Matej Petkovič
- Abstract summary: Black-box neural network models are widely used in industry and science, yet are hard to understand and interpret.
Recently, the attention mechanism was introduced, offering insights into the inner workings of neural language models.
This paper explores the use of the attention mechanism in neural networks for estimating feature importance, as a means of explaining models learned from propositional (tabular) data.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Black-box neural network models are widely used in industry and science, yet
are hard to understand and interpret. Recently, the attention mechanism was
introduced, offering insights into the inner workings of neural language
models. This paper explores the use of the attention mechanism in neural
networks for estimating feature importance, as a means of explaining models
learned from propositional (tabular) data. Feature importance estimates,
assessed by the proposed Self-Attention Network (SAN) architecture, are
compared with the established ReliefF, Mutual Information and Random
Forest-based estimates, which are widely used in practice for model
interpretation. For the first time we conduct scale-free comparisons of feature
importance estimates across algorithms on ten real and synthetic data sets to
study the similarities and differences of the resulting feature importance
estimates, showing that SANs identify similar high-ranked features as the other
methods. We demonstrate that SANs identify feature interactions which in some
cases yield better predictive performance than the baselines, suggesting that
attention extends beyond interactions of just a few key features and detects
larger feature subsets relevant for the considered learning task.
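A minimal sketch of the idea behind attention-based feature importance on tabular data, assuming a PyTorch environment; the layer sizes, the softmax attention applied directly to the raw features, and the averaging of attention weights into per-feature scores are illustrative assumptions, not the exact SAN architecture from the paper.

```python
import torch
import torch.nn as nn

class TabularSAN(nn.Module):
    """Illustrative self-attention classifier for tabular inputs (hypothetical layout)."""
    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        # Attention over the input features: one weight per feature, normalised with softmax.
        self.attention = nn.Sequential(nn.Linear(n_features, n_features), nn.Softmax(dim=1))
        self.classifier = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))

    def forward(self, x):
        attn = self.attention(x)               # (batch, n_features) attention weights
        return self.classifier(x * attn), attn

def attention_importances(model, loader):
    # Average the attention weights over a data set as a per-feature importance estimate.
    model.eval()
    total, count = None, 0
    with torch.no_grad():
        for xb, _ in loader:
            _, attn = model(xb)
            batch_sum = attn.sum(dim=0)
            total = batch_sum if total is None else total + batch_sum
            count += xb.shape[0]
    return (total / count).numpy()
```

The model is trained with an ordinary classification loss; the attention weights are a by-product that can be read off afterwards. Because these scores live on a different scale than ReliefF, Mutual Information or Random Forest importances, a scale-free comparison of the kind described in the abstract would rank the features under each method and compare the rankings (e.g. via rank correlation) rather than the raw values.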
Related papers
- Explainability of Highly Associated Fuzzy Churn Patterns in Binary Classification [21.38368444137596]
This study emphasizes the importance of identifying multivariate patterns and setting soft bounds for intuitive interpretation.
The main objective is to use a machine learning model and fuzzy-set theory with top-k HUIM to identify highly associated patterns of customer churn.
As a result, the study introduces an innovative approach that improves the explainability and effectiveness of customer churn prediction models.
arXiv Detail & Related papers (2024-10-21T09:44:37Z) - Statistical tuning of artificial neural network [0.0]
This study introduces methods to enhance the understanding of neural networks, focusing specifically on models with a single hidden layer.
We propose statistical tests to assess the significance of input neurons and introduce algorithms for dimensionality reduction.
This research advances the field of Explainable Artificial Intelligence by presenting robust statistical frameworks for interpreting neural networks.
arXiv Detail & Related papers (2024-09-24T19:47:03Z) - Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability [2.6708879445664584]
This paper introduces a novel approach for assessing a newly trained model's performance based on another known model.
The proposed method evaluates correlations by determining if, for each neuron in one network, there exists a neuron in the other network that produces similar output.
arXiv Detail & Related papers (2024-08-15T22:57:39Z) - Cognitive Networks and Performance Drive fMRI-Based State Classification Using DNN Models [0.0]
We employ two structurally different and complementary DNN-based models to classify individual cognitive states.
We show that despite the architectural differences, both models consistently produce a robust relationship between prediction accuracy and individual cognitive performance.
arXiv Detail & Related papers (2024-08-14T15:25:51Z) - Understanding Self-attention Mechanism via Dynamical System Perspective [58.024376086269015]
Self-attention mechanism (SAM) is widely used in various fields of artificial intelligence.
We show that the intrinsic stiffness phenomenon (SP) in the high-precision solution of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NNs).
We show that the SAM is also a stiffness-aware step size adaptor that can enhance the model's representational ability to measure intrinsic SP.
arXiv Detail & Related papers (2023-08-19T08:17:41Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z) - Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z) - Variational Structured Attention Networks for Deep Visual Representation
Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and the interaction of the attentions within a probabilistic representation learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z) - Feature Interaction based Neural Network for Click-Through Rate
Prediction [5.095988654970358]
We propose a Feature Interaction based Neural Network (FINN) which is able to model feature interaction via a 3-dimensional relation tensor.
We show that our deep FINN model outperforms other state-of-the-art deep models such as PNN and DeepFM.
The results also indicate that our models can effectively learn feature interactions and achieve better performance on real-world datasets.
arXiv Detail & Related papers (2020-06-07T03:53:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.