Related papers: Parameter Averaging for Robust Explainability

Parameter Averaging for Robust Explainability

URL: http://arxiv.org/abs/2208.03249v1
Date: Fri, 5 Aug 2022 16:00:07 GMT
Title: Parameter Averaging for Robust Explainability
Authors: Talip Ucar, Ehsan Hajiramezanali
Abstract summary: We introduce a novel method based on parameter averaging for robust explainability in tabular data setting, referred as XTab. We conduct extensive experiments on a variety of real and synthetic datasets, demonstrating that the proposed method can be used for feature selection as well as to obtain the global feature importance that are not sensitive to sub-optimal model initialisation.
Score: 5.032906900691182
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Neural Networks are known to be sensitive to initialisation. The explanation methods that rely on neural networks are not robust since they can have variations in their explanations when the model is initialized and trained with different random seeds. The sensitivity to model initialisation is not desirable in many safety critical applications such as disease diagnosis in healthcare, in which the explainability might have a significant impact in helping decision making. In this work, we introduce a novel method based on parameter averaging for robust explainability in tabular data setting, referred as XTab. We first initialize and train multiple instances of a shallow network (referred as local masks) with different random seeds for a downstream task. We then obtain a global mask model by "averaging the parameters" of local masks and show that the global model uses the majority rule to rank features based on their relative importance across all local models. We conduct extensive experiments on a variety of real and synthetic datasets, demonstrating that the proposed method can be used for feature selection as well as to obtain the global feature importance that are not sensitive to sub-optimal model initialisation.

Related papers

NIDS Neural Networks Using Sliding Time Window Data Processing with Trainable Activations and its Generalization Capability [0.0]
This paper presents neural networks for network intrusion detection systems (NIDS) that operate on flow data preprocessed with a time window. It requires only eleven features which do not rely on deep packet inspection and can be found in most NIDS datasets and easily obtained from conventional flow collectors. The reported training accuracy exceeds 99% for the proposed method with as little as twenty neural network input features.
arXiv Detail & Related papers (2024-10-24T11:36:19Z)
Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection. We design a forgery-style mixture formulation that augments the diversity of forgery source domains. We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
Classification of High-dimensional Time Series in Spectral Domain using Explainable Features [8.656881800897661]
We propose a model-based approach for classifying high-dimensional stationary time series. Our approach emphasizes the interpretability of model parameters, making it especially suitable for fields like neuroscience. The novelty of our method lies in the interpretability of the model parameters, addressing critical needs in neuroscience.
arXiv Detail & Related papers (2024-08-15T19:10:12Z)
Posterior Collapse and Latent Variable Non-identifiability [54.842098835445]
We propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
arXiv Detail & Related papers (2023-01-02T06:16:56Z)
Learning Single-Index Models with Shallow Neural Networks [43.6480804626033]
We introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient flow. We show that the corresponding optimization landscape is benign, which in turn leads to generalization guarantees that match the near-optimal sample complexity of dedicated semi-parametric methods.
arXiv Detail & Related papers (2022-10-27T17:52:58Z)
Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters. We find that our approach successfully generates parameters for a wide range of loss prompts. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
Look beyond labels: Incorporating functional summary information in Bayesian neural networks [11.874130244353253]
We present a simple approach to incorporate summary information about the predicted probability. The available summary information is incorporated as augmented data and modeled with a Dirichlet process. We show how the method can inform the model about task difficulty or class imbalance.
arXiv Detail & Related papers (2022-07-04T07:06:45Z)
Robust Generalization of Quadratic Neural Networks via Function Identification [19.87036824512198]
Generalization bounds from learning theory often assume that the test distribution is close to the training distribution. We show that for quadratic neural networks, we can identify the function represented by the model even though we cannot identify its parameters.
arXiv Detail & Related papers (2021-09-22T18:02:00Z)
Open Set Recognition with Conditional Probabilistic Generative Models [51.40872765917125]
We propose Conditional Probabilistic Generative Models (CPGM) for open set recognition. CPGM can detect unknown samples but also classify known classes by forcing different latent features to approximate conditional Gaussian distributions. Experiment results on multiple benchmark datasets reveal that the proposed method significantly outperforms the baselines.
arXiv Detail & Related papers (2020-08-12T06:23:49Z)
Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network. PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. The use of gradient combined nonvolutionity renders learning susceptible to novel problems. We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.