Lazy Estimation of Variable Importance for Large Neural Networks
- URL: http://arxiv.org/abs/2207.09097v1
- Date: Tue, 19 Jul 2022 06:28:17 GMT
- Title: Lazy Estimation of Variable Importance for Large Neural Networks
- Authors: Yue Gao, Abby Stevens, Rebecca Willett, Garvesh Raskutti
- Abstract summary: We propose a fast and flexible method for approximating the reduced model with important inferential guarantees.
We demonstrate our method is fast and accurate under several data-generating regimes, and we demonstrate its real-world applicability on a seasonal climate forecasting example.
- Score: 22.95405462638975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As opaque predictive models increasingly impact many areas of modern life,
interest in quantifying the importance of a given input variable for making a
specific prediction has grown. Recently, there has been a proliferation of
model-agnostic methods to measure variable importance (VI) that analyze the
difference in predictive power between a full model trained on all variables
and a reduced model that excludes the variable(s) of interest. A bottleneck
common to these methods is the estimation of the reduced model for each
variable (or subset of variables), which is an expensive process that often
does not come with theoretical guarantees. In this work, we propose a fast and
flexible method for approximating the reduced model with important inferential
guarantees. We replace the need for fully retraining a wide neural network by a
linearization initialized at the full model parameters. By adding a ridge-like
penalty to make the problem convex, we prove that when the ridge penalty
parameter is sufficiently large, our method estimates the variable importance
measure with an error rate of $O(\frac{1}{\sqrt{n}})$ where $n$ is the number
of training samples. We also show that our estimator is asymptotically normal,
enabling us to provide confidence bounds for the VI estimates. We demonstrate
through simulations that our method is fast and accurate under several
data-generating regimes, and we demonstrate its real-world applicability on a
seasonal climate forecasting example.
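To make the abstract's recipe concrete, here is a minimal, illustrative sketch of the lazy reduced-model idea on a toy regression problem. It is not the authors' implementation; the specific choices (zero-imputation of the dropped variable, the penalty value `lam`, the helper names `lazy_vi` and `flat_param_grad`) are assumptions made for illustration. The sketch linearizes a trained network around its fitted parameters, fits only a ridge-penalized linear correction on inputs with the variable of interest removed, and reports the increase in predictive loss as the VI estimate.

```python
import torch

torch.manual_seed(0)

# Toy data: y depends strongly on x0, weakly on x1, and not at all on x2.
n, d = 500, 3
X = torch.randn(n, d)
y = 2.0 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * torch.randn(n)

# "Full" model: a small MLP trained on all variables.
net = torch.nn.Sequential(
    torch.nn.Linear(d, 128), torch.nn.ReLU(), torch.nn.Linear(128, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    loss = ((net(X).squeeze(-1) - y) ** 2).mean()
    loss.backward()
    opt.step()

def flat_param_grad(model, x_row):
    """Gradient of the scalar prediction at x_row w.r.t. all parameters, flattened."""
    out = model(x_row.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def lazy_vi(model, X, y, j, lam=1.0):
    """Approximate the reduced model (variable j removed) by a ridge-penalized
    linearization at the full-model parameters and return the VI estimate."""
    X_red = X.clone()
    X_red[:, j] = 0.0  # crude zero-imputation of the dropped variable (an assumption)
    with torch.no_grad():
        f_full = model(X).squeeze(-1)       # full-model predictions, all variables
        f_red0 = model(X_red).squeeze(-1)   # full-model predictions on reduced inputs
    # Per-sample gradients of the prediction w.r.t. the parameters (fine for a toy net).
    G = torch.stack([flat_param_grad(model, X_red[i]) for i in range(X.shape[0])])
    r = y - f_red0
    # Closed-form ridge solution for the lazy parameter correction delta.
    p = G.shape[1]
    A = G.T @ G / X.shape[0] + lam * torch.eye(p)
    delta = torch.linalg.solve(A, G.T @ r / X.shape[0])
    f_red = f_red0 + G @ delta              # linearized reduced-model predictions
    mse_full = ((y - f_full) ** 2).mean()
    mse_red = ((y - f_red) ** 2).mean()
    return (mse_red - mse_full).item()      # VI: loss increase from dropping variable j

for j in range(d):
    print(f"VI estimate for x{j}: {lazy_vi(net, X, y, j):.3f}")
```

In practice one would evaluate the loss difference on held-out data, and the asymptotic normality result mentioned in the abstract is what yields confidence bounds around the VI estimate; the sketch skips both for brevity.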
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Quantile Regression using Random Forest Proximities [0.9423257767158634]
Quantile regression forests estimate the entire conditional distribution of the target variable with a single model.
We show that quantile regression using Random Forest proximities outperforms the original QRF in approximating conditional target distributions and prediction intervals (a minimal sketch of the proximity-weighted idea follows this list).
arXiv Detail & Related papers (2024-08-05T10:02:33Z)
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining prediction intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
- Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression [29.57766164934947]
We investigate popular resampling methods for estimating the uncertainty of statistical models.
We provide a tight description of the biases and variances estimated by these methods in the context of generalized linear models.
arXiv Detail & Related papers (2024-02-21T08:50:33Z)
- Forward $\chi^2$ Divergence Based Variational Importance Sampling [2.841087763205822]
We introduce a novel variational importance sampling (VIS) approach that directly estimates and maximizes the log-likelihood.
We apply VIS to various popular latent variable models, including mixture models, variational auto-encoders, and partially observable generalized linear models.
arXiv Detail & Related papers (2023-11-04T21:46:28Z)
- Environment Invariant Linear Least Squares [18.387614531869826]
This paper considers a multi-environment linear regression model in which data from multiple experimental settings are collected.
We construct a novel environment invariant linear least squares (EILLS) objective function, a multi-environment version of linear least-squares regression.
arXiv Detail & Related papers (2023-03-06T13:10:54Z)
- MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test-time robustification, i.e., using the test input to improve model robustness.
Recent prior works have proposed methods for test-time adaptation; however, each introduces additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
- Variational Inference with NoFAS: Normalizing Flow with Adaptive Surrogate for Computationally Expensive Models [7.217783736464403]
Use of sampling-based approaches such as Markov chain Monte Carlo may become intractable when each likelihood evaluation is computationally expensive.
New approaches combining variational inference with normalizing flow are characterized by a computational cost that grows only linearly with the dimensionality of the latent variable space.
We propose Normalizing Flow with Adaptive Surrogate (NoFAS), an optimization strategy that alternately updates the normalizing flow parameters and the weights of a neural network surrogate model.
arXiv Detail & Related papers (2021-08-28T14:31:45Z)
- Understanding the Under-Coverage Bias in Uncertainty Estimation [58.03725169462616]
Quantile regression tends to under-cover relative to the desired coverage level in practice.
We prove that quantile regression suffers from an inherent under-coverage bias.
Our theory reveals that this under-coverage bias stems from a certain high-dimensional parameter estimation error.
arXiv Detail & Related papers (2021-06-10T06:11:55Z)
- Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z)
- Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing distance between embeddings/representations of original samples and augmented counterparts is a popular technique for improving robustness of neural networks.
In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings.
We show that the generic approach we identify, squared $\ell_2$-regularized augmentation, outperforms several recent methods that are each specially designed for a single task (a minimal sketch of this consistency loss also follows this list).
arXiv Detail & Related papers (2020-11-25T22:40:09Z)
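For the "Quantile Regression using Random Forest Proximities" entry above, here is a minimal sketch of proximity-weighted conditional quantiles under assumed details (it is not that paper's implementation): proximities are taken as the fraction of trees in which a test point and a training point share a leaf, and conditional quantiles are read off the proximity-weighted empirical distribution of the training targets.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Toy heteroscedastic data: noise level grows with x.
X = rng.uniform(0, 1, size=(1000, 1))
y = np.sin(4 * X[:, 0]) + (0.1 + 0.4 * X[:, 0]) * rng.normal(size=1000)

rf = RandomForestRegressor(n_estimators=200, min_samples_leaf=10, random_state=0)
rf.fit(X, y)

def proximity_quantiles(rf, X_train, y_train, X_test, qs=(0.05, 0.5, 0.95)):
    """Conditional quantiles from proximity weights: each training target is weighted
    by the fraction of trees in which it shares a leaf with the test point."""
    train_leaves = rf.apply(X_train)   # (n_train, n_trees) leaf indices
    test_leaves = rf.apply(X_test)     # (n_test, n_trees)
    out = np.empty((X_test.shape[0], len(qs)))
    order = np.argsort(y_train)
    y_sorted = y_train[order]
    for i, leaves in enumerate(test_leaves):
        w = (train_leaves == leaves).mean(axis=1)   # proximity to every training point
        w = w / w.sum()
        cdf = np.cumsum(w[order])                   # weighted empirical CDF of targets
        idx = np.minimum(np.searchsorted(cdf, qs), len(y_sorted) - 1)
        out[i] = y_sorted[idx]
    return out

X_test = np.linspace(0, 1, 5).reshape(-1, 1)
print(proximity_quantiles(rf, X, y, X_test))   # lower/median/upper quantiles per test point
```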
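For the "Squared $\ell_2$ Norm as Consistency Loss" entry, a minimal illustrative sketch of the consistency-loss idea (not that paper's code): the training objective adds the squared Euclidean distance between the embeddings of each sample and its augmented counterpart to the usual task loss, with an assumed weight `beta` and a stand-in Gaussian-noise augmentation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Assumed toy setup: a small encoder + classifier on random data.
encoder = torch.nn.Sequential(torch.nn.Linear(20, 64), torch.nn.ReLU())
classifier = torch.nn.Linear(64, 3)
opt = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)

def augment(x):
    return x + 0.1 * torch.randn_like(x)   # stand-in for a task-appropriate augmentation

X = torch.randn(256, 20)
y = torch.randint(0, 3, (256,))
beta = 1.0  # consistency-loss weight (an assumed hyperparameter)

for _ in range(100):
    opt.zero_grad()
    z_orig, z_aug = encoder(X), encoder(augment(X))
    task_loss = F.cross_entropy(classifier(z_orig), y)
    # Squared l2 consistency loss between embeddings of original and augmented samples.
    consistency = (z_orig - z_aug).pow(2).sum(dim=1).mean()
    (task_loss + beta * consistency).backward()
    opt.step()
```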