The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear
Interaction Discovery in Linear Time
- URL: http://arxiv.org/abs/2106.12408v1
- Date: Wed, 23 Jun 2021 13:53:36 GMT
- Authors: Raj Agrawal and Tamara Broderick
- Abstract summary: We show how a kernel trick can reduce computation with suitable Bayesian models to O(# covariates) time for both variable selection and estimation.
Our approach outperforms existing methods used for large, high-dimensional datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many scientific problems require identifying a small set of covariates that
are associated with a target response and estimating their effects. Often,
these effects are nonlinear and include interactions, so linear and additive
methods can lead to poor estimation and variable selection. The Bayesian
framework makes it straightforward to simultaneously express sparsity,
nonlinearity, and interactions in a hierarchical model. But, as for the few
other methods that handle this trifecta, inference is computationally
intractable - with runtime at least quadratic in the number of covariates, and
often worse. In the present work, we solve this computational bottleneck. We
first show that suitable Bayesian models can be represented as Gaussian
processes (GPs). We then demonstrate how a kernel trick can reduce computation
with these GPs to O(# covariates) time for both variable selection and
estimation. Our resulting fit corresponds to a sparse orthogonal decomposition
of the regression function in a Hilbert space (i.e., a functional ANOVA
decomposition), where interaction effects represent all variation that cannot
be explained by lower-order effects. On a variety of synthetic and real
datasets, our approach outperforms existing methods used for large,
high-dimensional datasets while remaining competitive (or being orders of
magnitude faster) in runtime.
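The abstract's central computational idea is that a kernel summing over all pairwise interactions need not be evaluated with an O(p^2) double loop: the sum of products over pairs can be rewritten using the identity sum_{i<j} a_i a_j = ((sum_i a_i)^2 - sum_i a_i^2) / 2. The sketch below illustrates that trick with a simple linear base kernel per covariate; the function name, the choice of base kernel, and the weighting scheme are illustrative assumptions, not the paper's exact SKIM-FA construction.

```python
import numpy as np

def skim_fa_style_kernel(x, xp, kappa, eta1=1.0, eta2=1.0):
    """Evaluate a sparsity-weighted first- plus second-order interaction
    kernel between covariate vectors x and xp in O(p) time.

    Illustrative sketch only: uses a linear per-covariate base kernel
    k_i(x_i, x'_i) = x_i * x'_i and selection weights kappa_i.
    The identity sum_{i<j} a_i a_j = ((sum a_i)^2 - sum a_i^2) / 2
    removes the explicit O(p^2) loop over interaction pairs.
    """
    base = x * xp                        # per-covariate base kernels, shape (p,)
    a = kappa**2 * base                  # a_i = kappa_i^2 * k_i(x_i, x'_i)
    s1 = a.sum()                         # sum_i a_i          -- O(p)
    s2 = (a**2).sum()                    # sum_i a_i^2        -- O(p)
    main = eta1**2 * s1                  # all main (first-order) effects
    pair = eta2**2 * 0.5 * (s1**2 - s2)  # all i<j interactions, no double loop
    return main + pair
```

A brute-force double loop over all pairs gives the same value, which is a quick way to sanity-check the algebra before trusting the O(p) form.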
Related papers
- Lasso with Latents: Efficient Estimation, Covariate Rescaling, and
Computational-Statistical Gaps [29.13944209460543]
We propose a natural sparse linear regression setting where strong correlations arise from unobserved latent variables.
In this setting, we analyze the problem caused by strong correlations and design a surprisingly simple fix.
The sample complexity of the resulting "rescaled Lasso" algorithm incurs, in the worst case, a quadratic dependence on the sparsity of the underlying signal.
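The blurb above describes a two-step pattern: rescale the covariates, then run the Lasso on the rescaled design. The sketch below shows that generic pattern with a plain coordinate-descent Lasso; the function names are made up, and the `scales` argument is a user-supplied stand-in, since the paper's specific rescaling (derived from the latent-variable structure) is not given here.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used in coordinate-descent Lasso."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for min_b (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X**2).sum(axis=0) / n      # per-column curvature terms
    r = y - X @ b                        # running residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]          # remove column j's contribution
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]          # add updated contribution back
    return b

def rescaled_lasso(X, y, lam, scales):
    """'Rescale, fit, map back' pattern. `scales` is a hypothetical
    stand-in for the paper's latent-structure-derived rescaling."""
    Xs = X / scales                      # rescale columns
    b = lasso_cd(Xs, y, lam)
    return b / scales                    # coefficients on the original scale
```

With `scales` equal to all ones this reduces to an ordinary Lasso fit; the point of the pattern is that a well-chosen rescaling can tame the strong correlations the blurb describes.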
arXiv Detail & Related papers (2024-02-23T16:16:38Z)
- Efficient Interpretable Nonlinear Modeling for Multiple Time Series [5.448070998907116]
This paper proposes an efficient nonlinear modeling approach for multiple time series.
It incorporates nonlinear interactions among different time-series variables.
Experimental results show that the proposed algorithm improves the identification of the support of the VAR coefficients in a parsimonious manner.
arXiv Detail & Related papers (2023-09-29T11:42:59Z)
- Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
- Hybrid Trilinear and Bilinear Programming for Aligning Partially Overlapping Point Sets [85.71360365315128]
In many applications, we need algorithms that can align partially overlapping point sets and are invariant to the corresponding transformation.
We first show that the objective is a cubic polynomial function. We then utilize the convex envelopes of trilinear and bilinear monomials to derive its lower bound.
We next develop a branch-and-bound (BnB) algorithm which only branches over the transformation variables and runs efficiently.
arXiv Detail & Related papers (2021-01-19T04:24:23Z)
- A Hypergradient Approach to Robust Regression without Correspondence [85.49775273716503]
We consider a variant of the regression problem in which the correspondence between input and output data is not available.
Most existing methods are only applicable when the sample size is small.
We propose a new computational framework -- ROBOT -- for the shuffled regression problem.
arXiv Detail & Related papers (2020-11-30T21:47:38Z)
- Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum-Product Networks (SPNs).
We show that selective SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z)
- Sparse Gaussian Processes with Spherical Harmonic Features [14.72311048788194]
We introduce a new class of inter-domain variational Gaussian processes (GPs).
Our inference scheme is comparable to variational Fourier features, but it does not suffer from the curse of dimensionality.
Our experiments show that our model is able to fit a regression model for a dataset with 6 million entries two orders of magnitude faster.
arXiv Detail & Related papers (2020-06-30T10:19:32Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Improved guarantees and a multiple-descent curve for Column Subset
Selection and the Nystr\"om method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
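The last blurb above notes that approximation quality can be observed on real data simply by varying the RBF parameter, since that parameter controls the singular-value decay of the kernel matrix. The sketch below makes this concrete with the standard Nystrom approximation K ~= C W^+ C^T; the function names and the experimental setup are illustrative assumptions, not the paper's method.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_error(X, cols, gamma):
    """Relative Frobenius error of the standard Nystrom approximation
    K ~= C W^+ C^T built from the column subset `cols`."""
    K = rbf_kernel(X, X, gamma)
    C = K[:, cols]                           # sampled columns
    W = K[np.ix_(cols, cols)]                # intersection block
    K_hat = C @ np.linalg.pinv(W) @ C.T      # Nystrom reconstruction
    return np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

A small gamma makes the kernel matrix nearly low-rank (fast singular-value decay), so a few columns suffice; a large gamma pushes it toward the identity, where the same column budget does poorly.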
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.