Statistical Optimality and Computational Efficiency of Nyström Kernel PCA
- URL: http://arxiv.org/abs/2105.08875v1
- Date: Wed, 19 May 2021 01:49:35 GMT
- Title: Statistical Optimality and Computational Efficiency of Nyström Kernel PCA
- Authors: Nicholas Sterge, Bharath Sriperumbudur
- Abstract summary: We study the trade-off between computational complexity and statistical accuracy in Nyström approximate kernel principal component analysis (KPCA).
We show that Nyström approximate KPCA achieves better statistical behavior than another popular approximation scheme, the random feature approximation, when applied to KPCA.
- Score: 0.913755431537592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Kernel methods provide an elegant framework for developing nonlinear learning
algorithms from simple linear methods. Though these methods have superior
empirical performance in several real data applications, their usefulness is
inhibited by the significant computational burden incurred in large sample
situations. Various approximation schemes have been proposed in the literature
to alleviate these computational issues, and the resulting approximate kernel
machines have been shown to retain the empirical performance of their exact
counterparts. However, the theoretical properties of these approximate kernel
machines are less well understood. In this work, we theoretically study the
trade-off between computational complexity and statistical accuracy in
Nyström approximate kernel principal component analysis (KPCA), wherein we
show that Nyström approximate KPCA matches the statistical performance of
(non-approximate) KPCA while remaining computationally beneficial.
Additionally, we show that Nyström approximate KPCA achieves better
statistical behavior than another popular approximation scheme, the random
feature approximation, when applied to KPCA.
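As a concrete illustration of the scheme the paper analyzes, the following is a minimal numpy sketch of Nyström approximate KPCA: sample m ≪ n landmark points, form the explicit Nyström feature map K_nm K_mm^{-1/2}, and run linear PCA on the resulting m-dimensional features, replacing the O(n^3) eigendecomposition of exact KPCA with roughly O(nm^2) work. The RBF kernel, uniform landmark sampling, and jitter term are illustrative assumptions, not necessarily the paper's setting.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X and Y.
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def nystrom_kpca(X, m, n_components, gamma=1.0, seed=0):
    """Nystrom approximate KPCA: linear PCA on the explicit m-dimensional
    Nystrom feature map (a sketch, not the paper's code)."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=m, replace=False)    # m landmark points
    K_mm = rbf_kernel(X[idx], X[idx], gamma)      # m x m kernel among landmarks
    K_nm = rbf_kernel(X, X[idx], gamma)           # n x m cross-kernel
    # K_mm^{-1/2} via eigendecomposition, with jitter for numerical stability
    w, V = np.linalg.eigh(K_mm + 1e-10 * np.eye(m))
    K_mm_inv_sqrt = V @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-12))) @ V.T
    Z = K_nm @ K_mm_inv_sqrt                      # explicit Nystrom features
    Z -= Z.mean(axis=0)                           # centering in feature space
    # Linear PCA on Z: O(n m^2) instead of the O(n^3) exact KPCA eigenproblem
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T                # approximate KPCA scores

# Example: project 1,000 points onto 2 approximate kernel principal components
# scores = nystrom_kpca(np.random.randn(1000, 5), m=100, n_components=2)
```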
Related papers
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
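KBASS's full machinery (spike-and-slab priors and EP-EM posterior inference) does not fit in a short snippet, but the kernel-regression function estimate at its core can be sketched as plain kernel ridge regression. The RBF kernel and the regularization scaling below are placeholder assumptions, not KBASS's actual choices.

```python
import numpy as np

def kernel_ridge_fit_predict(X_train, y_train, X_test, gamma=1.0, lam=1e-3):
    """Kernel ridge regression sketch: f(x) = k(x, X_train) @ alpha with
    alpha = (K + lam * n * I)^{-1} y (illustrative regularization scaling)."""
    def rbf(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq)
    n = X_train.shape[0]
    K = rbf(X_train, X_train)                      # n x n training kernel
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y_train)
    return rbf(X_test, X_train) @ alpha            # predictions at X_test
```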
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support.
The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG).
Results show that the proposed approach yields better estimates of pattern latency than the state of the art.
arXiv Detail & Related papers (2022-10-10T12:35:02Z)
- Learnable Faster Kernel-PCA for Nonlinear Fault Detection: Deep Autoencoder-Based Realization [7.057302509355857]
Kernel principal component analysis (KPCA) is a well-recognized nonlinear dimensionality reduction method.
In this work, a learnable, faster realization of conventional KPCA is proposed.
The proposed deep autoencoder-based PCA (DAE-PCA) method is proven to be equivalent to KPCA while offering the added advantage of automatically searching for the nonlinear high-dimensional space best suited to the inputs.
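The summary does not specify the DAE-PCA architecture; as a toy illustration of the autoencoder-as-nonlinear-PCA idea it builds on, here is a minimal numpy autoencoder whose bottleneck codes play the role of nonlinear components. The single tanh layer, plain gradient descent, and hyperparameters are illustrative assumptions only.

```python
import numpy as np

def autoencoder_components(X, h=2, lr=1e-2, epochs=500, seed=0):
    """Toy autoencoder (tanh encoder, linear decoder) trained by gradient
    descent on squared reconstruction error; returns the bottleneck codes."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, h)); b1 = np.zeros(h)
    W2 = rng.normal(scale=0.1, size=(h, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        Z = np.tanh(X @ W1 + b1)            # encoder: nonlinear codes
        X_hat = Z @ W2 + b2                 # decoder: linear reconstruction
        G = 2.0 * (X_hat - X) / n           # gradient of mean squared error
        gW2, gb2 = Z.T @ G, G.sum(axis=0)
        GZ = (G @ W2.T) * (1.0 - Z ** 2)    # backprop through tanh
        gW1, gb1 = X.T @ GZ, GZ.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return np.tanh(X @ W1 + b1)             # nonlinear component scores
```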
arXiv Detail & Related papers (2021-12-08T09:41:46Z)
- Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z)
- Kernel k-Means, By All Means: Algorithms and Strong Consistency [21.013169939337583]
Kernel $k$-means clustering is a powerful tool for unsupervised learning of non-linearly separable data.
In this paper, we generalize results leveraging a general family of means to combat sub-optimal local solutions.
Our algorithm makes use of majorization-minimization (MM) to better solve this non-linear separation problem.
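For reference, below is a minimal numpy sketch of standard (Lloyd-style) kernel k-means on a precomputed kernel matrix; the paper's MM-based generalization over a family of means replaces these hard updates, so this is only the baseline it improves upon.

```python
import numpy as np

def kernel_kmeans(K, k, n_iter=100, seed=0):
    """Lloyd-style kernel k-means. Squared distances to the implicit cluster
    means are computed with the kernel trick:
    ||phi(x_i) - mu_c||^2 = K_ii - 2*mean_{j in c} K_ij + mean_{j,l in c} K_jl
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=n)
    diag = np.diag(K)
    for _ in range(n_iter):
        D = np.full((n, k), np.inf)
        for c in range(k):
            mask = labels == c
            if mask.any():
                D[:, c] = (diag - 2.0 * K[:, mask].mean(axis=1)
                           + K[np.ix_(mask, mask)].mean())
            # empty clusters keep infinite distance and are simply dropped
        new_labels = D.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                            # assignments converged
        labels = new_labels
    return labels
```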
arXiv Detail & Related papers (2020-11-12T16:07:18Z)
- A Perturbation-Based Kernel Approximation Framework [0.0]
We derive a perturbation-based kernel approximation framework based on classical perturbation theory.
We show that our framework gives rise to new kernel approximation schemes that can be tuned to take advantage of the structure of the approximated kernel matrix.
arXiv Detail & Related papers (2020-09-07T09:14:18Z)
- Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, collectively termed Sparse Principal Component Analysis (SPCA).
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
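As a rough illustration of the thresholding idea (not the paper's exact algorithm or its approximation guarantees), one can threshold the leading eigenvector of a covariance matrix to s nonzero coordinates:

```python
import numpy as np

def thresholded_spca(A, s):
    """Toy sparse-PCA heuristic: keep the s largest-magnitude entries of
    the leading eigenvector of the symmetric matrix A, then renormalize."""
    w, V = np.linalg.eigh(A)                 # eigenvalues in ascending order
    v = V[:, -1]                             # leading eigenvector
    keep = np.argsort(np.abs(v))[-s:]        # indices of s largest entries
    x = np.zeros_like(v)
    x[keep] = v[keep]
    return x / np.linalg.norm(x)             # unit-norm sparse direction
```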
arXiv Detail & Related papers (2020-06-23T04:25:36Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy guarantees based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Computing Valid p-value for Optimal Changepoint by Selective Inference using Dynamic Programming [21.361641617994714]
We introduce a novel method to perform statistical inference on the significance of changepoints (CPs).
Based on the selective inference (SI) framework, we propose an exact (non-asymptotic) approach to compute valid p-values for testing the significance of the CPs.
We conduct experiments on both synthetic and real-world datasets, through which we offer evidence that our proposed method is more powerful than existing methods.
arXiv Detail & Related papers (2020-02-21T05:07:22Z)