Kernel Selection for Modal Linear Regression: Optimal Kernel and IRLS
Algorithm
- URL: http://arxiv.org/abs/2001.11168v1
- Date: Thu, 30 Jan 2020 03:57:07 GMT
- Title: Kernel Selection for Modal Linear Regression: Optimal Kernel and IRLS
Algorithm
- Authors: Ryoya Yamasaki, Toshiyuki Tanaka
- Abstract summary: We show that a Biweight kernel is optimal in the sense of minimizing an asymptotic mean squared error of the resulting MLR parameter.
Secondly, we provide a kernel class for which the iteratively reweighted least-squares (IRLS) algorithm is guaranteed to converge.
- Score: 8.571896191090744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modal linear regression (MLR) is a method for obtaining a conditional mode
predictor as a linear model. We study kernel selection for MLR from two
perspectives: "which kernel achieves smaller error?" and "which kernel is
computationally efficient?". First, we show that a Biweight kernel is optimal
in the sense of minimizing an asymptotic mean squared error of a resulting MLR
parameter. This result is derived from our refined analysis of an asymptotic
statistical behavior of MLR. Secondly, we provide a kernel class for which
iteratively reweighted least-squares algorithm (IRLS) is guaranteed to
converge, and especially prove that IRLS with an Epanechnikov kernel terminates
in a finite number of iterations. Simulation studies empirically verified that
using a Biweight kernel provides good estimation accuracy and that using an
Epanechnikov kernel is computationally efficient. Our results improve on existing
MLR studies, which often stick to a Gaussian kernel and a modal EM algorithm
specialized for it, by providing guidelines for kernel selection.
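The IRLS iteration described in the abstract can be sketched as follows. For a kernel $K$ with bandwidth $h$, MLR maximizes $\frac{1}{n}\sum_i K((y_i - x_i^\top\beta)/h)$, and IRLS alternates between computing kernel-derived weights on the residuals and solving a weighted least-squares problem. This is a minimal illustrative sketch under standard definitions of the Biweight and Epanechnikov kernels, not the authors' reference implementation; the function name and hyperparameters are hypothetical.

```python
import numpy as np

def mlr_irls(X, y, h, kernel="biweight", max_iter=100, tol=1e-8):
    """IRLS sketch for modal linear regression (MLR).

    Maximizes (1/n) * sum_i K((y_i - x_i^T beta) / h) over beta, where K is
    a Biweight or Epanechnikov kernel. Weights come from -K'(u)/u.
    """
    n, d = X.shape
    # Initialize with the ordinary least-squares fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(max_iter):
        r = (y - X @ beta) / h
        inside = np.abs(r) <= 1.0
        if kernel == "biweight":
            # K(u) ∝ (1 - u^2)^2 on |u| <= 1, so -K'(u)/u ∝ (1 - u^2).
            w = np.where(inside, 1.0 - r**2, 0.0)
        else:
            # Epanechnikov: K(u) ∝ 1 - u^2, so -K'(u)/u is constant inside.
            w = inside.astype(float)
        if w.sum() == 0:  # bandwidth too small: no points in the window
            break
        Xw = X * w[:, None]
        # Weighted normal equations, with a tiny ridge for stability.
        beta_new = np.linalg.solve(Xw.T @ X + 1e-12 * np.eye(d), Xw.T @ y)
        if np.linalg.norm(beta_new - beta) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta
```

With the Epanechnikov kernel the weights are indicator functions, so each iteration is an ordinary least-squares solve on the points inside the window, which is why finite termination is plausible.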
Related papers
- The Optimality of Kernel Classifiers in Sobolev Space [3.3253452228326332]
This paper investigates the statistical performances of kernel classifiers.
We also propose a simple method to estimate the smoothness of $2\eta(x)-1$ and apply the method to real datasets.
arXiv Detail & Related papers (2024-02-02T05:23:34Z)
- Gaussian Process Regression under Computational and Epistemic Misspecification [5.393695255603843]
In large data applications, computational costs can be reduced using low-rank or sparse approximations of the kernel.
This paper investigates the effect of such kernel approximations on the resulting error.
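The low-rank kernel approximation mentioned above can be illustrated with a Nyström (subset-of-regressors) sketch: replace the full Gram matrix by $K \approx K_{nm} K_{mm}^{-1} K_{mn}$ built from $m \ll n$ inducing points, so the predictive mean needs only an $m \times m$ solve. This is a generic sketch of the low-rank idea, not the paper's construction; all names and hyperparameters are illustrative.

```python
import numpy as np

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def lowrank_gp_mean(Xtr, ytr, Xte, m=20, noise=0.1, ell=1.0):
    """GP posterior mean under a rank-m Nystrom approximation.

    Uses K ~= K_nm K_mm^{-1} K_mn with m inducing points taken as an
    evenly strided subset of the training inputs.
    """
    n = len(Xtr)
    Z = Xtr[:: max(1, n // m)][:m]   # inducing points
    Knm = rbf(Xtr, Z, ell)           # n x m cross-covariance
    Kmm = rbf(Z, Z, ell)             # m x m inducing covariance
    # Subset-of-regressors mean via an m x m solve: cost O(n m^2), not O(n^3).
    A = Knm.T @ Knm + noise**2 * Kmm + 1e-10 * np.eye(len(Z))
    return rbf(Xte, Z, ell) @ np.linalg.solve(A, Knm.T @ ytr)
```

The computational saving is exactly the trade-off the paper studies: the $m \times m$ solve is cheap, but the approximation perturbs the posterior relative to the exact GP.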
arXiv Detail & Related papers (2023-12-14T18:53:32Z) - Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noises.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z) - Efficient Convex Algorithms for Universal Kernel Learning [50.877957471649395]
An ideal set of kernels should admit a linear parameterization (for tractability) and be dense in the set of all kernels (for accuracy).
Previous algorithms for optimization of kernels were limited to classification and relied on computationally complex Semidefinite Programming (SDP) algorithms.
We propose an SVD-QCQPQP algorithm that dramatically reduces the computational complexity compared with previous SDP-based approaches.
arXiv Detail & Related papers (2023-04-15T04:57:37Z) - Learning "best" kernels from data in Gaussian process regression. With
application to aerodynamics [0.4588028371034406]
We introduce algorithms to select/design kernels in Gaussian process regression/kriging surrogate modeling techniques.
A first class of algorithms is kernel flow, which was introduced in the context of classification in machine learning.
A second class of algorithms is called spectral kernel ridge regression, and aims at selecting a "best" kernel such that the norm of the function to be approximated is minimal.
arXiv Detail & Related papers (2022-06-03T07:50:54Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Reducing the Variance of Gaussian Process Hyperparameter Optimization
with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It simultaneously can reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - A Semismooth-Newton's-Method-Based Linearization and Approximation
Approach for Kernel Support Vector Machines [1.177306187948666]
Support Vector Machines (SVMs) are among the most popular and the best performing classification algorithms.
In this paper, we propose a semismooth-Newton's-method-based linearization and approximation approach for kernel SVMs.
The advantage of the proposed approach is that it maintains low computational cost and keeps a fast convergence rate.
arXiv Detail & Related papers (2020-07-21T07:44:21Z) - Optimal Rates of Distributed Regression with Imperfect Kernels [0.0]
We study distributed kernel regression via the divide-and-conquer approach.
We show that kernel ridge regression can achieve rates faster than $N^{-1}$ in the noise-free setting.
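The divide-and-conquer scheme summarized above can be sketched in a few lines: partition the training set into $k$ shards, fit kernel ridge regression independently on each shard, and average the $k$ local predictions. This is a generic sketch of the approach with illustrative hyperparameters, not the paper's exact estimator.

```python
import numpy as np

def rbf(A, B, ell=0.5):
    """Squared-exponential kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def dc_krr_predict(Xtr, ytr, Xte, k=4, lam=1e-4, ell=0.5):
    """Divide-and-conquer kernel ridge regression (sketch).

    Split the training set into k parts, fit KRR on each part
    independently, and average the k local predictions.
    """
    parts = np.array_split(np.random.default_rng(0).permutation(len(Xtr)), k)
    preds = []
    for part in parts:
        Xs, ys = Xtr[part], ytr[part]
        # Local KRR: solve (K + n_s * lam * I) alpha = y on this shard only.
        K = rbf(Xs, Xs, ell) + lam * len(Xs) * np.eye(len(Xs))
        preds.append(rbf(Xte, Xs, ell) @ np.linalg.solve(K, ys))
    return np.mean(preds, axis=0)  # average the local estimates
```

Each shard solves a kernel system of size $n/k$ instead of $n$, which is where the computational saving comes from; the statistical question studied in the paper is what convergence rate the averaged estimator retains.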
arXiv Detail & Related papers (2020-06-30T13:00:16Z)
- SimpleMKKM: Simple Multiple Kernel K-means [49.500663154085586]
We propose a simple yet effective multiple kernel clustering algorithm, termed simple multiple kernel k-means (SimpleMKKM).
Our criterion is given by an intractable minimization-maximization problem in the kernel coefficient and clustering partition matrix.
We theoretically analyze the performance of SimpleMKKM in terms of its clustering generalization error.
arXiv Detail & Related papers (2020-05-11T10:06:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.