Fast Approximate Multi-output Gaussian Processes
- URL: http://arxiv.org/abs/2008.09848v1
- Date: Sat, 22 Aug 2020 14:34:45 GMT
- Title: Fast Approximate Multi-output Gaussian Processes
- Authors: Vladimir Joukov and Dana Kulić
- Abstract summary: Training with the proposed approach requires computing only an $N \times n$ eigenfunction matrix and an $n \times n$ inverse, where $n$ is a selected number of eigenvalues.
The proposed method can regress over multiple outputs, estimate the derivative of the regressor of any order, and learn the correlations between them.
- Score: 6.6174748514131165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian process regression models are an appealing machine learning method
as they learn expressive non-linear models from exemplar data with minimal
parameter tuning and estimate both the mean and covariance of unseen points.
However, exponential computational complexity growth with the number of
training samples has been a long-standing challenge. During training, one has
to compute and invert an $N \times N$ kernel matrix at every iteration.
Regression requires computation of an $m \times N$ kernel where $N$ and $m$ are
the number of training and test points respectively. In this work we show how
approximating the covariance kernel using eigenvalues and functions leads to an
approximate Gaussian process with significant reduction in training and
regression complexity. Training with the proposed approach requires computing
only an $N \times n$ eigenfunction matrix and an $n \times n$ inverse where $n$
is a selected number of eigenvalues. Furthermore, regression now only requires
an $m \times n$ matrix. Finally, in a special case the hyperparameter
optimization is completely independent of the number of training samples. The
proposed method can regress over multiple outputs, estimate the derivative of
the regressor of any order, and learn the correlations between them. The
computational complexity reduction, regression capabilities, and multi-output
correlation learning are demonstrated in simulation examples.
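The eigenfunction expansion described in the abstract lends itself to a compact implementation. Below is a minimal NumPy sketch of a single-output reduced-rank GP in this spirit: it assumes a Laplacian sine basis on an interval and the squared-exponential spectral density, which are illustrative choices rather than the authors' exact construction; all function names and parameters here are hypothetical.

```python
import numpy as np

# Minimal reduced-rank GP sketch: training needs only an N x n basis matrix
# and an n x n solve; prediction needs only an m x n test basis matrix.
# Basis and spectral density below are illustrative assumptions.

def eigenfunctions(x, n, L):
    """N x n matrix of Laplacian sine eigenfunctions on [-L, L]."""
    j = np.arange(1, n + 1)
    return np.sin(np.pi * j * (x[:, None] + L) / (2 * L)) / np.sqrt(L)

def sqrt_eigenvalues(n, L):
    """Square roots of the Laplacian eigenvalues (basis frequencies)."""
    return np.pi * np.arange(1, n + 1) / (2 * L)

def rbf_spectral_density(w, lengthscale, variance):
    """Spectral density of the squared-exponential kernel at frequency w."""
    return variance * np.sqrt(2 * np.pi) * lengthscale * np.exp(-0.5 * (lengthscale * w) ** 2)

def fit_predict(x_train, y_train, x_test, n=32, L=5.0,
                lengthscale=1.0, variance=1.0, noise=0.1):
    """Reduced-rank GP regression: n x n solve to train, m x n matrix to predict."""
    Phi = eigenfunctions(x_train, n, L)                                      # N x n
    S = rbf_spectral_density(sqrt_eigenvalues(n, L), lengthscale, variance)  # n
    A = Phi.T @ Phi + noise ** 2 * np.diag(1.0 / S)                          # n x n
    w_mean = np.linalg.solve(A, Phi.T @ y_train)                             # posterior weight mean
    Phi_s = eigenfunctions(x_test, n, L)                                     # m x n
    mean = Phi_s @ w_mean
    var = noise ** 2 * np.sum(Phi_s * np.linalg.solve(A, Phi_s.T).T, axis=1)
    return mean, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-3.0, 3.0, 200)
    y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
    xs = np.linspace(-3.0, 3.0, 5)
    mu, var = fit_predict(x, y, xs)
    print(np.c_[xs, mu, var])
```

Note that the only linear solve is an $n \times n$ system and prediction touches only the $m \times n$ test basis matrix, matching the complexity claims in the abstract.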
Related papers
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Scale invariant process regression [0.0]
We propose a novel regression method that does not require specification of a kernel, length scale, variance, or prior mean.
Experiments show that it is possible to derive a working machine learning method by assuming nothing but regularity and scale- and translation invariance.
arXiv Detail & Related papers (2022-08-22T17:32:33Z) - Improved Convergence Rates for Sparse Approximation Methods in
Kernel-Based Learning [48.08663378234329]
Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications.
Existing sparse approximation methods can yield a significant reduction in the computational cost.
We provide novel confidence intervals for the Nyström method and the sparse variational Gaussian process approximation method.
arXiv Detail & Related papers (2022-02-08T17:22:09Z) - Efficient and robust high-dimensional sparse logistic regression via
nonlinear primal-dual hybrid gradient algorithms [0.0]
We propose an iterative algorithm that provably computes a solution to a logistic regression problem regularized by an elastic net penalty.
This result improves on the known complexity bound of $O(\min(m^2 n, m n^2)\log(1/\epsilon))$ for first-order optimization methods.
arXiv Detail & Related papers (2021-11-30T14:16:48Z) - Mixability made efficient: Fast online multiclass logistic regression [68.8204255655161]
We show that mixability can be a powerful tool to obtain algorithms with optimal regret.
However, the resulting methods often suffer from high computational complexity, which has reduced their practical applicability.
arXiv Detail & Related papers (2021-10-08T08:22:05Z) - Learning to extrapolate using continued fractions: Predicting the
critical temperature of superconductor materials [5.905364646955811]
In the field of Artificial Intelligence (AI) and Machine Learning (ML), the approximation of unknown target functions $y=f(\mathbf{x})$ is a common objective.
We refer to $S$ as the training set and aim to identify a low-complexity mathematical model that can effectively approximate this target function for new instances $\mathbf{x}$.
arXiv Detail & Related papers (2020-11-27T04:57:40Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - Coordinate Methods for Matrix Games [41.821881312775496]
We develop primal-dual coordinate methods for solving bilinear saddle-point problems of the form $\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} y^\top A x$.
Our methods push existing sublinear methods towards their limits in terms of per-iteration complexity and sample complexity.
arXiv Detail & Related papers (2020-09-17T17:55:03Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Quadruply Stochastic Gaussian Processes [10.152838128195466]
We introduce a variational inference procedure for training scalable Gaussian process (GP) models whose per-iteration complexity is independent of both the number of training points, $n$, and the number of basis functions used in the kernel approximation, $m$.
We demonstrate accurate inference on large classification and regression datasets using GPs and relevance vector machines with up to $m = 10^7$ basis functions.
arXiv Detail & Related papers (2020-06-04T17:06:25Z)