Hierarchical Non-Stationary Temporal Gaussian Processes With
$L^1$-Regularization
- URL: http://arxiv.org/abs/2105.09695v1
- Date: Thu, 20 May 2021 12:15:33 GMT
- Title: Hierarchical Non-Stationary Temporal Gaussian Processes With
$L^1$-Regularization
- Authors: Zheng Zhao, Rui Gao, Simo Särkkä
- Abstract summary: We consider two commonly used NSGP constructions, which are based on explicitly constructed non-stationary covariance functions and stochastic differential equations, respectively.
We extend these NSGPs by including $L^1$-regularization on the processes in order to induce sparseness.
To solve the resulting regularized NSGP (R-NSGP) regression problem, we develop a method based on the alternating direction method of multipliers (ADMM).
- Score: 11.408721072077604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper is concerned with regularized extensions of hierarchical
non-stationary temporal Gaussian processes (NSGPs) in which the parameters
(e.g., length-scale) are modeled as GPs. In particular, we consider two
commonly used NSGP constructions which are based on explicitly constructed
non-stationary covariance functions and stochastic differential equations,
respectively. We extend these NSGPs by including $L^1$-regularization on the
processes in order to induce sparseness. To solve the resulting regularized
NSGP (R-NSGP) regression problem we develop a method based on the alternating
direction method of multipliers (ADMM) and we also analyze its convergence
properties theoretically. Finally, we evaluate the performance of the proposed
methods on simulated and real-world datasets.
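To make this concrete, below is a minimal NumPy sketch, not the paper's implementation, of the first construction: a Gibbs-style non-stationary squared-exponential covariance whose length-scale varies with the input, together with the soft-thresholding proximal step through which ADMM typically handles an $L^1$ term. All names and settings here are hypothetical.

    import numpy as np

    def gibbs_kernel(x1, x2, ell):
        """Non-stationary squared-exponential covariance (Gibbs-style).

        x1, x2 : 1-D arrays of inputs.
        ell    : callable mapping inputs to positive length-scales; in the
                 hierarchical model this would come from a latent GP.
        """
        l1 = ell(x1)[:, None]
        l2 = ell(x2)[None, :]
        denom = l1**2 + l2**2
        prefactor = np.sqrt(2.0 * l1 * l2 / denom)
        sqdist = (x1[:, None] - x2[None, :])**2
        return prefactor * np.exp(-sqdist / denom)

    def soft_threshold(v, tau):
        """Proximal operator of tau * ||.||_1, i.e. the sparsity-inducing
        z-update of an ADMM iteration for an L^1-regularized problem."""
        return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

    # Example: a smoothly varying (here fixed, for illustration) length-scale.
    rng = np.random.default_rng()
    x = np.linspace(0.0, 10.0, 50)
    ell = lambda t: 0.5 + 0.4 * np.sin(0.3 * t)**2   # positive by construction
    K = gibbs_kernel(x, x, ell)                      # 50 x 50 covariance matrix
    z = soft_threshold(rng.standard_normal(50), 0.1) # one sparse ADMM iterate

In the paper's hierarchical model the length-scale field is itself modeled as a GP rather than the fixed function used above, and the full ADMM scheme alternates such a proximal step with a GP regression update.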
Related papers
- Deep Transformed Gaussian Processes [0.0]
Transformed Gaussian Processes (TGPs) are processes specified by transforming samples from the joint distribution of a prior process (typically a GP) using an invertible transformation.
We propose a generalization of TGPs named Deep Transformed Gaussian Processes (DTGPs), which follows the trend of concatenating layers of processes.
Experiments conducted evaluate the proposed DTGPs in multiple regression datasets, achieving good scalability and performance.
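A toy version of the underlying transformed-GP idea (not the deep architecture itself; parameters are hypothetical) is to warp a GP prior sample with an invertible map:

    import numpy as np

    def rbf(x1, x2, ell=1.0):
        return np.exp(-(x1[:, None] - x2[None, :])**2 / (2.0 * ell**2))

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 5.0, 100)
    K = rbf(x, x) + 1e-8 * np.eye(100)             # jitter for numerical stability
    f = rng.multivariate_normal(np.zeros(100), K)  # sample from the GP prior

    # Invertible sinh-arcsinh warp: a controls skewness, b controls tail weight.
    a, b = 0.5, 1.2
    g = np.sinh(b * np.arcsinh(f) - a)             # transformed, non-Gaussian sample

DTGPs stack such layers; this sketch shows only a single marginal transformation.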
arXiv Detail & Related papers (2023-10-27T16:09:39Z)
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
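The core update is easy to state; here is a minimal sketch (the map and step size are illustrative, not from the paper) of the interpolated, Krasnosel'skii-Mann-style iteration x_{k+1} = (1 - lam) x_k + lam T(x_k) for a nonexpansive T:

    import numpy as np

    def T(x):
        """A nonexpansive map: a plane rotation (1-Lipschitz, fixed point 0)."""
        c, s = np.cos(0.5), np.sin(0.5)
        return np.array([c * x[0] - s * x[1], s * x[0] + c * x[1]])

    x, lam = np.array([1.0, 1.0]), 0.5
    for _ in range(100):
        x = (1.0 - lam) * x + lam * T(x)   # average the iterate with T(x)
    # Iterating T alone would orbit forever; the averaged iteration
    # converges to the fixed point at the origin.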
arXiv Detail & Related papers (2023-10-20T12:45:12Z)
- Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits [21.353189917487512]
Stochastic gradient descent (SGD) and its variants have established themselves as the go-to algorithms for machine learning problems.
We take a step forward by proving minibatch SGD converges to a critical point of the full log-likelihood loss function.
Our theoretical guarantees hold provided that the kernel functions exhibit exponential or eigendecay.
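A toy rendering of the setting (the finite-difference gradients and all settings here are for illustration only; the paper's analysis concerns exact gradients):

    import numpy as np

    def nll(theta, x, y):
        """Negative log marginal likelihood of an RBF-kernel GP (up to a constant)."""
        ell, noise = np.exp(theta)   # log-parameterization keeps both positive
        K = np.exp(-(x[:, None] - x[None, :])**2 / (2 * ell**2)) + noise * np.eye(len(x))
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        return 0.5 * y @ alpha + np.log(np.diag(L)).sum()

    rng = np.random.default_rng(1)
    x_all = rng.uniform(0.0, 10.0, 500)
    y_all = np.sin(x_all) + 0.1 * rng.standard_normal(500)

    theta, lr, eps = np.array([0.0, -2.0]), 1e-2, 1e-5
    for step in range(200):
        idx = rng.choice(500, size=64, replace=False)      # draw a mini-batch
        xb, yb = x_all[idx], y_all[idx]
        grad = np.array([(nll(theta + eps * e, xb, yb) - nll(theta - eps * e, xb, yb))
                         / (2 * eps) for e in np.eye(2)])  # central differences
        theta -= lr * grad                                 # SGD step on the hyperparameters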
arXiv Detail & Related papers (2021-11-19T22:28:47Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
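A bare-bones elementwise ODE flow in that spirit (not the NGGP model; w, b, and the Euler discretization are illustrative):

    import numpy as np

    def ode_flow(z, w=1.0, b=0.0, steps=50, dt=0.02):
        """Integrate dz/dt = tanh(w z + b) componentwise with shared parameters."""
        for _ in range(steps):
            z = z + dt * np.tanh(w * z + b)   # forward Euler step
        return z

    def ode_flow_inverse(z, w=1.0, b=0.0, steps=50, dt=0.02):
        for _ in range(steps):
            z = z - dt * np.tanh(w * z + b)   # integrate backwards in time
        return z

    z = np.array([-1.0, 0.3, 2.0])
    z_back = ode_flow_inverse(ode_flow(z))    # approximately recovers z
                                              # (exactly, in the continuous-time limit)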
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds for RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
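A small sketch of that setup (dimensions and step size are hypothetical):

    import numpy as np

    rng = np.random.default_rng(2)
    n, d, N = 300, 20, 400                    # samples, input dim, number of features
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

    W = rng.standard_normal((d, N))           # random first layer, kept fixed
    Phi = np.maximum(X @ W, 0.0)              # ReLU random features

    a, lr = np.zeros(N), 1e-2                 # trainable second layer
    for step in range(2000):
        i = rng.integers(n)                   # single-sample SGD
        resid = Phi[i] @ a - y[i]
        a -= lr * resid * Phi[i]              # constant step-size update

Sweeping the ratio N/n while tracking test error is what exposes the double descent curve.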
arXiv Detail & Related papers (2021-10-13T17:47:39Z)
- Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform online prediction and model update with scalability, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
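The data-adaptive weighting can be caricatured as online Bayesian model averaging (the numbers below are illustrative, not IE-GP itself):

    import numpy as np

    def update_weights(w, means, variances, y):
        """Scale each expert's weight by its Gaussian predictive density at y."""
        lik = np.exp(-0.5 * (y - means)**2 / variances) / np.sqrt(2 * np.pi * variances)
        w = w * lik
        return w / w.sum()

    w = np.ones(3) / 3.0                              # three experts, uniform prior
    means = np.array([0.1, 0.5, 1.0])                 # per-expert predictive means
    variances = np.array([0.2, 0.2, 0.5])             # per-expert predictive variances
    w = update_weights(w, means, variances, y=0.45)   # mass shifts to the second expert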
arXiv Detail & Related papers (2021-10-13T15:11:25Z)
- Wasserstein-Splitting Gaussian Process Regression for Heterogeneous Online Bayesian Inference [9.7471390457395]
We employ variational free energy approximations of GPs operating in tandem with online expectation propagation steps.
We introduce a local splitting step which instantiates a new GP whenever the posterior distribution changes significantly.
Over time, this yields an ensemble of sparse GPs which may be updated incrementally.
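A splitting trigger of this kind can be illustrated with the closed-form 2-Wasserstein distance between one-dimensional Gaussians (the threshold here is hypothetical):

    import numpy as np

    def w2_gauss(m1, v1, m2, v2):
        """2-Wasserstein distance between N(m1, v1) and N(m2, v2) in 1-D."""
        return np.sqrt((m1 - m2)**2 + (np.sqrt(v1) - np.sqrt(v2))**2)

    # Compare the previous and current posterior marginals at a point.
    if w2_gauss(0.0, 1.0, 1.5, 0.8) > 1.0:
        print("posterior drifted substantially: instantiate a new local GP")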
arXiv Detail & Related papers (2021-07-26T17:52:46Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- No-Regret Algorithms for Time-Varying Bayesian Optimization [0.0]
We adopt the general variation budget model to capture the time-varying environment.
We introduce two GP-UCB type algorithms, R-GP-UCB and SW-GP-UCB, based on restarting and on a sliding window, respectively.
Our results not only recover previous linear bandit results when a linear kernel is used, but also complement the previous regret analysis of time-varying Gaussian process bandits.
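A compact sketch of the sliding-window variant (window size, beta, and kernel are hypothetical):

    import numpy as np

    def rbf(a, b, ell=1.0):
        return np.exp(-(a[:, None] - b[None, :])**2 / (2 * ell**2))

    def sw_gp_ucb(x_hist, y_hist, candidates, W=50, beta=2.0, noise=1e-2):
        """Pick the next query point from a GP fit only to the last W observations."""
        xs, ys = x_hist[-W:], y_hist[-W:]            # discard stale data
        K = rbf(xs, xs) + noise * np.eye(len(xs))
        Ks = rbf(candidates, xs)
        mean = Ks @ np.linalg.solve(K, ys)
        V = np.linalg.solve(K, Ks.T)                 # K^{-1} Ks^T
        var = 1.0 - np.einsum('ij,ji->i', Ks, V)     # diag(Kss - Ks K^{-1} Ks^T)
        return candidates[np.argmax(mean + beta * np.sqrt(np.maximum(var, 0.0)))]

    rng = np.random.default_rng(5)
    x_hist = rng.uniform(0.0, 1.0, 200)
    y_hist = np.sin(6.0 * x_hist) + 0.1 * rng.standard_normal(200)
    x_next = sw_gp_ucb(x_hist, y_hist, np.linspace(0.0, 1.0, 101))

The restarting variant instead refits the model periodically; both mechanisms bound the influence of outdated observations.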
arXiv Detail & Related papers (2021-02-11T22:35:32Z)
- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
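The flavor of the feature expansion can be seen in the classical random Fourier construction (the paper replaces the Monte Carlo frequencies with deterministic quadrature nodes; sizes here are hypothetical):

    import numpy as np

    rng = np.random.default_rng(3)
    d, m = 2, 256                                 # input dimension, number of features
    W = rng.standard_normal((m, d))               # frequencies for a unit-scale RBF kernel
    b = rng.uniform(0.0, 2.0 * np.pi, m)          # random phases

    def phi(x):
        """Feature map with phi(x) @ phi(y) ~ exp(-||x - y||^2 / 2)."""
        return np.sqrt(2.0 / m) * np.cos(x @ W.T + b)

    x, y = rng.standard_normal(d), rng.standard_normal(d)
    approx = phi(x) @ phi(y)                      # close to the exact kernel value below
    exact = np.exp(-np.sum((x - y)**2) / 2.0)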
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
- Sparse Orthogonal Variational Inference for Gaussian Processes [34.476453597078894]
We introduce a new interpretation of sparse variational approximations for Gaussian processes using inducing points.
We show that this formulation recovers existing approximations and at the same time allows us to obtain tighter lower bounds on the marginal likelihood and new variational inference algorithms.
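A bare-bones inducing-point predictor conveys the computational point, an m x m system in place of an n x n one (this is a plain Nystrom/Titsias-style mean, not the paper's orthogonal decomposition; all settings are hypothetical):

    import numpy as np

    def rbf(a, b, ell=1.0):
        return np.exp(-(a[:, None] - b[None, :])**2 / (2 * ell**2))

    rng = np.random.default_rng(4)
    x = rng.uniform(0.0, 10.0, 200)
    y = np.sin(x) + 0.1 * rng.standard_normal(200)
    z = np.linspace(0.0, 10.0, 15)             # m = 15 inducing inputs, n = 200 data

    noise = 0.01
    Kzz = rbf(z, z) + 1e-6 * np.eye(15)
    Kzx = rbf(z, x)
    A = Kzz + Kzx @ Kzx.T / noise              # 15 x 15 instead of 200 x 200
    mu_z = Kzz @ np.linalg.solve(A, Kzx @ y) / noise    # optimal variational mean at z

    x_test = np.linspace(0.0, 10.0, 5)
    mean = rbf(x_test, z) @ np.linalg.solve(Kzz, mu_z)  # predictive mean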
arXiv Detail & Related papers (2019-10-23T15:01:28Z)