Gaussian Processes with Skewed Laplace Spectral Mixture Kernels for
Long-term Forecasting
- URL: http://arxiv.org/abs/2011.03974v3
- Date: Sat, 2 Oct 2021 04:33:28 GMT
- Title: Gaussian Processes with Skewed Laplace Spectral Mixture Kernels for
Long-term Forecasting
- Authors: Kai Chen, Twan van Laarhoven, Elena Marchiori
- Abstract summary: Long-term forecasting involves predicting a horizon that is far ahead of the last observation.
We propose to model spectral densities with a skewed Laplace spectral mixture (SLSM), whose skewed peaks, sparsity, non-smoothness, and heavy tails match the empirical spectra of such signals.
In addition, we adapt the lottery ticket method, originally developed to prune weights of a neural network, to GPs in order to automatically select the number of kernel components.
- Score: 11.729971911409637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-term forecasting involves predicting a horizon that is far ahead of the
last observation. It is a problem of high practical relevance, for instance
for companies deciding upon expensive long-term investments. Despite the
recent progress and success of Gaussian processes (GPs) based on spectral
mixture kernels, long-term forecasting remains a challenging problem for these
kernels because they decay exponentially at large horizons. This is mainly due
to their use of a mixture of Gaussians to model spectral densities.
Characteristics of the signal important for long-term forecasting can be
unravelled by investigating the distribution of the Fourier coefficients of
(the training part of) the signal, which is non-smooth, heavy-tailed, sparse,
and skewed. The heavy tail and skewness of such distributions in the spectral
domain make it possible to capture long-range covariance of the signal in the
time domain. Motivated by these observations, we propose to model spectral
densities using a skewed Laplace spectral mixture (SLSM), whose skewed peaks,
sparsity, non-smoothness, and heavy tails match these characteristics. By
applying the inverse Fourier transform to this spectral density, we obtain a new
GP kernel for long-term forecasting. In addition, we adapt the lottery ticket
method, originally developed to prune weights of a neural network, to GPs in
order to automatically select the number of kernel components. Results of
extensive experiments, including one on a multivariate time series, show the
beneficial effect of the proposed SLSM kernel on long-term extrapolation and
its robustness to the choice of the number of mixture components.
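The contrast between the two spectral models can be made concrete. Below is a minimal sketch (not the authors' code) comparing the standard spectral mixture (SM) kernel, whose Gaussian spectral density yields a Gaussian-damped cosine in the time domain, with a simplified symmetric Laplace spectral mixture, whose Cauchy-type time-domain form decays only polynomially. The paper's SLSM additionally skews each Laplace peak, which is omitted here; all function names and parameter values are illustrative assumptions.

```python
import numpy as np

def sm_kernel(tau, weights, means, scales):
    """Standard spectral mixture (SM) kernel: a mixture of Gaussians in the
    spectral domain gives a Gaussian-damped cosine in the time domain, so the
    covariance decays like exp(-tau^2) at large horizons."""
    tau = np.asarray(tau, dtype=float)
    k = np.zeros_like(tau)
    for w, mu, sigma in zip(weights, means, scales):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * sigma**2) * np.cos(2.0 * np.pi * mu * tau)
    return k

def laplace_sm_kernel(tau, weights, means, scales):
    """Simplified SYMMETRIC Laplace spectral mixture: a mixture of Laplace
    densities in the spectral domain. Its inverse Fourier transform is a
    Cauchy-type (Lorentzian) damped cosine that decays only like 1/tau^2,
    preserving long-range covariance. The skewness term of the paper's
    SLSM kernel is omitted in this sketch."""
    tau = np.asarray(tau, dtype=float)
    k = np.zeros_like(tau)
    for w, mu, b in zip(weights, means, scales):
        k += w * np.cos(2.0 * np.pi * mu * tau) / (1.0 + 4.0 * np.pi**2 * b**2 * tau**2)
    return k

# At a long horizon the Gaussian-mixture kernel has vanished while the
# Laplace-mixture kernel still carries covariance mass.
tau = np.array([0.1, 1.0, 10.0])
print(sm_kernel(tau, [1.0], [0.5], [0.2]))         # effectively 0 at tau = 10
print(laplace_sm_kernel(tau, [1.0], [0.5], [0.2]))  # polynomial tail survives
```

Because the Laplace tail falls off polynomially in tau rather than as exp(-tau^2), covariance survives at horizons where the Gaussian mixture has already vanished, which is the long-range behavior the abstract attributes to heavy-tailed spectral densities.

The second contribution, adapting the lottery ticket method to select the number of kernel components, can be sketched in the same spirit. The pruning fraction, the smallest-weight criterion, the round schedule, and the `train_fn` interface below are all assumptions for illustration, not the paper's exact procedure.

```python
def lottery_ticket_components(init_params, train_fn, prune_frac=0.5, rounds=3):
    """Lottery-ticket-style selection of mixture components (illustrative).
    init_params: {component_id: {"w": weight, "mu": mean, "sigma": scale}}
    train_fn:    optimizes the hyperparameters of the given components, e.g.
                 by maximizing the GP log marginal likelihood, and returns them.
    """
    alive = set(init_params)
    for _ in range(rounds):
        # Retrain surviving components from their ORIGINAL initialization,
        # mirroring the "rewinding" step of lottery ticket pruning.
        trained = train_fn({q: dict(init_params[q]) for q in alive})
        # Prune the components whose trained mixture weights are smallest.
        ranked = sorted(alive, key=lambda q: trained[q]["w"])
        n_prune = int(len(ranked) * prune_frac)
        if n_prune == 0 or len(alive) - n_prune < 1:
            break
        alive -= set(ranked[:n_prune])
    # The "winning ticket": the surviving components, retrained from scratch.
    return train_fn({q: dict(init_params[q]) for q in alive})
```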
Related papers
- Introducing Spectral Attention for Long-Range Dependency in Time Series Forecasting [8.458068118782519]
Recent linear and transformer-based forecasters have shown superior performance in time series forecasting.
However, they are constrained by their inherent inability to effectively address long-range dependencies in time series data.
We introduce a fast and effective Spectral Attention mechanism, which preserves temporal correlations among samples.
arXiv Detail & Related papers (2024-10-28T06:17:20Z)
- Long-Term Prediction Accuracy Improvement of Data-Driven Medium-Range Global Weather Forecast [5.284452133959932]
A universal neural operator called the Spherical Harmonic Neural Operator (SHNO) is introduced to improve long-term iterative forecasts.
SHNO uses the spherical harmonic basis to mitigate distortions for spherical data and uses gated residual spectral attention (GRSA) to correct spectral bias caused by spurious correlations across different scales.
Our findings highlight the benefits and potential of SHNO to improve the accuracy of long-term prediction.
arXiv Detail & Related papers (2024-06-26T02:06:27Z)
- Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z)
- Proximal Algorithms for Accelerated Langevin Dynamics [57.08271964961975]
We develop a novel class of MCMC algorithms based on a stochastized Nesterov scheme.
We show superior performance of the proposed method over typical Langevin samplers for different models in statistics and image processing.
arXiv Detail & Related papers (2023-11-24T19:56:01Z)
- Jump-Diffusion Langevin Dynamics for Multimodal Posterior Sampling [3.4483987421251516]
We investigate the performance of a hybrid Metropolis and Langevin sampling method akin to Jump Diffusion on a range of synthetic and real data.
We find that careful calibration of mixing sampling jumps with gradient based chains significantly outperforms both pure gradient-based or sampling based schemes.
arXiv Detail & Related papers (2022-11-02T17:35:04Z)
- Resolving the Mixing Time of the Langevin Algorithm to its Stationary Distribution for Log-Concave Sampling [34.66940399825547]
This paper characterizes the mixing time of the Langevin Algorithm to its stationary distribution.
We introduce a technique from the differential privacy literature to the sampling literature.
arXiv Detail & Related papers (2022-10-16T05:11:16Z)
- Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs).
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under stochastic gradient descent with momentum (SGDM).
arXiv Detail & Related papers (2022-06-29T19:03:10Z)
- Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise [64.85879194013407]
We prove the first high-probability results with logarithmic dependence on the confidence level for methods for solving monotone and structured non-monotone VIPs.
Our results match the best-known ones in the light-tails case and are novel for structured non-monotone problems.
In addition, we numerically validate that the gradient noise of many practical formulations is heavy-tailed and show that clipping improves the performance of SEG/SGDA.
arXiv Detail & Related papers (2022-06-02T15:21:55Z)
- Meta-Learning for Koopman Spectral Analysis with Short Time-series [49.41640137945938]
Existing methods require long time-series for training neural networks.
We propose a meta-learning method for estimating embedding functions from unseen short time-series.
We experimentally demonstrate that the proposed method achieves better performance in terms of eigenvalue estimation and future prediction.
arXiv Detail & Related papers (2021-02-09T07:19:19Z)
- Approximate Inference for Spectral Mixture Kernel [25.087829816206813]
We propose an approximate Bayesian inference for the spectral mixture kernel.
We optimize the variational parameters by applying a sampling-based variational inference to the derived evidence lower bound (ELBO) estimator.
The proposed inference combined with two strategies accelerates the convergence of the parameters and leads to better optimal parameters.
arXiv Detail & Related papers (2020-06-12T09:39:29Z)
- Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z)