Related papers: A Framework for Nonstationary Gaussian Processes with Neural Network Parameters

A Framework for Nonstationary Gaussian Processes with Neural Network Parameters

URL: http://arxiv.org/abs/2507.12262v1
Date: Wed, 16 Jul 2025 14:09:49 GMT
Title: A Framework for Nonstationary Gaussian Processes with Neural Network Parameters
Authors: Zachary James, Joseph Guinness,
Abstract summary: We propose a framework that uses nonstationary kernels whose parameters vary across the feature space, modeling these parameters as the output of a neural network.<n>Our method clearly describes the behavior of the nonstationary parameters and is compatible with approximation methods for scaling to large datasets.<n>We test a nonstationary variance and noise variant of our method on several machine learning datasets and find that it achieves better accuracy and log-score than both a stationary model and a hierarchical model approximated with variational inference.
Score: 0.8057006406834466
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Gaussian processes have become a popular tool for nonparametric regression because of their flexibility and uncertainty quantification. However, they often use stationary kernels, which limit the expressiveness of the model and may be unsuitable for many datasets. We propose a framework that uses nonstationary kernels whose parameters vary across the feature space, modeling these parameters as the output of a neural network that takes the features as input. The neural network and Gaussian process are trained jointly using the chain rule to calculate derivatives. Our method clearly describes the behavior of the nonstationary parameters and is compatible with approximation methods for scaling to large datasets. It is flexible and easily adapts to different nonstationary kernels without needing to redesign the optimization procedure. Our methods are implemented with the GPyTorch library and can be readily modified. We test a nonstationary variance and noise variant of our method on several machine learning datasets and find that it achieves better accuracy and log-score than both a stationary model and a hierarchical model approximated with variational inference. Similar results are observed for a model with only nonstationary variance. We also demonstrate our approach's ability to recover the nonstationary parameters of a spatial dataset.

Related papers

Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems. Such problems are encountered in medicine, physics, and machine learning. We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.<n>As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
Neural Operator Variational Inference based on Regularized Stein Discrepancy for Deep Gaussian Processes [23.87733307119697]
We introduce Neural Operator Variational Inference (NOVI) for Deep Gaussian Processes. NOVI uses a neural generator to obtain a sampler and minimizes the Regularized Stein Discrepancy in L2 space between the generated distribution and true posterior. We demonstrate that the bias introduced by our method can be controlled by multiplying the divergence with a constant, which leads to robust error control and ensures the stability and precision of the algorithm.
arXiv Detail & Related papers (2023-09-22T06:56:35Z)
Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning [2.5782420501870296]
This paper proposes a voice conversion method that works with limited data. It is based on variational deep kernel learning (SVDKL) It is possible to estimate non-smooth and more complex functions.
arXiv Detail & Related papers (2023-09-08T16:32:47Z)
Efficient Large-scale Nonstationary Spatial Covariance Function Estimation Using Convolutional Neural Networks [3.5455896230714194]
We use ConvNets to derive subregions from the nonstationary data. We employ a selection mechanism to identify subregions that exhibit similar behavior to stationary fields. We assess the performance of the proposed method with synthetic and real datasets at a large scale.
arXiv Detail & Related papers (2023-06-20T12:17:46Z)
Low-rank extended Kalman filtering for online learning of neural networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream. The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix. In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
arXiv Detail & Related papers (2023-05-31T03:48:49Z)
Faithful Heteroscedastic Regression with Neural Networks [2.2835610890984164]
Parametric methods that employ neural networks for parameter maps can capture complex relationships in the data. We make two simple modifications to optimization to produce a heteroscedastic model with mean estimates that are provably as accurate as those from its homoscedastic counterpart. Our approach provably retains the accuracy of an equally flexible mean-only model while also offering best-in-class variance calibration.
arXiv Detail & Related papers (2022-12-18T22:34:42Z)
RFFNet: Large-Scale Interpretable Kernel Methods via Random Fourier Features [3.0079490585515347]
We introduce RFFNet, a scalable method that learns the kernel relevances' on the fly via first-order optimization. We show that our approach has a small memory footprint and run-time, low prediction error, and effectively identifies relevant features. We supply users with an efficient, PyTorch-based library, that adheres to the scikit-learn standard API and code for fully reproducing our results.
arXiv Detail & Related papers (2022-11-11T18:50:34Z)
FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support. The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG) Results show that the proposed approach leads to an improved estimation of pattern latency than the state-of-the-art.
arXiv Detail & Related papers (2022-10-10T12:35:02Z)
Off-the-grid learning of mixtures from a continuous dictionary [2.774897240515734]
We consider a general non-linear model where the signal is a finite mixture of unknown, possibly increasing, number of features issued from a continuous dictionary parameterized by a real non-linear parameter.<n>We propose an off-the-grid optimization method, that is, a method which does not use any discretization scheme on the parameter space.<n>We establish convergence rates that quantify with high probability the quality of estimation for both the linear and the non-linear parameters.
arXiv Detail & Related papers (2022-06-29T07:55:20Z)
Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of dynamic programming (RDP) randomized for scaling structured models to tens of thousands of latent states. Our method is widely applicable to classical DP-based inference. It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability. We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections. Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.