Spherical Inducing Features for Orthogonally-Decoupled Gaussian
Processes
- URL: http://arxiv.org/abs/2304.14034v2
- Date: Thu, 15 Jun 2023 11:51:43 GMT
- Title: Spherical Inducing Features for Orthogonally-Decoupled Gaussian
Processes
- Authors: Louis C. Tiao, Vincent Dutordoir, Victor Picheny
- Abstract summary: Gaussian processes (GPs) are often compared unfavorably to deep neural networks (NNs) for lacking the ability to learn representations.
Recent efforts to bridge the gap between GPs and deep NNs have yielded a new class of inter-domain variational GPs in which the inducing variables correspond to hidden units of a feedforward NN.
- Score: 7.4468224549568705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite their many desirable properties, Gaussian processes (GPs) are often
compared unfavorably to deep neural networks (NNs) for lacking the ability to
learn representations. Recent efforts to bridge the gap between GPs and deep
NNs have yielded a new class of inter-domain variational GPs in which the
inducing variables correspond to hidden units of a feedforward NN. In this
work, we examine some practical issues associated with this approach and
propose an extension that leverages the orthogonal decomposition of GPs to
mitigate these limitations. In particular, we introduce spherical inter-domain
features to construct more flexible data-dependent basis functions for both the
principal and orthogonal components of the GP approximation and show that
incorporating NN activation features under this framework not only alleviates
these shortcomings but is more scalable than alternative strategies.
Experiments on multiple benchmark datasets demonstrate the effectiveness of our
approach.
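As a rough illustration of the orthogonal decomposition the abstract refers to, the following is a minimal NumPy sketch of a decoupled sparse-GP posterior mean: a "principal" component built from one set of inducing inputs and an "orthogonal" component built from a second set, with its projection onto the principal basis subtracted. The names (Z_u, Z_v, m_u, m_v) are illustrative placeholders, and a standard RBF kernel stands in for the paper's spherical NN-activation inter-domain features; this is a sketch of the general idea, not the authors' implementation.

```python
import numpy as np


def rbf_kernel(X, Z, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix k(X, Z)."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)


def decoupled_predictive_mean(Xs, Z_u, Z_v, m_u, m_v, kernel=rbf_kernel):
    """Posterior mean of an orthogonally-decoupled sparse GP (illustrative sketch).

    The mean splits into a 'principal' part in the span of k(., Z_u) and an
    'orthogonal' part built from k(., Z_v) with its projection onto the
    principal basis removed.
    """
    K_uu = kernel(Z_u, Z_u) + 1e-6 * np.eye(len(Z_u))  # jitter for stability
    K_su = kernel(Xs, Z_u)
    K_sv = kernel(Xs, Z_v)
    K_uv = kernel(Z_u, Z_v)
    principal = K_su @ np.linalg.solve(K_uu, m_u)
    residual_basis = K_sv - K_su @ np.linalg.solve(K_uu, K_uv)
    orthogonal = residual_basis @ m_v
    return principal + orthogonal


# Toy usage: 5 principal and 3 orthogonal inducing inputs in 1-D.
rng = np.random.default_rng(0)
Xs = np.linspace(-3.0, 3.0, 50)[:, None]
Z_u, Z_v = rng.uniform(-3, 3, (5, 1)), rng.uniform(-3, 3, (3, 1))
m_u, m_v = rng.normal(size=5), rng.normal(size=3)
mean = decoupled_predictive_mean(Xs, Z_u, Z_v, m_u, m_v)
```

In the paper's setting the principal basis functions would come from NN-activation inter-domain features rather than kernel evaluations at pseudo-inputs; the decomposition structure above is what makes the two components complementary.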
Related papers
- Joint Diffusion Processes as an Inductive Bias in Sheaf Neural Networks [14.224234978509026]
Sheaf Neural Networks (SNNs) naturally extend Graph Neural Networks (GNNs).
We propose two novel sheaf learning approaches that provide a more intuitive understanding of the involved structure maps.
In our evaluation, we show the limitations of the real-world benchmarks used so far on SNNs.
arXiv Detail & Related papers (2024-07-30T07:17:46Z)
- RoPINN: Region Optimized Physics-Informed Neural Networks [66.38369833561039]
Physics-informed neural networks (PINNs) have been widely applied to solve partial differential equations (PDEs).
This paper proposes and theoretically studies a new training paradigm as region optimization.
A practical training algorithm, Region Optimized PINN (RoPINN), is seamlessly derived from this new paradigm.
arXiv Detail & Related papers (2024-05-23T09:45:57Z)
- Active Learning with Fully Bayesian Neural Networks for Discontinuous and Nonstationary Data [0.0]
We introduce fully Bayesian Neural Networks (FBNNs) for active learning tasks in the 'small data' regime.
FBNNs provide reliable predictive distributions, crucial for making informed decisions under uncertainty in the active learning setting.
Here, we assess the suitability and performance of FBNNs with the No-U-Turn Sampler for active learning tasks in the 'small data' regime.
arXiv Detail & Related papers (2024-05-16T05:20:47Z)
- Gaussian Process Neural Additive Models [3.7969209746164325]
We propose a new subclass of Neural Additive Models (NAMs) that uses a single-layer neural network construction of the Gaussian process via random Fourier features.
GP-NAMs have the advantage of a convex objective function and a number of trainable parameters that grows linearly with feature dimensionality.
We show that GP-NAM achieves comparable or better performance in both classification and regression tasks with a large reduction in the number of parameters (a toy random Fourier feature construction is sketched after this list).
arXiv Detail & Related papers (2024-02-19T20:29:34Z)
- Neural Networks Asymptotic Behaviours for the Resolution of Inverse Problems [0.0]
This paper presents a study of the effectiveness of Neural Network (NN) techniques for deconvolution inverse problems.
We consider NNs limits, corresponding to Gaussian Processes (GPs), where non-linearities in the parameters of the NN can be neglected.
We address the deconvolution inverse problem in the case of a quantum harmonic oscillator simulated through Monte Carlo techniques on a lattice.
arXiv Detail & Related papers (2024-02-14T17:42:24Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- On Connections between Regularizations for Improving DNN Robustness [67.28077776415724]
This paper analyzes regularization terms proposed recently for improving the adversarial robustness of deep neural networks (DNNs).
We study possible connections between several effective methods, including input-gradient regularization, Jacobian regularization, curvature regularization, and a cross-Lipschitz functional.
arXiv Detail & Related papers (2020-07-04T23:43:32Z)
- Infinite attention: NNGP and NTK for deep attention networks [38.55012122588628]
We identify an equivalence between wide neural networks (NNs) and Gaussian processes (GPs).
We show that unlike single-head attention, which induces non-Gaussian behaviour, multi-head attention architectures behave as GPs as the number of heads tends to infinity.
We introduce new features to the Neural Tangents library allowing applications of NNGP/NTK models, with and without attention, to variable-length sequences.
arXiv Detail & Related papers (2020-06-18T13:57:01Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
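As referenced in the Gaussian Process Neural Additive Models entry above, here is a minimal, generic sketch of a random Fourier feature construction for a single additive component: a one-layer random-feature expansion followed by ridge (Bayesian linear) regression, which approximates a GP over that feature. All names and hyperparameters are illustrative placeholders and do not reproduce that paper's implementation.

```python
import numpy as np


def random_fourier_features(x, n_features=64, lengthscale=1.0, seed=0):
    """Random Fourier features approximating an RBF kernel on a scalar input.

    k(x, x') ~= phi(x) @ phi(x'), with phi(x) = sqrt(2/D) * cos(w * x + b),
    where w ~ N(0, 1/lengthscale^2) and b ~ Uniform(0, 2*pi).
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, 1.0 / lengthscale, size=n_features)
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(np.outer(x, w) + b)


# One additive component: ridge regression on the random-feature basis,
# i.e. the posterior mean of an approximate GP over this single feature.
x = np.linspace(-3.0, 3.0, 100)
y = np.sin(2.0 * x) + 0.1 * np.random.default_rng(1).normal(size=x.shape)
Phi = random_fourier_features(x)              # (N, D) feature matrix
A = Phi.T @ Phi + 0.1 * np.eye(Phi.shape[1])  # noise/prior precision term
weights = np.linalg.solve(A, Phi.T @ y)       # posterior mean weights
f_mean = Phi @ weights                        # approximate GP posterior mean
```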
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.