From Shallow Bayesian Neural Networks to Gaussian Processes: General Convergence, Identifiability and Scalable Inference
- URL: http://arxiv.org/abs/2602.22492v1
- Date: Thu, 26 Feb 2026 00:02:54 GMT
- Title: From Shallow Bayesian Neural Networks to Gaussian Processes: General Convergence, Identifiability and Scalable Inference
- Authors: Gracielle Antunes de Araújo, Flávio B. Gonçalves
- Abstract summary: We study scaling limits of shallow Bayesian neural networks (BNNs) via their connection to Gaussian processes (GPs). We first establish a general convergence result from BNNs to GPs by relaxing assumptions used in prior formulations, and we compare alternative parameterizations of the limiting GP model. We characterize key properties of the proposed covariance function, including positive definiteness and both strict and practical identifiability under different input designs. For computation, we develop a scalable maximum a posteriori (MAP) training and prediction procedure using a Nyström approximation, and we show how the Nyström rank and anchor selection control the cost-accuracy trade-off.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we study scaling limits of shallow Bayesian neural networks (BNNs) via their connection to Gaussian processes (GPs), with an emphasis on statistical modeling, identifiability, and scalable inference. We first establish a general convergence result from BNNs to GPs by relaxing assumptions used in prior formulations, and we compare alternative parameterizations of the limiting GP model. Building on this theory, we propose a new covariance function defined as a convex mixture of components induced by four widely used activation functions, and we characterize key properties including positive definiteness and both strict and practical identifiability under different input designs. For computation, we develop a scalable maximum a posteriori (MAP) training and prediction procedure using a Nyström approximation, and we show how the Nyström rank and anchor selection control the cost-accuracy trade-off. Experiments on controlled simulations and real-world tabular datasets demonstrate stable hyperparameter estimates and competitive predictive performance at realistic computational cost.
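To make the computational recipe in the abstract concrete, here is a minimal sketch (not the authors' code) of the two ingredients it names: a covariance built as a convex mixture of kernels induced by activation functions, and Nyström-approximate GP prediction whose cost is governed by the rank m and the choice of anchor points. The paper mixes four activation-induced components; for brevity this sketch uses two with known closed forms (the arc-cosine kernel of Cho & Saul for ReLU, and Williams' kernel for erf), and the mixture weights, noise level, and anchor choice below are all assumptions.

```python
import numpy as np

def arccos1_kernel(X, Z):
    """Arc-cosine kernel of order 1 (Cho & Saul 2009): the GP limit of a
    shallow ReLU network with standard-Gaussian weights."""
    nx = np.linalg.norm(X, axis=1)[:, None]
    nz = np.linalg.norm(Z, axis=1)[None, :]
    cos = np.clip(X @ Z.T / (nx * nz + 1e-12), -1.0, 1.0)
    theta = np.arccos(cos)
    return (nx * nz / np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

def erf_kernel(X, Z, sw2=1.0):
    """Williams' (1998) closed-form kernel for erf activations, with the
    bias absorbed by augmenting inputs with a constant coordinate."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    Za = np.hstack([Z, np.ones((Z.shape[0], 1))])
    num = 2.0 * sw2 * (Xa @ Za.T)
    dx = np.sqrt(1.0 + 2.0 * sw2 * np.sum(Xa ** 2, axis=1))[:, None]
    dz = np.sqrt(1.0 + 2.0 * sw2 * np.sum(Za ** 2, axis=1))[None, :]
    return (2.0 / np.pi) * np.arcsin(np.clip(num / (dx * dz), -1.0, 1.0))

def mixture_kernel(X, Z, w=(0.5, 0.5)):
    """A convex mixture of positive-definite kernels is positive definite;
    two components here stand in for the paper's four."""
    return w[0] * arccos1_kernel(X, Z) + w[1] * erf_kernel(X, Z)

def nystrom_gp_mean(Xtr, ytr, Xte, anchors, sigma2=0.1, jitter=1e-8):
    """Approximate GP predictive mean with K ~= K_nm K_mm^{-1} K_mn.
    The Woodbury identity reduces the solve from O(n^3) to O(n m^2),
    where m = len(anchors) is the Nystrom rank."""
    m = anchors.shape[0]
    Kmm = mixture_kernel(anchors, anchors) + jitter * np.eye(m)
    L = np.linalg.cholesky(Kmm)
    A = np.linalg.solve(L, mixture_kernel(anchors, Xtr)).T   # n x m, A A^T ~= K
    As = np.linalg.solve(L, mixture_kernel(anchors, Xte)).T  # t x m
    inner = sigma2 * np.eye(m) + A.T @ A
    # (sigma2 I + A A^T)^{-1} y, computed via an m x m system only
    alpha = (ytr - A @ np.linalg.solve(inner, A.T @ ytr)) / sigma2
    return As @ (A.T @ alpha)
```

Larger m tightens the low-rank approximation at higher cost, which is the trade-off the abstract describes; anchor selection (for example, random subsets versus k-means centers) shifts where on that curve a given budget lands.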
Related papers
- Kolmogorov Arnold Networks and Multi-Layer Perceptrons: A Paradigm Shift in Neural Modelling [1.6998720690708842]
The research undertakes a comprehensive comparative analysis of Kolmogorov-Arnold Networks (KAN) and Multi-Layer Perceptrons (MLP). KANs utilize spline-based activation functions and grid-based structures, providing a transformative approach compared to traditional neural network frameworks. The study highlights the potential of KANs for advancing intelligent systems.
arXiv Detail & Related papers (2026-01-15T16:26:49Z) - Interpretable Neural Approximation of Stochastic Reaction Dynamics with Guaranteed Reliability [4.736119820998459]
We introduce DeepSKA, a neural framework that achieves interpretability, guaranteed reliability, and substantial computational gains. DeepSKA yields mathematically transparent representations that generalise across states, times, and output functions, and it integrates this structure with a small number of simulations to produce unbiased, provably convergent estimates with dramatically lower variance than classical Monte Carlo.
arXiv Detail & Related papers (2025-12-06T04:45:31Z) - Optimal Condition for Initialization Variance in Deep Neural Networks: An SGD Dynamics Perspective [0.0]
Stochastic gradient descent (SGD) is one of the most fundamental optimization algorithms in machine learning (ML). We study the relationship between the quasi-stationary distribution of the SGD dynamics and the initial distribution through the Kullback-Leibler (KL) divergence. We experimentally confirm our theoretical results by using classical SGD to train fully connected neural networks on the MNIST and Fashion-MNIST datasets.
arXiv Detail & Related papers (2025-08-18T11:18:12Z) - Bayesian Neural Scaling Law Extrapolation with Prior-Data Fitted Networks [100.13335639780415]
Scaling laws often follow a power law, and prior work has proposed several variants of power-law functions to predict scaling behavior at larger scales. Existing methods mostly rely on point estimation and do not quantify uncertainty, which is crucial for real-world applications. In this work, we explore a Bayesian framework based on Prior-data Fitted Networks (PFNs) for neural scaling law extrapolation.
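For readers unfamiliar with the power-law forms mentioned above, a minimal sketch on synthetic data (an assumed saturating form fit by least squares, not the paper's PFN-based Bayesian method):

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(N, a, b, c):
    # saturating power law: loss decays as N^{-b} toward an irreducible floor c
    return c + a * N ** (-b)

rng = np.random.default_rng(0)
N = np.logspace(5, 9, 20)                       # synthetic model/data sizes
L = power_law(N, 50.0, 0.3, 1.5) * (1.0 + 0.01 * rng.standard_normal(N.size))

(a, b, c), _ = curve_fit(power_law, N, L, p0=(10.0, 0.5, 1.0))
print(f"fitted a={a:.2f}, b={b:.3f}, c={c:.2f}")  # should be near (50, 0.3, 1.5)
```

A point fit like this yields no uncertainty on the extrapolated loss, which is the gap the Bayesian PFN approach above is meant to address.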
arXiv Detail & Related papers (2025-05-29T03:19:17Z) - NBMLSS: probabilistic forecasting of electricity prices via Neural Basis Models for Location Scale and Shape [44.99833362998488]
We deploy a Neural Basis Model for Location, Scale and Shape that blends the principled interpretability of GAMLSS with a computationally scalable shared basis decomposition. Experiments have been conducted on multiple market regions, achieving probabilistic forecasting performance comparable to that of distributional neural networks.
arXiv Detail & Related papers (2024-11-21T08:17:53Z) - Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy-to-interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on normalizing flows (NF).
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z) - MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z) - Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit [47.324627920761685]
We use recent theoretical advances that characterize the function-space prior of an ensemble of infinitely wide NNs as a Gaussian process.
This gives us a better understanding of the implicit prior NNs place on function space.
We also examine the calibration of previous approaches to classification with the NNGP.
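A minimal sketch of the NN-to-GP correspondence this summary refers to, using the standard ReLU NNGP kernel recursion in the style of Lee et al. (2018); the depth and variance hyperparameters below are assumptions, and depth 1 recovers the shallow-network regime studied in the main paper above.

```python
import numpy as np

def nngp_relu_kernel(x, z, depth=3, sw2=1.6, sb2=0.1):
    """NNGP kernel of a depth-`depth` ReLU network with weight variance
    sw2/fan_in and bias variance sb2, evaluated at inputs x and z."""
    d = len(x)
    kxx = sw2 * np.dot(x, x) / d + sb2
    kzz = sw2 * np.dot(z, z) / d + sb2
    kxz = sw2 * np.dot(x, z) / d + sb2
    for _ in range(depth):
        c = np.clip(kxz / np.sqrt(kxx * kzz), -1.0, 1.0)
        t = np.arccos(c)
        # closed form of E[relu(u) relu(v)] under the current layer covariance
        kxz = sw2 * np.sqrt(kxx * kzz) * (np.sin(t) + (np.pi - t) * np.cos(t)) / (2 * np.pi) + sb2
        kxx = sw2 * kxx / 2.0 + sb2
        kzz = sw2 * kzz / 2.0 + sb2
    return kxz
```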
arXiv Detail & Related papers (2020-10-14T18:41:54Z) - Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
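A minimal sketch of the "GLM predictive" idea on a toy two-parameter model (the model, MAP estimate, and posterior covariance are all assumptions, not the paper's setup): after a Laplace approximation at the MAP weights th_map, one predicts with the linearized model f_lin(x, th) = f(x, th_map) + J(x) (th - th_map), so the predictive mean is the MAP prediction and the predictive variance is J Sigma J^T.

```python
import numpy as np

def f(x, th):
    """Toy two-parameter 'network': th[1] * tanh(th[0] * x)."""
    return th[1] * np.tanh(th[0] * x)

def jacobian(x, th):
    """Analytic Jacobian of f with respect to th, evaluated at the MAP."""
    return np.stack([th[1] * x / np.cosh(th[0] * x) ** 2,
                     np.tanh(th[0] * x)], axis=-1)

th_map = np.array([1.2, 0.7])     # assumed MAP weights
Sigma = np.diag([0.05, 0.02])     # assumed Laplace (Gauss-Newton) covariance

x = np.linspace(-3.0, 3.0, 5)
J = jacobian(x, th_map)           # shape (5, 2)
mean = f(x, th_map)               # GLM predictive mean = prediction at the MAP
var = np.einsum('ij,jk,ik->i', J, Sigma, J)   # diagonal of J Sigma J^T
print(mean, var)
```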
arXiv Detail & Related papers (2020-08-19T12:35:55Z) - Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
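The "mean-field approximation formula" mentioned above targets a Gaussian-softmax integral; as a hedged analogue, here is the classic probit-style approximation for the binary (sigmoid) case, E[sigmoid(a)] ~ sigmoid(mu / sqrt(1 + pi*var/8)) for a ~ N(mu, var), checked against Monte Carlo. The paper's multi-class formula differs; this only illustrates the type of closed-form approximation involved.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field_sigmoid(mu, var):
    """Probit-style approximation to E[sigmoid(a)] for a ~ N(mu, var)."""
    return sigmoid(mu / np.sqrt(1.0 + np.pi * var / 8.0))

mu, var = 0.8, 2.0
samples = np.random.default_rng(0).normal(mu, np.sqrt(var), 100_000)
print(mean_field_sigmoid(mu, var), sigmoid(samples).mean())  # close agreement
```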
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.