Flow-Induced Diagonal Gaussian Processes
- URL: http://arxiv.org/abs/2509.17153v2
- Date: Thu, 02 Oct 2025 18:17:08 GMT
- Title: Flow-Induced Diagonal Gaussian Processes
- Authors: Moule Lin, Andrea Patane, Weipeng Jing, Shuhao Guan, Goetz Botterweck,
- Abstract summary: Flow-Induced Diagonal Gaussian Processes (FiD-GP) is a compression framework that incorporates a compact inducing weight matrix. We show how FiD-GP can help to design a single-pass projection for Out-of-Distribution (OoD) detection.
- Score: 7.720921989821054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Flow-Induced Diagonal Gaussian Processes (FiD-GP), a compression framework that incorporates a compact inducing weight matrix to project a neural network's weight uncertainty into a lower-dimensional subspace. Critically, FiD-GP relies on normalising-flow priors and spectral regularisation to augment its expressiveness and to align the inducing subspace with feature-gradient geometry through a numerically stable projection mechanism. Furthermore, we demonstrate how the prediction framework in FiD-GP can help to design a single-pass projection for Out-of-Distribution (OoD) detection. Our analysis shows that FiD-GP improves uncertainty estimation on various tasks compared with SVGP-based baselines, satisfies tight spectral residual bounds with theoretically guaranteed OoD detection, and significantly compresses the neural network's storage requirements, at the cost of increased inference computation dependent on the number of inducing weights employed. Specifically, in a comprehensive empirical study spanning regression, image classification, semantic segmentation, and out-of-distribution detection benchmarks, it cuts Bayesian training cost by several orders of magnitude, compresses parameters by roughly 51%, reduces model size by about 75%, and matches state-of-the-art accuracy and uncertainty estimation.
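The abstract's two central ideas, restricting weight uncertainty to the span of a compact inducing matrix and scoring OoD inputs by a projection residual in a single pass, can be illustrated with a small sketch. The numpy snippet below is a hypothetical illustration under assumed names (`U`, `sample_weights`, `ood_score`); it deliberately omits the normalising-flow priors and spectral regularisation that FiD-GP layers on top.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 4096, 64                        # full weight dimension, inducing dimension

# Hypothetical inducing subspace: an orthonormal basis U (d x m). In FiD-GP the
# subspace is learned and aligned with feature-gradient geometry; here it is random.
U, _ = np.linalg.qr(rng.standard_normal((d, m)))
mu_z = np.zeros(m)                     # posterior mean of the inducing weights
L_z = 0.1 * np.eye(m)                  # Cholesky factor of the inducing covariance

def sample_weights(w_mean, n_samples=8):
    # Weight uncertainty lives only in span(U): w = w_mean + U z, z ~ N(mu_z, L L^T).
    z = mu_z + rng.standard_normal((n_samples, m)) @ L_z.T
    return w_mean + z @ U.T            # (n_samples, d)

def ood_score(feature_grad):
    # Single-pass score: how much of the feature gradient falls outside span(U)?
    residual = feature_grad - U @ (U.T @ feature_grad)
    return np.linalg.norm(residual) / np.linalg.norm(feature_grad)
```

In-distribution inputs whose feature gradients lie close to the inducing subspace score low, while a large residual flags a potential outlier; storage shrinks from a d x d weight covariance to the m x m inducing factor, which is the qualitative source of the compression claim.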
Related papers
- Structure-Informed Estimation for Pilot-Limited MIMO Channels via Tensor Decomposition [51.56484100374058]
This paper formulates pilot-limited channel estimation as low-rank tensor completion from sparse observations. Experiments on synthetic channels demonstrate a 10-20 dB normalized mean-square error (NMSE) improvement over least squares (LS), and evaluations on DeepMIMO ray-tracing channels show a 24-44% additional NMSE reduction over pure tensor-based methods.
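For readers unfamiliar with the formulation, the generic mechanism of low-rank tensor completion from sparse observations can be sketched directly: fit a rank-R CP model to the observed entries only, then read the missing entries off the reconstruction. The snippet below is a real-valued toy fitted by gradient descent, not the paper's solver (MIMO channels are complex-valued, and the paper's decomposition and optimizer may differ).

```python
import numpy as np

def cp_reconstruct(A, B, C):
    # Full tensor from CP factors: T[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r].
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def masked_cp_completion(T_obs, mask, rank=4, lr=0.05, n_iters=2000, seed=0):
    """Fit a rank-`rank` CP model to the observed entries (mask == 1) of a
    3-way tensor by gradient descent on the masked squared error."""
    rng = np.random.default_rng(seed)
    I, J, K = T_obs.shape
    A = 0.1 * rng.standard_normal((I, rank))
    B = 0.1 * rng.standard_normal((J, rank))
    C = 0.1 * rng.standard_normal((K, rank))
    for _ in range(n_iters):
        resid = mask * (cp_reconstruct(A, B, C) - T_obs)  # error on observed entries only
        gA = np.einsum('ijk,jr,kr->ir', resid, B, C)
        gB = np.einsum('ijk,ir,kr->jr', resid, A, C)
        gC = np.einsum('ijk,ir,jr->kr', resid, A, B)
        A, B, C = A - lr * gA, B - lr * gB, C - lr * gC
    return cp_reconstruct(A, B, C)  # completed tensor; unobserved entries filled in
```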
arXiv Detail & Related papers (2026-02-03T23:38:05Z)
- An End-to-End Differentiable, Graph Neural Network-Embedded Pore Network Model for Permeability Prediction [0.42970700836450487]
Pore network models (PNMs) rely on simplified estimates of pore-scale hydraulic conductance, limiting their accuracy in complex structures. We present an end-to-end differentiable hybrid framework that embeds a graph neural network (GNN) into a PNM. The model achieves high accuracy and generalizes well across different scales, outperforming both pure data-driven and traditional PNM approaches.
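The ingredient that makes such a hybrid trainable end to end is a differentiable flow solver, so that a loss on permeability backpropagates into the GNN's predicted throat conductances. The PyTorch sketch below shows only that mechanism on a toy four-pore network; the function name, the graph, and the permeability proxy are assumptions for illustration, not the paper's implementation.

```python
import torch

def network_flow(edges, g, n, inlet, outlet, p_in=1.0, p_out=0.0):
    """Differentiable pore-network pressure solve. `g` holds throat conductances
    (in the hybrid model these would come from the GNN); the Dirichlet problem
    is solved with torch.linalg.solve, so gradients of the outflow reach `g`."""
    i, j = edges
    L = torch.zeros(n, n)
    L.index_put_((i, j), -g, accumulate=True)
    L.index_put_((j, i), -g, accumulate=True)
    L.index_put_((i, i), g, accumulate=True)
    L.index_put_((j, j), g, accumulate=True)
    fixed = torch.tensor([inlet, outlet])
    free = torch.tensor([k for k in range(n) if k not in (inlet, outlet)])
    p_fixed = torch.tensor([p_in, p_out])
    # Reduced linear system for the interior pressures under fixed boundary pressures.
    p_free = torch.linalg.solve(L[free][:, free], -L[free][:, fixed] @ p_fixed)
    p = torch.zeros(n)
    p[fixed], p[free] = p_fixed, p_free
    # Total flow out of the inlet: a simple permeability proxy.
    at_in_i, at_in_j = i == inlet, j == inlet
    return (g[at_in_i] * (p[inlet] - p[j[at_in_i]])).sum() + \
           (g[at_in_j] * (p[inlet] - p[i[at_in_j]])).sum()

# Toy example: four pores in a diamond, inlet 0 held at p=1, outlet 3 at p=0.
edges = torch.tensor([[0, 0, 1, 2],
                      [1, 2, 3, 3]])
g = torch.nn.Parameter(torch.ones(4))   # stand-in for GNN-predicted conductances
q = network_flow(edges, g, n=4, inlet=0, outlet=3)
q.backward()                            # gradients reach g, and hence a GNN upstream
```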
arXiv Detail & Related papers (2025-09-17T09:15:23Z)
- A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives [17.212481754312048]
Several emerging post-Bayesian methods target a probability distribution for which an entropy-regularised variational objective is minimised. This increased flexibility introduces a computational challenge, as one loses access to an explicit unnormalised density for the target. We introduce a novel measure of suboptimality, called 'gradient discrepancy', that can be explicitly computed.
arXiv Detail & Related papers (2025-09-12T16:38:41Z)
- On the Convergence of DP-SGD with Adaptive Clipping [56.24689348875711]
Stochastic gradient descent (SGD) with gradient clipping is a powerful technique for enabling differentially private optimization. This paper provides the first comprehensive convergence analysis of SGD with quantile clipping (QC-SGD). We show that QC-SGD suffers from a bias problem similar to constant-threshold clipped SGD, but that this can be mitigated through a carefully designed quantile and step-size schedule.
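The mechanism of quantile clipping is easy to state concretely: at each step, the clipping threshold is set to a quantile of the per-sample gradient norms instead of a fixed constant. The sketch below is a generic single step for least squares under assumed names, not the paper's exact algorithm or its privacy accounting.

```python
import numpy as np

def qc_sgd_step(w, X, y, lr=0.1, q=0.5, sigma=1.0, rng=None):
    """One illustrative quantile-clipped SGD step for least squares: clip
    per-sample gradients at the q-th quantile of their norms, then add
    Gaussian noise calibrated to that adaptive threshold."""
    rng = rng or np.random.default_rng()
    residual = X @ w - y
    per_sample_grads = residual[:, None] * X                    # (n, d)
    norms = np.linalg.norm(per_sample_grads, axis=1)
    tau = np.quantile(norms, q)                                 # adaptive clip threshold
    clipped = per_sample_grads * np.minimum(1.0, tau / (norms + 1e-12))[:, None]
    noise = rng.normal(0.0, sigma * tau, size=w.shape)          # noise scaled to tau
    return w - lr * (clipped.sum(axis=0) + noise) / len(y)
```

The bias mentioned in the summary arises because a data-dependent threshold systematically shrinks large gradients; the paper's quantile and step-size schedule is what mitigates it.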
arXiv Detail & Related papers (2024-12-27T20:29:47Z)
- Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy-to-interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on normalizing flows (NF).
It also offers theoretical guarantees based on local consistency results.
This work should help the design of better-specified models or drive the development of novel SBI algorithms.
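To convey the flavour of such diagnostics, the sketch below implements a generic rank-based calibration check for a scalar parameter (simulation-based calibration). It is a stand-in under my own assumptions: the paper's diagnostics are multi-dimensional, exploit the flow structure, and come with local consistency guarantees.

```python
import numpy as np

def sbc_ranks(prior_sample, simulate, posterior_sample, n_runs=500, n_draws=100):
    """Simulation-based calibration for a scalar parameter: draw theta* from the
    prior, simulate data, then rank theta* among draws from the estimated
    posterior. A calibrated estimator yields uniform ranks on {0, ..., n_draws}."""
    ranks = []
    for _ in range(n_runs):
        theta_star = prior_sample()
        x = simulate(theta_star)
        draws = posterior_sample(x, n_draws)       # e.g. samples from the flow
        ranks.append(int(np.sum(draws < theta_star)))
    return np.array(ranks)

# Toy usage with a conjugate Gaussian model, where the exact posterior is known:
rng = np.random.default_rng(0)
ranks = sbc_ranks(
    prior_sample=lambda: rng.normal(0.0, 1.0),
    simulate=lambda th: th + rng.normal(0.0, 1.0),
    posterior_sample=lambda x, n: rng.normal(x / 2.0, np.sqrt(0.5), n),
)
```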
arXiv Detail & Related papers (2022-11-17T15:48:06Z)
- On the optimization and pruning for Bayesian deep learning [1.0152838128195467]
We propose a new adaptive variational Bayesian algorithm to train neural networks in weight space.
The EM-MCMC algorithm allows us to perform optimization and model pruning in one shot.
Our dense model reaches state-of-the-art performance, and our sparse model performs very well compared to previously proposed pruning schemes.
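A common way a variational posterior over weights enables pruning is a signal-to-noise rule: drop weights whose posterior mean is small relative to its uncertainty. The sketch below shows that generic rule under assumed names; the paper's EM-MCMC procedure is more involved and may use a different criterion.

```python
import numpy as np

def snr_prune_mask(mu, sigma, keep_fraction=0.5):
    """Keep the weights with the highest posterior signal-to-noise ratio
    |mu| / sigma; everything below the quantile threshold is pruned to zero."""
    snr = np.abs(mu) / (sigma + 1e-12)
    threshold = np.quantile(snr, 1.0 - keep_fraction)
    return snr >= threshold

# Example: prune half of a layer's weights given a mean-field Gaussian posterior.
rng = np.random.default_rng(0)
mu, sigma = rng.standard_normal(1000), 0.1 + rng.random(1000)
mask = snr_prune_mask(mu, sigma, keep_fraction=0.5)
sparse_mu = np.where(mask, mu, 0.0)   # pruned (sparse) weight means
```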
arXiv Detail & Related papers (2022-10-24T05:18:08Z)
- Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks [59.142826407441106]
We study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability.
We consider gradient descent (GD) and stochastic gradient descent (SGD) to train SNNs, for both of which we develop consistent excess risk bounds.
arXiv Detail & Related papers (2022-09-19T18:48:00Z)
- Design of Compressed Sensing Systems via Density-Evolution Framework for Structure Recovery in Graphical Models [10.667885727418705]
It has been shown that learning the structure of Bayesian networks from observational data is an NP-Hard problem.
We propose a novel density-evolution based framework for optimizing compressed linear measurement systems.
We show that the structure of Gaussian Bayesian networks (GBN) can indeed be recovered from the resulting compressed measurements.
arXiv Detail & Related papers (2022-03-17T22:16:38Z)
- Monocular Depth Estimation Primed by Salient Point Detection and Normalized Hessian Loss [43.950140695759764]
We propose an accurate and lightweight framework for monocular depth estimation based on a self-attention mechanism stemming from salient point detection.
We introduce a normalized Hessian loss term invariant to scaling and shear along the depth direction, which is shown to substantially improve the accuracy.
The proposed method achieves state-of-the-art results on NYU-Depth-v2 and KITTI while using a 3.1-38.4 times smaller model, in terms of parameter count, than baseline approaches.
arXiv Detail & Related papers (2021-08-25T07:51:09Z)
- Precise characterization of the prior predictive distribution of deep ReLU networks [45.46732383818331]
We derive a precise characterization of the prior predictive distribution of finite-width ReLU networks with Gaussian weights.
Our results provide valuable guidance on prior design, for instance, controlling the predictive variance with depth- and width-informed priors on the weights of the network.
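Although the paper's characterization is analytic, the object itself is easy to pin down numerically, which makes a useful reference point: sample networks from the prior and evaluate them at a fixed input. The sketch below does this for a finite-width ReLU network with fan-in-scaled Gaussian weights; the widths and scale parameters are arbitrary choices, not the paper's.

```python
import numpy as np

def prior_predictive_samples(x, widths=(2, 64, 64, 1), n_samples=10000,
                             sigma_w=1.0, sigma_b=0.1, seed=0):
    """Draw f(x) for random ReLU networks with fan-in-scaled Gaussian weights."""
    rng = np.random.default_rng(seed)
    outs = np.empty(n_samples)
    for s in range(n_samples):
        h = np.asarray(x, dtype=float)
        for l, (fan_in, fan_out) in enumerate(zip(widths[:-1], widths[1:])):
            W = rng.normal(0.0, sigma_w / np.sqrt(fan_in), (fan_out, fan_in))
            b = rng.normal(0.0, sigma_b, fan_out)
            h = W @ h + b
            if l < len(widths) - 2:        # ReLU on hidden layers only
                h = np.maximum(h, 0.0)
        outs[s] = h[0]
    return outs

# E.g. inspect the predictive spread at a fixed input:
print(prior_predictive_samples([1.0, -0.5]).std())
```

Sweeping depth and width in such a sketch shows how quickly the predictive variance moves, which is exactly the knob that depth- and width-informed weight priors aim to control.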
arXiv Detail & Related papers (2021-06-11T21:21:52Z)
- The Heavy-Tail Phenomenon in SGD [7.366405857677226]
We show that, depending on the structure of the Hessian of the loss at the minimum, the SGD iterates will converge to a heavy-tailed stationary distribution.
We translate our results into insights about the behavior of SGD in deep learning.
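The phenomenon is simple to reproduce on a toy problem. The sketch below (my construction, not the paper's experiments) runs one-dimensional SGD on a random quadratic, where multiplicative curvature noise yields a heavy-tailed stationary distribution, and estimates the tail index with a Hill estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
eta, n_steps, burn_in = 0.9, 200_000, 1_000
x, samples = 0.0, []

for t in range(n_steps):
    h = rng.normal(1.0, 1.0)           # random curvature of the mini-batch loss
    g = rng.normal(0.0, 0.1)           # additive gradient noise
    x = (1.0 - eta * h) * x + eta * g  # SGD step on the random quadratic
    if t >= burn_in:
        samples.append(abs(x))

# Hill estimator of the tail index from the k largest samples: small values
# mean heavier tails (infinite variance below 2, infinite mean below 1).
k = 2_000
top = np.sort(samples)[-k - 1:]
tail_index = 1.0 / np.mean(np.log(top[1:] / top[0]))
print(f"estimated tail index: {tail_index:.2f}")
```

Increasing the step size or the curvature variance in this toy makes the estimated tail index drop, i.e. the tails get heavier, mirroring the dependence on the Hessian structure and step size described above.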
arXiv Detail & Related papers (2020-06-08T16:43:56Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
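Tracking a Hessian norm never requires forming the matrix: a standard recipe is power iteration on Hessian-vector products obtained by double backpropagation. The PyTorch sketch below illustrates that recipe (it is not the paper's code, and the tiny network is an arbitrary example).

```python
import torch

def hessian_spectral_norm(loss, params, n_iters=20):
    """Estimate the spectral norm of the Hessian of `loss` w.r.t. `params`
    by power iteration, using Hessian-vector products instead of the matrix."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    rayleigh = 0.0
    for _ in range(n_iters):
        norm = torch.sqrt(sum((u * u).sum() for u in v))
        v = [u / norm for u in v]
        # Hessian-vector product: differentiate (grad . v) w.r.t. the parameters.
        hv = torch.autograd.grad(grads, params, grad_outputs=v, retain_graph=True)
        rayleigh = sum((h * u).sum() for h, u in zip(hv, v)).item()
        v = [h.detach() for h in hv]
    return abs(rayleigh)

# Example: a tiny network on random data.
net = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                          torch.nn.Linear(32, 1))
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = torch.nn.functional.mse_loss(net(x), y)
print(hessian_spectral_norm(loss, list(net.parameters())))
```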
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.