Related papers: Accelerated Execution of Bayesian Neural Networks using a Single Probabilistic Forward Pass and Code Generation

Accelerated Execution of Bayesian Neural Networks using a Single Probabilistic Forward Pass and Code Generation

URL: http://arxiv.org/abs/2511.23440v1
Date: Fri, 28 Nov 2025 18:35:20 GMT
Title: Accelerated Execution of Bayesian Neural Networks using a Single Probabilistic Forward Pass and Code Generation
Authors: Bernhard Klein, Falk Selker, Hendrik Borras, Sophie Steger, Franz Pernkopf, Holger Fröning,
Abstract summary: Machine learning models perform well across domains such as diagnostics, weather forecasting, NLP, and autonomous driving.<n>Traditional neural networks often fail to detect out-of-domain (OOD) data and may output confident yet incorrect predictions.<n>Probabilistic Forward Pass (PFP) offers a highly efficient approximation to Variational Inference (SVI)<n>PFP consistently outperforms SVI in computational efficiency, achieving speedups of up to 4200x for small mini-batches.
Score: 8.330425840952488
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine learning models perform well across domains such as diagnostics, weather forecasting, NLP, and autonomous driving, but their limited uncertainty handling restricts use in safety-critical settings. Traditional neural networks often fail to detect out-of-domain (OOD) data and may output confident yet incorrect predictions. Bayesian neural networks (BNNs) address this by providing probabilistic estimates, but incur high computational cost because predictions require sampling weight distributions and multiple forward passes. The Probabilistic Forward Pass (PFP) offers a highly efficient approximation to Stochastic Variational Inference (SVI) by assuming Gaussian-distributed weights and activations, enabling fully analytic uncertainty propagation and replacing sampling with a single deterministic forward pass. We present an end-to-end pipeline for training, compiling, optimizing, and deploying PFP-based BNNs on embedded ARM CPUs. Using the TVM deep learning compiler, we implement a dedicated library of Gaussian-propagating operators for multilayer perceptrons and convolutional neural networks, combined with manual and automated tuning strategies. Ablation studies show that PFP consistently outperforms SVI in computational efficiency, achieving speedups of up to 4200x for small mini-batches. PFP-BNNs match SVI-BNNs on Dirty-MNIST in accuracy, uncertainty estimation, and OOD detection while greatly reducing compute cost. These results highlight the potential of combining Bayesian approximations with code generation to enable efficient BNN deployment on resource-constrained systems.

Related papers

Probabilistic Neural Networks (PNNs) with t-Distributed Outputs: Adaptive Prediction Intervals Beyond Gaussian Assumptions [2.77390041716769]
Probabilistic neural networks (PNNs) produce output distributions, enabling the construction of prediction intervals.<n>We propose t-Distributed Neural Networks (TDistNNs), which generate t-distributed outputs, parameterized by location, scale, and degrees of freedom.<n>We show that TDistNNs consistently produce narrower prediction intervals than Gaussian-based PNNs while maintaining proper coverage.
arXiv Detail & Related papers (2025-03-16T04:47:48Z)
Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs. At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads. At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
arXiv Detail & Related papers (2024-06-20T17:08:42Z)
Practical Probabilistic Model-based Deep Reinforcement Learning by Integrating Dropout Uncertainty and Trajectory Sampling [7.179313063022576]
This paper addresses the prediction stability, prediction accuracy and control capability of the current probabilistic model-based reinforcement learning (MBRL) built on neural networks. A novel approach dropout-based probabilistic ensembles with trajectory sampling (DPETS) is proposed.
arXiv Detail & Related papers (2023-09-20T06:39:19Z)
Knowing When to Stop: Delay-Adaptive Spiking Neural Network Classifiers with Reliability Guarantees [36.14499894307206]
Spiking neural networks (SNNs) process time-series data via internal event-driven neural dynamics. We introduce a novel delay-adaptive SNN-based inference methodology that provides guaranteed reliability for the decisions produced at input-dependent stopping times.
arXiv Detail & Related papers (2023-05-18T22:11:04Z)
Approximate Thompson Sampling via Epistemic Neural Networks [26.872304174606278]
Epistemic neural networks (ENNs) are designed to produce accurate joint predictive distributions. We show that ENNs serve this purpose well and illustrate how the quality of joint predictive distributions drives performance.
arXiv Detail & Related papers (2023-02-18T01:58:15Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in collaborative edge computing network. We propose a novel collaborative edge computing using fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs) INNs are a class of implicit learning models that use implicit equations as layers. We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the emphexit point, emphexit point, and emphcompressing bits by soft policy iterations. Based on the latency and accuracy aware reward design, such an computation can well adapt to the complex environment like dynamic wireless channel and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural Networks [52.32646357164739]
We propose a deep neural network (DNN) to solve the solutions of the optimal power flow (ACOPF) The proposed SIDNN is compatible with a broad range of OPF schemes. It can be seamlessly integrated in other learning-to-OPF schemes.
arXiv Detail & Related papers (2021-03-27T00:45:23Z)
Multi-Sample Online Learning for Spiking Neural Networks based on Generalized Expectation Maximization [42.125394498649015]
Spiking Neural Networks (SNNs) capture some of the efficiency of biological brains by processing through binary neural dynamic activations. This paper proposes to leverage multiple compartments that sample independent spiking signals while sharing synaptic weights. The key idea is to use these signals to obtain more accurate statistical estimates of the log-likelihood training criterion, as well as of its gradient.
arXiv Detail & Related papers (2021-02-05T16:39:42Z)
A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach. The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures. We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer. Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.