Principled Pruning of Bayesian Neural Networks through Variational Free
Energy Minimization
- URL: http://arxiv.org/abs/2210.09134v3
- Date: Thu, 9 Nov 2023 11:48:53 GMT
- Title: Principled Pruning of Bayesian Neural Networks through Variational Free
Energy Minimization
- Authors: Jim Beckers, Bart van Erp, Ziyue Zhao, Kirill Kondrashov and Bert de
Vries
- Abstract summary: We formulate and apply Bayesian model reduction to perform principled pruning of Bayesian neural networks.
A novel iterative pruning algorithm is presented to alleviate the problems arising with naive Bayesian model reduction.
Our experiments indicate better model performance in comparison to state-of-the-art pruning schemes.
- Score: 2.3999111269325266
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Bayesian model reduction provides an efficient approach for comparing the
performance of all nested sub-models of a model, without re-evaluating any of
these sub-models. Until now, Bayesian model reduction has been applied mainly
in the computational neuroscience community on simple models. In this paper, we
formulate and apply Bayesian model reduction to perform principled pruning of
Bayesian neural networks, based on variational free energy minimization. Direct
application of Bayesian model reduction, however, gives rise to approximation
errors. Therefore, a novel iterative pruning algorithm is presented to
alleviate the problems arising with naive Bayesian model reduction, as
supported experimentally on the publicly available UCI datasets for different
inference algorithms. This novel parameter pruning scheme addresses the
shortcomings of current state-of-the-art pruning methods used by the signal
processing community. The proposed approach has a clear stopping criterion and
minimizes the same objective that is used during training. In addition to these
benefits, our experiments indicate better model performance in comparison to
state-of-the-art pruning schemes.
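The abstract does not spell out the iterative scheme, but for a mean-field Gaussian posterior the Bayesian-model-reduction score for removing a single weight has a closed form, which suggests a minimal sketch of such a pruning loop. All names below (delta_free_energy, iterative_bmr_prune, frac_per_step, the refit hook) are illustrative, not the authors' code; the refit step stands in for the paper's iterative re-approximation that counteracts the approximation error of naive Bayesian model reduction.

```python
import numpy as np

def delta_free_energy(mu, v, v0, v0_red=1e-8):
    """Change in variational free energy (log evidence) from shrinking one
    weight's Gaussian prior N(0, v0) to a near-delta N(0, v0_red), given the
    full model's Gaussian posterior marginal N(mu, v). Closed form of the
    Bayesian-model-reduction identity dF = log E_q[p_reduced(w) / p(w)].
    Positive dF means the evidence favors pruning the weight."""
    b = 1.0 / v + 1.0 / v0_red - 1.0 / v0
    return (0.5 * np.log(v0 / v0_red)
            - 0.5 * np.log(v * b)
            + 0.5 * (mu**2 / v) * (1.0 / (v * b) - 1.0))

def iterative_bmr_prune(mu, v, v0, frac_per_step=0.05, refit=None):
    """Iteratively prune: score all remaining weights with dF, remove the
    best-scoring fraction, optionally re-approximate the posterior, repeat."""
    keep = np.ones_like(mu, dtype=bool)
    while True:
        dF = np.where(keep, delta_free_energy(mu, v, v0), -np.inf)
        candidates = np.flatnonzero(dF > 0.0)
        if candidates.size == 0:
            return keep                      # stopping criterion reached
        n = max(1, int(frac_per_step * keep.sum()))
        prune = candidates[np.argsort(dF[candidates])[-n:]]
        keep[prune] = False
        if refit is not None:                # hypothetical re-inference hook,
            mu, v = refit(keep)              # curbs accumulated BMR error
```

Note how the stopping rule falls out of the objective itself: once no remaining weight increases the (approximated) free energy, pruning halts, mirroring the "clear stopping criterion" claimed above.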
Related papers
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models that achieve effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
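For orientation, one canonical PAC-Bayes bound (the Pinsker relaxation of Maurer's bound for losses in [0, 1], not necessarily the exact form derived in this paper) states that with probability at least 1 - \delta over an i.i.d. sample of size n, simultaneously for all posteriors \rho and any fixed prior \pi,

\[
\mathbb{E}_{h \sim \rho}[L(h)] \;\le\; \mathbb{E}_{h \sim \rho}[\hat{L}_n(h)]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln(2\sqrt{n}/\delta)}{2n}}.
\]

In the interpolating regime the empirical term \hat{L}_n(h) is effectively zero, so the bound is governed by \mathrm{KL}(\rho \,\|\, \pi), which is precisely where the model and parameter-initialization scheme enter.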
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Adaptive operator learning for infinite-dimensional Bayesian inverse problems [7.716833952167609]
We develop an adaptive operator learning framework that can reduce modeling error gradually by forcing the surrogate to be accurate in local areas.
We present a rigorous convergence guarantee in the linear case using the UKI framework.
The numerical results show that our method can significantly reduce computational costs while maintaining inversion accuracy.
arXiv Detail & Related papers (2023-10-27T01:50:33Z)
- Bayesian Cramér-Rao Bound Estimation with Score-Based Models [3.4480437706804503]
The Bayesian Cramér-Rao bound (CRB) provides a lower bound on the mean square error of any Bayesian estimator under mild regularity conditions.
This work introduces a new data-driven estimator for the CRB using score matching.
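For reference, the bound in question is the Van Trees inequality: for any estimator \hat{\theta}(x) of \theta with joint density p(x, \theta), under mild regularity conditions,

\[
\mathbb{E}_{p(x,\theta)}\big[(\hat{\theta}(x) - \theta)(\hat{\theta}(x) - \theta)^{\top}\big]
  \;\succeq\; J_B^{-1},
\qquad
J_B = \mathbb{E}_{p(x,\theta)}\big[\nabla_{\theta} \log p(x,\theta)\,
      \nabla_{\theta} \log p(x,\theta)^{\top}\big].
\]

Because J_B is an expected outer product of a score function, score matching can estimate it from samples without an explicit likelihood, which is presumably the connection exploited by the data-driven estimator here.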
arXiv Detail & Related papers (2023-09-28T00:22:21Z)
- Improving Transferability of Adversarial Examples via Bayesian Attacks [84.90830931076901]
We introduce a novel extension by incorporating the Bayesian formulation into the model input as well, enabling the joint diversification of both the model input and model parameters.
Our method achieves a new state-of-the-art on transfer-based attacks, improving the average success rate on ImageNet and CIFAR-10 by 19.14% and 2.08%, respectively.
arXiv Detail & Related papers (2023-07-21T03:43:07Z)
- An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
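Stripped of the hyperspectral specifics, the structural idea of expressing the solver as a fixed-point computation can be sketched as follows; deq_solve, f, and z0 are illustrative names, and production Deep Equilibrium models typically use accelerated solvers (e.g. Anderson acceleration) plus implicit differentiation rather than this naive iteration.

```python
import numpy as np

def deq_solve(f, x, z0, max_iter=100, tol=1e-5):
    """Naive forward pass of a Deep Equilibrium layer: repeatedly apply the
    learned update z <- f(z, x) until it stops changing, so the restored
    image is an (approximate) fixed point z* = f(z*, x) rather than the
    output of a fixed number of unrolled iterations."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) <= tol * (np.linalg.norm(z) + 1e-12):
            return z_next
        z = z_next
    return z
```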
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Neural Model-based Optimization with Right-Censored Observations [42.530925002607376]
Neural networks (NNs) have been demonstrated to work well at the core of model-based optimization procedures.
We show that our trained regression models achieve a better predictive quality than several baselines.
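The summary does not say how censoring enters the training loss, but a standard way to fit a regression model on right-censored targets (e.g. runs cut off at a timeout, where only a lower bound on the true value is observed) is a Tobit-style likelihood; a minimal sketch, with all names illustrative:

```python
import numpy as np
from scipy.stats import norm

def censored_gaussian_nll(mu, sigma, y, censored):
    """Negative log-likelihood for right-censored regression targets.
    Observed points contribute the Gaussian density; censored points
    (true value known only to exceed y) contribute the survival
    probability P(Y > y) = 1 - Phi((y - mu) / sigma)."""
    z = (y - mu) / sigma
    obs = norm.logpdf(z) - np.log(sigma)    # exact observations
    cens = norm.logsf(z)                    # right-censored: log P(Y > y)
    return -np.where(censored, cens, obs).sum()
```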
arXiv Detail & Related papers (2020-09-29T07:32:30Z)
- Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method that aims to integrate the advantages of both model-based and data-driven approaches.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-art methods.
arXiv Detail & Related papers (2020-08-25T03:30:53Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization without Compounding Errors [10.906666680425754]
We propose a Dyna-style model-based reinforcement learning algorithm, which we call Maximum Entropy Model Rollouts (MEMR).
To eliminate the compounding errors, we only use our model to generate single-step rollouts.
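A minimal sketch of what single-step (branched) rollouts look like in a Dyna-style loop, with model, policy, and replay_states as illustrative placeholders rather than the authors' interfaces:

```python
import numpy as np

def single_step_rollouts(model, policy, replay_states, n=1024):
    """Dyna-style data generation without compounding model error: branch
    single-step imagined transitions from real states sampled out of the
    replay buffer, instead of rolling the model forward for many steps."""
    idx = np.random.randint(len(replay_states), size=n)
    s = replay_states[idx]
    a = policy(s)                  # actions from the current policy
    s_next, r = model(s, a)        # one model step only: no compounding
    return s, a, r, s_next
```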
arXiv Detail & Related papers (2020-06-08T21:38:15Z)