Function-Space Regularization in Neural Networks: A Probabilistic Perspective
- URL: http://arxiv.org/abs/2312.17162v1
- Date: Thu, 28 Dec 2023 17:50:56 GMT
- Title: Function-Space Regularization in Neural Networks: A Probabilistic Perspective
- Authors: Tim G. J. Rudner, Sanyam Kapoor, Shikai Qiu, Andrew Gordon Wilson
- Abstract summary: We show that we can derive a well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly-calibrated predictive uncertainty estimates.
- Score: 51.133793272222874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter-space regularization in neural network optimization is a
fundamental tool for improving generalization. However, standard
parameter-space regularization methods make it challenging to encode explicit
preferences about desired predictive functions into neural network training. In
this work, we approach regularization in neural networks from a probabilistic
perspective and show that by viewing parameter-space regularization as
specifying an empirical prior distribution over the model parameters, we can
derive a probabilistically well-motivated regularization technique that allows
explicitly encoding information about desired predictive functions into neural
network training. This method -- which we refer to as function-space empirical
Bayes (FSEB) -- includes both parameter- and function-space regularization, is
mathematically simple, easy to implement, and incurs only minimal computational
overhead compared to standard regularization techniques. We evaluate the
utility of this regularization technique empirically and demonstrate that the
proposed method leads to near-perfect semantic shift detection,
highly-calibrated predictive uncertainty estimates, successful task adaptation
from pre-trained models, and improved generalization under covariate shift.
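
Since FSEB is described as mathematically simple and easy to implement, a minimal sketch of what a combined parameter- and function-space regularizer can look like is given below, assuming a PyTorch classifier. The context inputs `x_context`, the reference predictive function `prior_fn`, and the coefficients `alpha` and `beta` are illustrative assumptions, not the authors' exact objective.

```python
import torch
import torch.nn.functional as F


def fseb_style_loss(model, x, y, x_context, prior_fn, alpha=1e-4, beta=1.0):
    """Sketch of a combined parameter- and function-space regularized loss.

    alpha weights a Gaussian-style penalty on the parameters; beta weights a
    function-space term that pulls the model's predictions at context points
    toward a desired reference predictive function. Illustrative only; not
    the paper's exact FSEB objective.
    """
    # Standard data-fit term on the labelled batch.
    nll = F.cross_entropy(model(x), y)

    # Parameter-space term (equivalent to weight decay / an isotropic Gaussian prior).
    param_term = sum(p.pow(2).sum() for p in model.parameters())

    # Function-space term: match the predictive distribution at context inputs
    # to the reference predictive distribution given by prior_fn.
    with torch.no_grad():
        prior_probs = F.softmax(prior_fn(x_context), dim=-1)
    model_log_probs = F.log_softmax(model(x_context), dim=-1)
    function_term = F.kl_div(model_log_probs, prior_probs, reduction="batchmean")

    return nll + alpha * param_term + beta * function_term
```

In practice, the context points might be training inputs, unlabeled data, or inputs from a region where a particular predictive behaviour (for example, high uncertainty) is desired.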
Related papers
- Sparse Deep Learning Models with the $\ell_1$ Regularization [6.268040192346312]
Sparse neural networks are highly desirable in deep learning.
We study how choices of regularization parameters influence the sparsity level of learned neural networks.
arXiv Detail & Related papers (2024-08-05T19:38:45Z)
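
As a concrete point of reference for the $\ell_1$-regularization entry above, the sketch below shows how an $\ell_1$ penalty on the weights is typically added to a training loss in PyTorch; the penalty coefficient `lam`, whose choice controls the resulting sparsity level, and the helper name are illustrative assumptions rather than the paper's specific setup.

```python
import torch


def l1_regularized_loss(model, loss_fn, x, y, lam=1e-4):
    """Data-fit loss plus an l1 penalty on all weights.

    Larger `lam` drives more weights toward exactly zero, i.e. a sparser
    network. Illustrative sketch only.
    """
    data_loss = loss_fn(model(x), y)
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    return data_loss + lam * l1_penalty
```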
- Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z)
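
The full sequential function-space variational objective is beyond a short snippet, but the core idea behind the entry above, regularizing in function space rather than parameter space, can be illustrated with a simplified, non-variational penalty that keeps the current model's predictions close to those of a frozen copy from earlier tasks. The context inputs `context_x`, the weight `beta`, and the KL form are illustrative assumptions, not the paper's objective.

```python
import copy
import torch
import torch.nn.functional as F


def continual_function_space_penalty(model, prev_model, context_x, beta=1.0):
    """Simplified function-space regularizer for continual learning.

    Penalizes divergence between the current model's predictions and the
    (frozen) previous model's predictions at context points from earlier
    tasks. A deterministic stand-in for the paper's variational objective.
    """
    with torch.no_grad():
        prev_probs = F.softmax(prev_model(context_x), dim=-1)
    curr_log_probs = F.log_softmax(model(context_x), dim=-1)
    return beta * F.kl_div(curr_log_probs, prev_probs, reduction="batchmean")


# Usage sketch: before starting a new task, freeze a copy of the model.
# prev_model = copy.deepcopy(model).eval()
# loss = task_loss + continual_function_space_penalty(model, prev_model, context_x)
```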
- Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z)
- Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks [77.89179552509887]
We propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks.
The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees.
We exhaustively show the effectiveness of this method for uncertainty estimation and generalization.
arXiv Detail & Related papers (2023-07-15T09:24:33Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by metalearning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z)
- Orthogonal Stochastic Configuration Networks with Adaptive Construction Parameter for Data Analytics [6.940097162264939]
The randomness in stochastic configuration networks (SCNs) makes them more likely to generate approximately linearly correlated hidden nodes that are redundant and of low quality.
In light of a fundamental principle in machine learning, namely that a model with fewer parameters tends to generalize better, this paper proposes orthogonal SCNs (OSCN) to filter out low-quality hidden nodes and reduce the network structure.
arXiv Detail & Related papers (2022-05-26T07:07:26Z)
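
The OSCN construction itself is not detailed in the summary above, but the underlying idea of filtering out approximately linearly dependent hidden nodes can be sketched with a simple Gram-Schmidt check on the matrix of candidate node activations; the tolerance `tol` and the NumPy formulation are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np


def filter_redundant_nodes(H, tol=1e-3):
    """Keep hidden nodes whose output vectors add genuinely new directions.

    H: (n_samples, n_nodes) matrix of candidate hidden-node activations.
    A node is kept only if, after projecting out the span of the nodes
    already kept, the residual norm of its activation column exceeds a
    relative tolerance. Illustrative of the orthogonalization idea only.
    """
    kept_cols, basis = [], []
    for j in range(H.shape[1]):
        v = H[:, j].astype(float)
        for q in basis:                      # Gram-Schmidt projection
            v = v - (q @ H[:, j]) * q
        norm = np.linalg.norm(v)
        if norm > tol * max(np.linalg.norm(H[:, j]), 1e-12):
            basis.append(v / norm)
            kept_cols.append(j)
    return kept_cols


# H_reduced = H[:, filter_redundant_nodes(H)]
```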
- Learning Regularization Parameters of Inverse Problems via Deep Neural Networks [0.0]
We consider a supervised learning approach, where a network is trained to approximate the mapping from observation data to regularization parameters.
We show that a wide variety of regularization functionals, forward models, and noise models may be considered.
The network-obtained regularization parameters can be computed more efficiently and may even lead to more accurate solutions.
arXiv Detail & Related papers (2021-04-14T02:38:38Z)
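
To make the recipe in the entry above concrete, the sketch below has a small network map observation data to a positive regularization parameter, which is then used in a closed-form Tikhonov-regularized reconstruction; the architecture, the Tikhonov functional, and all names are illustrative assumptions rather than the paper's specific models.

```python
import torch
import torch.nn as nn


class LambdaNet(nn.Module):
    """Maps an observation vector to a positive regularization parameter."""

    def __init__(self, obs_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Softplus(),  # keep lambda > 0
        )

    def forward(self, b):
        return self.net(b).squeeze(-1)


def tikhonov_solve(A, b, lam):
    """Solve min_x ||A x - b||^2 + lam ||x||^2 in closed form."""
    n = A.shape[1]
    lhs = A.T @ A + lam * torch.eye(n)
    return torch.linalg.solve(lhs, A.T @ b)


# Supervised training idea: given pairs (b_i, x_i_true), fit LambdaNet to
# minimize ||tikhonov_solve(A, b_i, LambdaNet(b_i)) - x_i_true||^2.
```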
- Revisiting Explicit Regularization in Neural Networks for Well-Calibrated Predictive Uncertainty [6.09170287691728]
In this work, we revisit the importance of explicit regularization for obtaining well-calibrated predictive uncertainty.
We introduce a measure of calibration performance, which is lower bounded by the log-likelihood.
We then explore explicit regularization techniques for improving the log-likelihood on unseen samples, which provides well-calibrated predictive uncertainty.
arXiv Detail & Related papers (2020-06-11T13:14:01Z)
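
For context on the entry above, two standard held-out quantities against which explicit regularization is commonly judged are the test negative log-likelihood and a binned expected calibration error; the sketch below computes both, and the ECE estimator shown is the standard one, not the paper's proposed lower-bounded measure.

```python
import numpy as np


def test_nll(probs, labels, eps=1e-12):
    """Average negative log-likelihood of the true labels under `probs` (N, C)."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))


def expected_calibration_error(probs, labels, n_bins=15):
    """Standard ECE over equal-width confidence bins (not the paper's measure)."""
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    correct = (pred == labels).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```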
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.