How Regularization Terms Make Invertible Neural Networks Bayesian Point Estimators
- URL: http://arxiv.org/abs/2510.26704v1
- Date: Thu, 30 Oct 2025 17:07:14 GMT
- Title: How Regularization Terms Make Invertible Neural Networks Bayesian Point Estimators
- Authors: Nick Heilenkötter,
- Abstract summary: Invertible neural networks are attractive for inverse problems due to their inherent stability and interpretability.<n>We introduce and analyze two regularization terms for the network training that, upon inversion of the network, recover properties of classical Bayesian point estimators.<n>Our theoretical analysis characterizes how each loss shapes both the learned forward operator and its inverse reconstruction map.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Can regularization terms in the training of invertible neural networks lead to known Bayesian point estimators in reconstruction? Invertible networks are attractive for inverse problems due to their inherent stability and interpretability. Recently, optimization strategies for invertible neural networks that approximate either a reconstruction map or the forward operator have been studied from a Bayesian perspective, but each has limitations. To address this, we introduce and analyze two regularization terms for the network training that, upon inversion of the network, recover properties of classical Bayesian point estimators: while the first can be connected to the posterior mean, the second resembles the MAP estimator. Our theoretical analysis characterizes how each loss shapes both the learned forward operator and its inverse reconstruction map. Numerical experiments support our findings and demonstrate how these loss-term regularizers introduce data-dependence in a stable and interpretable way.
Related papers
- ANCRe: Adaptive Neural Connection Reassignment for Efficient Depth Scaling [57.91760520589592]
Scaling network depth has been a central driver behind the success of modern foundation models.<n>This paper revisits the default mechanism for deepening neural networks, namely residual connections.<n>We introduce adaptive neural connection reassignment (ANCRe), a principled and lightweight framework that parameterizes and learns residual connectivities from the data.
arXiv Detail & Related papers (2026-02-09T18:54:18Z) - VIKING: Deep variational inference with stochastic projections [48.946143517489496]
Variational mean field approximations tend to struggle with contemporary overparametrized deep neural networks.<n>We propose a simple variational family that considers two independent linear subspaces of the parameter space.<n>This allows us to build a fully-correlated approximate posterior reflecting the overparametrization.
arXiv Detail & Related papers (2025-10-27T15:38:35Z) - Stability Bounds for the Unfolded Forward-Backward Algorithm [21.529323131078954]
We consider a neural network architecture designed to solve inverse problems where the degradation operator is linear and known.<n> robustness of this inversion method to input perturbations is analyzed theoretically.<n>A key novelty of our work lies in examining the robustness of the proposed network to perturbations in its bias.
arXiv Detail & Related papers (2024-12-23T11:55:41Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Regularization, early-stopping and dreaming: a Hopfield-like setup to
address generalization and overfitting [0.0]
We look for optimal network parameters by applying a gradient descent over a regularized loss function.
Within this framework, the optimal neuron-interaction matrices correspond to Hebbian kernels revised by a reiterated unlearning protocol.
arXiv Detail & Related papers (2023-08-01T15:04:30Z) - Bayesian Recurrent Units and the Forward-Backward Algorithm [91.39701446828144]
Using Bayes's theorem, we derive a unit-wise recurrence as well as a backward recursion similar to the forward-backward algorithm.
The resulting Bayesian recurrent units can be integrated as recurrent neural networks within deep learning frameworks.
Experiments on speech recognition indicate that adding the derived units at the end of state-of-the-art recurrent architectures can improve the performance at a very low cost in terms of trainable parameters.
arXiv Detail & Related papers (2022-07-21T14:00:52Z) - Entangled Residual Mappings [59.02488598557491]
We introduce entangled residual mappings to generalize the structure of the residual connections.
An entangled residual mapping replaces the identity skip connections with specialized entangled mappings.
We show that while entangled mappings can preserve the iterative refinement of features across various deep models, they influence the representation learning process in convolutional networks.
arXiv Detail & Related papers (2022-06-02T19:36:03Z) - Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
arXiv Detail & Related papers (2021-11-03T15:14:20Z) - Lifted Regression/Reconstruction Networks [17.89437720094451]
We propose lifted regression/reconstruction networks (LRRNs)
LRRNs combine lifted neural networks with a guaranteed Lipschitz continuity property for the output layer.
We analyse and numerically demonstrate applications for unsupervised and supervised learning.
arXiv Detail & Related papers (2020-05-07T13:24:46Z) - Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for a uncertainty on a ReLU network is "to be a bit Bayesian calibrated"
We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
arXiv Detail & Related papers (2020-02-24T08:52:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.