Gibbs-Based Information Criteria and the Over-Parameterized Regime
- URL: http://arxiv.org/abs/2306.05583v2
- Date: Tue, 14 Nov 2023 03:09:38 GMT
- Title: Gibbs-Based Information Criteria and the Over-Parameterized Regime
- Authors: Haobo Chen, Yuheng Bu and Gregory W. Wornell
- Abstract summary: Double-descent refers to the unexpected drop in test loss of a learning algorithm beyond an interpolating threshold.
We update these analyses using the information risk minimization framework and provide Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) for models learned by the Gibbs algorithm.
- Score: 20.22034560278484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Double-descent refers to the unexpected drop in test loss of a learning
algorithm beyond an interpolating threshold with over-parameterization, which
is not predicted by information criteria in their classical forms due to the
limitations in the standard asymptotic approach. We update these analyses using
the information risk minimization framework and provide Akaike Information
Criterion (AIC) and Bayesian Information Criterion (BIC) for models learned by
the Gibbs algorithm. Notably, the penalty terms for the Gibbs-based AIC and BIC
correspond to specific information measures, i.e., symmetrized KL information
and KL divergence. We extend this information-theoretic analysis to
over-parameterized models by providing two different Gibbs-based BICs to
compute the marginal likelihood of random feature models in the regime where
the number of parameters $p$ and the number of samples $n$ tend to infinity,
with $p/n$ fixed. Our experiments demonstrate that the Gibbs-based BIC can
select the high-dimensional model and reveal the mismatch between marginal
likelihood and population risk in the over-parameterized regime, providing new
insights to understand double-descent.
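The failure mode described here is easy to reproduce numerically. The sketch below (a toy illustration, not the paper's Gibbs-based construction; the dimensions, teacher model, and noise level are arbitrary choices) fits min-norm least squares on random ReLU features: the test loss peaks near the interpolation threshold $p = n$ and descends again beyond it, while the classical BIC penalty $(p/2)\log n$ grows monotonically in $p$ and so cannot track the second descent.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_test = 100, 20, 2000
beta = rng.normal(size=d) / np.sqrt(d)              # teacher weights
X, Xt = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
y = X @ beta + 0.1 * rng.normal(size=n)
yt = Xt @ beta                                      # noiseless test targets

for p in (20, 60, 90, 100, 110, 200, 1000):         # sweep p/n through 1
    W = rng.normal(size=(d, p)) / np.sqrt(d)        # random ReLU feature map
    F, Ft = np.maximum(X @ W, 0), np.maximum(Xt @ W, 0)
    a = np.linalg.pinv(F) @ y                       # min-norm least squares
    test_mse = np.mean((Ft @ a - yt) ** 2)
    bic_pen = 0.5 * p * np.log(n)                   # classical penalty, monotone in p
    print(f"p/n={p/n:5.1f}  test MSE={test_mse:8.3f}  (p/2)log n={bic_pen:8.1f}")
```

Because any criterion of the form "training loss plus a penalty monotone in $p$" must prefer small models, the second descent is invisible to it; this is the mismatch between marginal likelihood and population risk that the paper analyzes.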
Related papers
- Linear-cost unbiased posterior estimates for crossed effects and matrix factorization models via couplings [0.0]
We design and analyze unbiased Markov chain Monte Carlo schemes based on couplings of blocked Gibbs samplers (BGSs).
Our methodology is designed for and applicable to high-dimensional BGS with conditionally independent blocks.
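As a minimal sketch of the coupling idea behind such unbiased estimators (a single-site toy in the style of Jacob et al.'s coupled MCMC, not the paper's blocked high-dimensional schemes; the bivariate Gaussian target is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
RHO = 0.5                                   # bivariate Gaussian target correlation
V = 1 - RHO ** 2                            # variance of each Gibbs conditional

def npdf(x, m):                             # N(m, V) density
    return np.exp(-(x - m) ** 2 / (2 * V)) / np.sqrt(2 * np.pi * V)

def max_coupling(m1, m2):
    """Maximal coupling of N(m1, V) and N(m2, V): equal draws w.p. 1 - TV."""
    x = rng.normal(m1, np.sqrt(V))
    if rng.uniform() * npdf(x, m1) <= npdf(x, m2):
        return x, x
    while True:
        y = rng.normal(m2, np.sqrt(V))
        if rng.uniform() * npdf(y, m2) > npdf(y, m1):
            return x, y

def sweep(s):                               # ordinary Gibbs sweep
    a = rng.normal(RHO * s[1], np.sqrt(V))
    return (a, rng.normal(RHO * a, np.sqrt(V)))

def coupled_sweep(s, t):                    # same sweep, jointly for two chains
    a, b = max_coupling(RHO * s[1], RHO * t[1])
    c, e = max_coupling(RHO * a, RHO * b)
    return (a, c), (b, e)

def unbiased_estimate(h, k=0):
    X = [tuple(rng.normal(size=2))]
    Y = [tuple(rng.normal(size=2))]
    X.append(sweep(X[0]))                   # X runs one step ahead (lag 1)
    t = 1
    while X[t] != Y[t - 1]:                 # iterate until the chains meet exactly
        xn, yn = coupled_sweep(X[t], Y[t - 1])
        X.append(xn); Y.append(yn); t += 1
    # usual MCMC term plus a bias-correction sum over the pre-meeting segment
    return h(X[k]) + sum(h(X[l]) - h(Y[l - 1]) for l in range(k + 1, t))

print(np.mean([unbiased_estimate(lambda s: s[0]) for _ in range(1000)]))  # ~0
```

Averaging the returned estimator over independent replicates is unbiased for the posterior expectation, with finite expected cost because the chains meet in finite time.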
arXiv Detail & Related papers (2024-10-11T16:05:01Z)
- On uncertainty-penalized Bayesian information criterion [1.1049608786515839]
We show that using the uncertainty-penalized information criterion (UBIC) is equivalent to employing the conventional BIC.
The result indicates that the UBIC and the BIC share the same selection behaviour.
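For reference, a minimal sketch of the conventional BIC the summary compares against, used here to select a polynomial degree under Gaussian noise; the UBIC's uncertainty penalty is defined in the paper and is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1, 1, n)
y = 1 - 2 * x + 0.5 * x ** 2 + 0.2 * rng.normal(size=n)   # true degree: 2

for deg in range(1, 7):
    rss = np.sum((np.polyval(np.polyfit(x, y, deg), x) - y) ** 2)
    k = deg + 1                                  # parameter count
    bic = n * np.log(rss / n) + k * np.log(n)    # Gaussian-noise BIC
    print(f"degree {deg}: BIC = {bic:.1f}")      # typically minimised at degree 2
```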
arXiv Detail & Related papers (2024-04-23T13:59:11Z)
- $\mu$GUIDE: a framework for quantitative imaging via generalized uncertainty-driven inference using deep learning [0.0]
$\mu$GUIDE estimates posterior distributions of tissue microstructure parameters from any given biophysical model or MRI signal representation.
The obtained posterior distributions make it possible to highlight degeneracies in the model definition and to quantify the uncertainty and ambiguity of the estimated parameters.
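A minimal sketch of the kind of degeneracy a posterior can expose, using a toy model in which the signal depends only on the product of two parameters (an assumption for illustration, not one of $\mu$GUIDE's biophysical models):

```python
import numpy as np

# Toy forward model: the "signal" depends only on the product a*b, so the
# two parameters are intrinsically degenerate.
obs, noise = 0.6, 0.02
a = np.linspace(0.1, 1.0, 200)
b = np.linspace(0.1, 1.0, 200)
A, B = np.meshgrid(a, b)
post = np.exp(-((A * B - obs) ** 2) / (2 * noise ** 2))
post /= post.sum()

# The posterior mass lies on the curve a*b = obs rather than at a point;
# this ridge is the degeneracy a posterior-based method can make visible.
ridge = post > 0.5 * post.max()
print(ridge.sum(), "grid cells form the high-posterior ridge")
```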
arXiv Detail & Related papers (2023-12-28T13:59:43Z)
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
The method is directly applicable to existing computational pipelines, allowing reliable black-box posterior inference.
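One way such a relaxed calibration term can look, sketched here for Gaussian posterior predictions: hard coverage indicators are replaced by sigmoids so the term is differentiable. The paper's exact relaxation differs; the function name, PIT construction, and temperature are assumptions.

```python
import numpy as np
from scipy.stats import norm

def soft_calibration_error(theta, mu, sigma, temp=0.05):
    """Relaxed expected-coverage error for Gaussian posterior predictions.

    PIT values of a calibrated posterior are uniform, so the soft empirical
    CDF of the PITs should match the diagonal. Sigmoids replace the hard
    indicators; run under an autodiff framework in practice (numpy here
    merely evaluates the penalty).
    """
    u = norm.cdf((theta - mu) / sigma)                    # PIT values
    levels = np.linspace(0.05, 0.95, 19)
    cov = 1 / (1 + np.exp(-(levels[:, None] - u[None, :]) / temp))
    return np.mean((cov.mean(axis=1) - levels) ** 2)

rng = np.random.default_rng(0)
theta = rng.normal(size=1000)
print(soft_calibration_error(theta, mu=0.0, sigma=1.0))   # ~0: calibrated
print(soft_calibration_error(theta, mu=0.0, sigma=0.3))   # large: overconfident
```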
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
- A Robustness Analysis of Blind Source Separation [91.3755431537592]
Blind source separation (BSS) aims to recover an unobserved signal from its mixture $X=f(S)$ under the condition that the transformation $f$ is invertible but unknown.
We present a general framework for analysing violations of the method's structural assumptions and for quantifying their impact on the blind recovery of $S$ from $X$.
We show that the behaviour of a generic BSS solution under general deviations from its defining assumptions can be profitably analysed in the form of explicit continuity guarantees.
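For the classical linear, noiseless special case of $X = f(S)$, a short FastICA sketch shows the baseline behaviour against which robustness is measured; recovery is only ever up to permutation and scale (the mixing matrix and sources below are arbitrary choices):

```python
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.cos(3 * t))]   # independent sources
A = np.array([[1.0, 0.5], [0.4, 1.0]])             # unknown mixing matrix
X = S @ A.T                                        # observed mixtures

S_hat = FastICA(n_components=2, random_state=0).fit_transform(X)
# recovery is up to permutation and scaling -- the usual BSS ambiguity --
# so we check cross-correlations between true and estimated sources
corr = np.corrcoef(S.T, S_hat.T)[:2, 2:]
print(np.round(corr, 2))
```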
arXiv Detail & Related papers (2023-03-17T16:30:51Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
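The simplest likelihood-free inference scheme, rejection ABC, conveys the idea: simulate from the prior and keep parameters whose simulated summaries land near the observed ones. The simulator, prior, and tolerance below are placeholders, not the paper's dMRI forward model.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, rng, m=50):
    """Toy simulator standing in for the forward signal model."""
    return theta[0] + theta[1] * rng.normal(size=m)

x_obs = simulate(np.array([1.0, 0.5]), np.random.default_rng(42))
s_obs = np.array([x_obs.mean(), x_obs.std()])

accepted = []
for _ in range(20000):
    theta = rng.uniform(low=[-2.0, 0.1], high=[2.0, 2.0])   # prior draw
    x = simulate(theta, rng)
    s = np.array([x.mean(), x.std()])
    if np.linalg.norm(s - s_obs) < 0.1:        # keep if summaries are close
        accepted.append(theta)

post = np.array(accepted)
print(post.shape[0], "accepted;", np.round(post.mean(axis=0), 2))
```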
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- On Exploiting Hitting Sets for Model Reconciliation [53.81101846598925]
In human-aware planning, a planning agent may need to explain to a human user why its plan is optimal.
A popular approach to do this is called model reconciliation, where the agent tries to reconcile the differences in its model and the human's model.
We present a logic-based framework for model reconciliation that extends beyond the realm of planning.
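The hitting-set connection can be made concrete: if each set records a conflict between the agent's model and the human's, a cardinality-minimal hitting set is a smallest repair touching every conflict. A brute-force sketch (the encoding is illustrative, not the paper's logic-based framework):

```python
from itertools import chain, combinations

def minimal_hitting_sets(conflicts):
    """Enumerate cardinality-minimal hitting sets by increasing size."""
    universe = sorted(set(chain.from_iterable(conflicts)))
    for k in range(1, len(universe) + 1):
        hits = [set(c) for c in combinations(universe, k)
                if all(set(c) & s for s in conflicts)]
        if hits:
            return hits

# each set is a "conflict"; a hitting set picks at least one element
# from every conflict to repair
conflicts = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
print(minimal_hitting_sets(conflicts))   # e.g. [{'a','c'}, {'b','c'}, {'b','d'}]
```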
arXiv Detail & Related papers (2020-12-16T21:25:53Z)
- Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise AutoRegressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Experts concept, developed within the machine learning field.
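A minimal identification loop in this spirit: two affine experts are fit by alternating hard assignment with per-region least squares (the data-generating regimes and the expert count are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=400)
y = np.where(x < 0, 1 + 2 * x, 3 - x) + 0.05 * rng.normal(size=400)

X = np.c_[np.ones_like(x), x]                  # affine regressors
theta = rng.normal(size=(2, 2))                # two affine experts
for _ in range(20):
    resid = (X @ theta.T - y[:, None]) ** 2    # squared residual per expert
    z = resid.argmin(axis=1)                   # assign each point to best expert
    for j in range(2):
        if np.any(z == j):                     # refit expert on its region
            theta[j] = np.linalg.lstsq(X[z == j], y[z == j], rcond=None)[0]
print(np.round(theta, 2))                      # rows approach [1, 2] and [3, -1]
```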
arXiv Detail & Related papers (2020-09-29T12:50:33Z)
- Novel and flexible parameter estimation methods for data-consistent inversion in mechanistic modeling [0.13635858675752988]
We introduce new methods for solving stochastic inverse problems (SIP) based on rejection sampling, Markov chain Monte Carlo, and generative adversarial networks (GANs).
To overcome limitations of SIP, we reformulate it as a constrained optimization problem and present a novel GAN to solve it.
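The rejection-sampling route can be sketched for a scalar problem in the data-consistent-inversion style: prior samples are accepted with probability proportional to the ratio of the observed output density to the prior pushforward density. The forward map and observed density below are stand-ins.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(0)

forward = lambda lam: lam ** 2              # toy forward map Q(lambda)
lam = rng.uniform(-1, 1, 50000)             # samples from the prior
q = forward(lam)

push = gaussian_kde(q)                      # KDE of the prior pushforward density
obs = norm(loc=0.3, scale=0.05)             # observed density on the outputs

r = obs.pdf(q) / push(q)                    # data-consistency ratio
keep = rng.uniform(size=lam.size) * r.max() < r     # rejection step
post = lam[keep]
print(post.size, "accepted; |lambda| ~", round(np.abs(post).mean(), 3))
```

The accepted samples concentrate near $\lambda \approx \pm\sqrt{0.3}$, i.e. the pushforward of the accepted set matches the observed density.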
arXiv Detail & Related papers (2020-09-17T13:13:21Z)
- On the Difference Between the Information Bottleneck and the Deep Information Bottleneck [81.89141311906552]
We revisit the Deep Variational Information Bottleneck and the assumptions needed for its derivation.
We show how to circumvent the resulting limitation by optimising a lower bound on $I(T;Y)$ for which only the latter Markov chain assumption has to be satisfied.
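The bound in question is the standard variational decoder bound $I(T;Y) \ge H(Y) + \mathbb{E}[\log q(Y\mid T)]$, which is tight when $q$ equals the true $p(y\mid t)$. A discrete toy check (the joint table is an arbitrary example):

```python
import numpy as np

# Joint pmf of (T, Y) on a small discrete space.
p_ty = np.array([[0.30, 0.05],
                 [0.05, 0.30],
                 [0.15, 0.15]])
p_t, p_y = p_ty.sum(1), p_ty.sum(0)

I_ty = np.sum(p_ty * np.log(p_ty / np.outer(p_t, p_y)))   # exact I(T;Y)
H_y = -np.sum(p_y * np.log(p_y))

def bound(q_y_given_t):
    """Variational bound: I(T;Y) >= H(Y) + E[log q(Y|T)]."""
    return H_y + np.sum(p_ty * np.log(q_y_given_t))

exact_decoder = p_ty / p_t[:, None]        # q = p(y|t): bound is tight
uniform_decoder = np.full_like(p_ty, 0.5)  # mismatched q: bound is loose
print(I_ty, bound(exact_decoder), bound(uniform_decoder))
```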
arXiv Detail & Related papers (2019-12-31T18:31:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.