Information-Theoretic Bayes Risk Lower Bounds for Realizable Models
- URL: http://arxiv.org/abs/2111.04579v1
- Date: Mon, 8 Nov 2021 15:42:35 GMT
- Title: Information-Theoretic Bayes Risk Lower Bounds for Realizable Models
- Authors: Matthew Nokleby and Ahmad Beirami
- Abstract summary: We derive information-theoretic lower bounds on the Bayes risk and generalization error of realizable machine learning models.
For realizable models, we show that both the rate distortion function and mutual information admit expressions that are convenient for analysis.
- Score: 11.203656874463697
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We derive information-theoretic lower bounds on the Bayes risk and
generalization error of realizable machine learning models. In particular, we
employ an analysis in which the rate-distortion function of the model
parameters bounds the required mutual information between the training samples
and the model parameters in order to learn a model up to a Bayes risk
constraint. For realizable models, we show that both the rate distortion
function and mutual information admit expressions that are convenient for
analysis. For models that are (roughly) lower Lipschitz in their parameters, we
bound the rate distortion function from below, whereas for VC classes, the
mutual information is bounded above by $d_\mathrm{vc}\log(n)$. When these
conditions match, the Bayes risk with respect to the zero-one loss scales no
faster than $\Omega(d_\mathrm{vc}/n)$, which matches known outer bounds and
minimax lower bounds up to logarithmic factors. We also consider the impact of
label noise, providing lower bounds when training and/or test samples are
corrupted.
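As a rough sketch of how the two bounds fit together (the notation $R_W(\epsilon)$ for the rate-distortion function of the parameters $W$ at Bayes-risk level $\epsilon$ is assumed here for illustration; the precise conditions and constants are in the paper): learning a model to within Bayes risk $\epsilon$ requires the training samples $Z^n$ to carry at least $R_W(\epsilon)$ bits of information about $W$, while for a VC class the mutual information is at most on the order of $d_\mathrm{vc}\log(n)$ bits, so that
$$ R_W(\epsilon) \;\le\; I(Z^n; W) \;\le\; d_\mathrm{vc}\log(n). $$
When the (roughly) lower Lipschitz condition supplies a matching lower bound on $R_W(\epsilon)$, inverting this chain yields the $\Omega(d_\mathrm{vc}/n)$ scaling of the zero-one Bayes risk, up to logarithmic factors.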
Related papers
- Effect of Correlated Errors on Quantum Memory [1.3198143828338362]
We introduce a classical correlation model based on hidden random fields for modeling non-i.i.d. errors with long-range correlations.
We show that this proposed model can capture certain correlation patterns not captured by the joint (system and bath) Hamiltonian model with pairwise terms.
arXiv Detail & Related papers (2024-08-16T14:59:10Z) - Smoothly Giving up: Robustness for Simple Models [30.56684535186692]
Examples of algorithms to train such models include logistic regression and boosting.
We use a tunable family of jointly convex loss functions, which interpolates between canonical convex loss functions, to robustly train such models.
We also provide results on a COVID-19 dataset for logistic regression and boosting, highlighting the efficacy of the approach across multiple relevant domains.
arXiv Detail & Related papers (2023-02-17T19:48:11Z) - How Does Pseudo-Labeling Affect the Generalization Error of the
Semi-Supervised Gibbs Algorithm? [73.80001705134147]
We provide an exact characterization of the expected generalization error (gen-error) for semi-supervised learning (SSL) with pseudo-labeling via the Gibbs algorithm.
The gen-error is expressed in terms of the symmetrized KL information between the output hypothesis, the pseudo-labeled dataset, and the labeled dataset.
arXiv Detail & Related papers (2022-10-15T04:11:56Z) - Learning Summary Statistics for Bayesian Inference with Autoencoders [58.720142291102135]
We use the inner dimension of deep neural network based Autoencoders as summary statistics.
To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information that has been used to generate the training data.
arXiv Detail & Related papers (2022-01-28T12:00:31Z) - Inverting brain grey matter models with likelihood-free inference: a
tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - Evaluating State-of-the-Art Classification Models Against Bayes
Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks offer improved accuracy and a significant reduction in memory consumption.
However, they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning [15.544041797200045]
Minimum Excess Risk (MER) in Bayesian learning is defined as the difference between the minimum expected loss achievable when learning from data and the minimum expected loss that could be achieved if the underlying parameter $W$ were observed (a schematic formula is sketched after this list).
We derive information-theoretic bounds on the difference between these upper and lower bounds and show that they can provide order-wise tight rates for MER.
arXiv Detail & Related papers (2021-05-10T08:14:10Z) - Minimum Excess Risk in Bayesian Learning [23.681494934015927]
We analyze the best achievable performance of Bayesian learning under generative models by defining and upper-bounding the minimum excess risk (MER).
The definition of MER provides a principled way to define different notions of uncertainties in Bayesian learning.
arXiv Detail & Related papers (2020-12-29T17:41:30Z) - Probing Model Signal-Awareness via Prediction-Preserving Input
Minimization [67.62847721118142]
We evaluate models' ability to capture the correct vulnerability signals to produce their predictions.
We measure the signal awareness of models using a new metric we propose: Signal-aware Recall (SAR).
The results show a sharp drop in the model's Recall from the high 90s to sub-60s with the new metric.
arXiv Detail & Related papers (2020-11-25T20:05:23Z) - Modelling and Quantifying Membership Information Leakage in Machine
Learning [14.095523601311374]
We show that complex models, such as deep neural networks, are more susceptible to membership inference attacks.
We show that the amount of membership information leakage is reduced by $\mathcal{O}(\log^{1/2}(\delta^{-1})\epsilon^{-1})$ when using Gaussian $(\epsilon,\delta)$-differentially-private additive noise.
arXiv Detail & Related papers (2020-01-29T00:42:08Z)
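For reference, a schematic form of the MER definition used in the two MER-related entries above (the notation $Z^n$ for the training data, $(X, Y)$ for a fresh sample, $\ell$ for the loss, and decision rules $\psi, \phi$ is assumed here, not taken verbatim from those abstracts):
$$ \mathrm{MER} \;=\; \inf_{\psi}\, \mathbb{E}\big[\ell\big(Y, \psi(X, Z^n)\big)\big] \;-\; \inf_{\phi}\, \mathbb{E}\big[\ell\big(Y, \phi(X, W)\big)\big], $$
i.e., the gap between the best expected loss attainable from the data alone and the best expected loss attainable if the generative parameter $W$ itself were observed.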
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.