A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models
- URL: http://arxiv.org/abs/2310.05833v2
- Date: Wed, 10 Jul 2024 14:37:50 GMT
- Title: A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models
- Authors: Sebastian G. Gruber, Florian Buettner
- Abstract summary: We introduce the first bias-variance-covariance decomposition for kernel scores.
We derive a kernel-based variance and entropy for uncertainty estimation.
Based on the wide applicability of kernels, we demonstrate our framework via generalization and uncertainty experiments for image, audio, and language generation.
- Score: 13.527864898609398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative models, like large language models, are becoming increasingly relevant in our daily lives, yet a theoretical framework to assess their generalization behavior and uncertainty does not exist. Particularly, the problem of uncertainty estimation is commonly solved in an ad-hoc and task-dependent manner. For example, natural language approaches cannot be transferred to image generation. In this paper, we introduce the first bias-variance-covariance decomposition for kernel scores. This decomposition represents a theoretical framework from which we derive a kernel-based variance and entropy for uncertainty estimation. We propose unbiased and consistent estimators for each quantity which only require generated samples but not the underlying model itself. Based on the wide applicability of kernels, we demonstrate our framework via generalization and uncertainty experiments for image, audio, and language generation. Specifically, kernel entropy for uncertainty estimation is more predictive of performance on CoQA and TriviaQA question answering datasets than existing baselines and can also be applied to closed-source models.
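The abstract's point that the estimators "only require generated samples but not the underlying model itself" can be illustrated with a minimal sketch. It rests on assumptions not stated on this page: that the kernel entropy takes the form of the negative expected kernel value between two independent samples, that it is estimated with a U-statistic over distinct sample pairs, and that an RBF kernel is applied to fixed-size vector representations of the generated outputs. The function names and the `gamma` parameter are hypothetical, and the paper's exact definitions may differ.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix between rows of a and rows of b (assumed kernel choice)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq_dist)


def kernel_entropy_estimate(samples, gamma=1.0):
    """U-statistic estimate of -E[k(X, X')] from generated samples only.

    Assumption: kernel entropy is the negative expected kernel value between
    two independent draws. The diagonal (i == j) terms are excluded so the
    estimate of E[k(X, X')] is unbiased.
    """
    x = np.asarray(samples, dtype=float)
    n = x.shape[0]
    if n < 2:
        raise ValueError("at least two samples are required")
    gram = rbf_kernel(x, x, gamma)
    off_diagonal_sum = gram.sum() - np.trace(gram)
    return -off_diagonal_sum / (n * (n - 1))


# Hypothetical usage: rows stand in for embeddings of generated answers.
concentrated = np.zeros((5, 3))                 # all generations identical
spread = np.array([[0.0], [100.0], [200.0]])    # generations far apart
low_uncertainty = kernel_entropy_estimate(concentrated)
high_uncertainty = kernel_entropy_estimate(spread)
```

Under these assumptions, identical generations give the lowest possible value and widely separated generations give a value near zero, so a higher estimate indicates more disagreement among the samples. Because only the samples enter the computation, the same sketch would apply to outputs of a closed-source model, provided some embedding of the outputs is available.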
Related papers
- Causal modelling without introducing counterfactuals or abstract distributions [7.09435109588801]
In this paper, we construe causal inference as treatment-wise predictions for finite populations where all assumptions are testable.
The new framework highlights the model-dependence of causal claims as well as the difference between statistical and scientific inference.
arXiv Detail & Related papers (2024-07-24T16:07:57Z)
- Fine-Grained Dynamic Framework for Bias-Variance Joint Optimization on Data Missing Not at Random [2.8165314121189247]
In most practical applications such as recommendation systems, display advertising, and so forth, the collected data often contains missing values.
We develop a systematic fine-grained dynamic learning framework to jointly optimize bias and variance.
arXiv Detail & Related papers (2024-05-24T10:07:09Z)
- Deep Evidential Learning for Bayesian Quantile Regression [3.6294895527930504]
It is desirable to have accurate uncertainty estimation from a single deterministic forward-pass model.
This paper proposes a deep Bayesian quantile regression model that can estimate the quantiles of a continuous target distribution without the Gaussian assumption.
arXiv Detail & Related papers (2023-08-21T11:42:16Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Uncertainty Estimates of Predictions via a General Bias-Variance Decomposition [7.811916700683125]
We introduce a bias-variance decomposition for proper scores, giving rise to the Bregman Information as the variance term.
We showcase the practical relevance of this decomposition on several downstream tasks, including model ensembles and confidence regions.
arXiv Detail & Related papers (2022-10-21T21:24:37Z)
- Bayesian Nonlocal Operator Regression (BNOR): A Data-Driven Learning Framework of Nonlocal Models with Uncertainty Quantification [4.705624984585247]
We consider the problem of modeling heterogeneous materials where micro-scale dynamics and interactions affect global behavior.
We develop a Bayesian framework for uncertainty quantification (UQ) in material response prediction when using nonlocal models.
This work is a first step towards statistical characterization of nonlocal model discrepancy in the context of homogenization.
arXiv Detail & Related papers (2022-10-06T22:37:59Z)
- Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z)
- BayesIMP: Uncertainty Quantification for Causal Data Fusion [52.184885680729224]
We study the causal data fusion problem, where datasets pertaining to multiple causal graphs are combined to estimate the average treatment effect of a target variable.
We introduce a framework which combines ideas from probabilistic integration and kernel mean embeddings to represent interventional distributions in the reproducing kernel Hilbert space.
arXiv Detail & Related papers (2021-06-07T10:14:18Z)
- Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z)
- Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.