Far from Asymptopia
- URL: http://arxiv.org/abs/2205.03343v2
- Date: Thu, 30 Mar 2023 20:21:26 GMT
- Title: Far from Asymptopia
- Authors: Michael C. Abbott and Benjamin B. Machta
- Abstract summary: Inference from limited data requires a notion of measure on parameter space, most explicit in the Bayesian framework as a prior.
Here we demonstrate that Jeffreys prior, the best-known uninformative choice, introduces enormous bias when applied to typical scientific models.
We present results on a principled choice of measure which avoids this issue, leading to unbiased inference in complex models.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inference from limited data requires a notion of measure on parameter space,
most explicit in the Bayesian framework as a prior. Here we demonstrate that
Jeffreys prior, the best-known uninformative choice, introduces enormous bias
when applied to typical scientific models. Such models have a relevant
effective dimensionality much smaller than the number of microscopic
parameters. Because Jeffreys prior treats all microscopic parameters equally,
it is far from uniform when projected onto the sub-space of relevant parameters,
due to variations in the local co-volume of irrelevant directions. We present
results on a principled choice of measure which avoids this issue, leading to
unbiased inference in complex models. This optimal prior depends on the
quantity of data to be gathered, and approaches Jeffreys prior in the
asymptotic limit. However, this limit cannot be justified without an impossibly
large amount of data, exponential in the number of microscopic parameters.
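For reference, the Jeffreys prior discussed above is, by its standard definition (stated here for orientation, not quoted from the paper), the measure proportional to the square root of the determinant of the Fisher information:

\[
\pi_{\mathrm{J}}(\theta) \;\propto\; \sqrt{\det \mathcal{I}(\theta)},
\qquad
\mathcal{I}_{ij}(\theta) \;=\; \mathbb{E}_{x \sim p(x\mid\theta)}\!\left[\,\partial_{\theta_i} \log p(x\mid\theta)\;\partial_{\theta_j} \log p(x\mid\theta)\,\right].
\]

Because this measure weights all microscopic parameter directions by the same Fisher volume element, its projection onto the few relevant parameters inherits the variations in the co-volume of irrelevant directions described in the abstract.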
Related papers
- A Metropolis-Adjusted Langevin Algorithm for Sampling Jeffreys Prior [5.500172106704342]
Inference and estimation are fundamental aspects of statistics, system identification and machine learning.
Jeffreys prior is an appealing uninformative prior because it offers two important benefits.
We propose a general sampling scheme using the Metropolis-Adjusted Langevin Algorithm; a generic MALA sketch appears after this list.
arXiv Detail & Related papers (2025-04-08T18:44:33Z) - Generalized Grade-of-Membership Estimation for High-dimensional Locally Dependent Data [6.626575011678484]
Mixed membership models are widely used for analyzing survey responses and population genetics data.
Existing approaches, such as Bayesian MCMC inference, are not scalable and lack theoretical guarantees in high-dimensional settings.
We introduce a novel and simple approach that flattens the three-way quasi-tensor into a "fat" matrix and then performs a singular value decomposition of it to estimate parameters.
arXiv Detail & Related papers (2024-12-27T18:51:15Z) - Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of three optimizers, four parameterizations, and a wide range of learning rates and model sizes.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z) - Should We Learn Most Likely Functions or Parameters? [51.133793272222874]
We investigate the benefits and drawbacks of directly estimating the most likely function implied by the model and the data.
We find that function-space MAP estimation can lead to flatter minima, better generalization, and improved robustness to overfitting.
arXiv Detail & Related papers (2023-11-27T16:39:55Z) - Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood
Estimation [2.53740603524637]
We develop a class of interacting particle systems for implementing a maximum marginal likelihood estimation procedure.
In particular, we prove that the parameter marginal of the stationary measure of this diffusion has the form of a Gibbs measure.
Using a particular rescaling, we then prove geometric ergodicity of this system and bound the discretisation error in a manner that is uniform in time and does not increase with the number of particles.
arXiv Detail & Related papers (2023-03-23T16:50:08Z) - The Choice of Noninformative Priors for Thompson Sampling in
Multiparameter Bandit Models [56.31310344616837]
Thompson sampling (TS) has been known for its outstanding empirical performance supported by theoretical guarantees across various reward models.
This study explores the impact of selecting noninformative priors, offering insights into the performance of TS when dealing with new models that lack theoretical understanding.
arXiv Detail & Related papers (2023-02-28T08:42:42Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - A Universal Law of Robustness via Isoperimetry [1.484852576248587]
We show that smooth interpolation requires $d$ times more parameters than mere interpolation, where $d$ is the ambient data dimension.
We prove this universal law of robustness for any smoothly parametrized function class with polynomial-size weights.
arXiv Detail & Related papers (2021-05-26T19:49:47Z) - On the minmax regret for statistical manifolds: the role of curvature [68.8204255655161]
Two-part codes and the minimum description length have been successful in delivering procedures to single out the best models.
We derive a sharper expression than the standard one given by the complexity, where the scalar curvature of the Fisher information metric plays a dominant role.
arXiv Detail & Related papers (2020-07-06T17:28:19Z) - Predictive Complexity Priors [3.5547661483076998]
We propose a functional prior that is defined by comparing the model's predictions to those of a reference model.
Although originally defined on the model outputs, we transfer the prior to the model parameters via a change of variables.
We apply our predictive complexity prior to high-dimensional regression, reasoning over neural network depth, and sharing of statistical strength for few-shot learning.
arXiv Detail & Related papers (2020-06-18T18:39:49Z) - Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in
High Dimensions [41.7567932118769]
Empirical Risk Minimization algorithms are widely used in a variety of estimation and prediction tasks.
In this paper, we characterize for the first time the fundamental limits on the statistical accuracy of convex ERM for inference.
arXiv Detail & Related papers (2020-06-16T04:27:38Z) - Optimal statistical inference in the presence of systematic
uncertainties using neural network optimization based on binned Poisson
likelihoods with nuisance parameters [0.0]
This work presents a novel strategy that uses neural networks to construct a dimensionality reduction for feature engineering.
We discuss how this approach results in an estimate of the parameters of interest that is close to optimal.
arXiv Detail & Related papers (2020-03-16T13:27:18Z) - Implicit differentiation of Lasso-type models for hyperparameter
optimization [82.73138686390514]
We introduce an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems.
Our approach scales to high-dimensional data by leveraging the sparsity of the solutions.
arXiv Detail & Related papers (2020-02-20T18:43:42Z)
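Since the first related paper above centres on the Metropolis-Adjusted Langevin Algorithm, a minimal generic sketch of MALA is included here for orientation. It is not the authors' implementation; `log_prob` and `grad_log_prob` are placeholder callables for whatever target log-density (e.g. an unnormalised Jeffreys-type prior) one wishes to sample.

```python
import numpy as np

def mala_sample(log_prob, grad_log_prob, x0, n_steps=5000, step=1e-2, rng=None):
    """Metropolis-Adjusted Langevin Algorithm: a generic sketch.

    log_prob / grad_log_prob are placeholders for the target log-density
    (up to an additive constant) and its gradient.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    samples = []

    def log_q(to, frm):
        # Log-density of the Langevin proposal kernel q(to | frm):
        # a Gaussian centred at the drifted point (constants cancel in the ratio).
        mean = frm + 0.5 * step * grad_log_prob(frm)
        return -np.sum((to - mean) ** 2) / (2.0 * step)

    for _ in range(n_steps):
        # Langevin proposal: gradient drift plus Gaussian noise.
        x_prop = x + 0.5 * step * grad_log_prob(x) + np.sqrt(step) * rng.normal(size=x.shape)

        # Metropolis-Hastings correction for the asymmetric proposal.
        log_alpha = (log_prob(x_prop) + log_q(x, x_prop)
                     - log_prob(x) - log_q(x_prop, x))
        if np.log(rng.uniform()) < log_alpha:
            x = x_prop
        samples.append(x.copy())

    return np.array(samples)
```

As a quick sanity check, `mala_sample(lambda x: -0.5 * x @ x, lambda x: -x, np.zeros(2))` should produce draws resembling a standard bivariate normal.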