Occam's Ghost
- URL: http://arxiv.org/abs/2006.09813v1
- Date: Mon, 15 Jun 2020 20:25:09 GMT
- Title: Occam's Ghost
- Authors: Peter Kövesarki
- Abstract summary: Minimizing the total bit requirement of a model of a dataset favors smaller derivatives, smoother probability density function estimates and, most importantly, a phase space with fewer relevant parameters.
It is also shown how the method can be applied to any smooth, non-parametric probability density estimator.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article applies the principle of Occam's Razor to non-parametric model
building of statistical data by finding a model with the minimal number of
bits, leading to an exceptionally effective regularization method for
probability density estimators. The idea comes from the fact that likelihood
maximization also minimizes the number of bits required to encode a dataset.
However, traditional methods overlook that the optimization of model parameters
may also inadvertently play a part in encoding data points. The article shows
how to extend the bit counting to the model parameters as well, providing the
first true measure of complexity for parametric models. Minimizing the total
bit requirement of a model of a dataset favors smaller derivatives, smoother
probability density function estimates and, most importantly, a phase space with
fewer relevant parameters. In fact, it is able to prune parameters and detect
features with small probability at the same time. It is also shown how it can
be applied to any smooth, non-parametric probability density estimator.
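To make the mechanism concrete, below is a minimal sketch of the classical two-part MDL scoring that this bit-counting generalizes: the score charges the negative log-likelihood of the data in bits plus roughly (1/2)*log2(n) bits per model parameter, and the model order with the fewest total bits wins. The Gaussian-mixture family, the penalty form, and all settings are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def total_bits(params, data, k):
    """Two-part MDL score: bits to encode the data under a k-component
    Gaussian mixture, plus roughly (1/2)*log2(n) bits per parameter."""
    n = len(data)
    w = np.abs(params[:k]) + 1e-12
    w = w / w.sum()                            # mixture weights
    means, log_sd = params[k:2 * k], params[2 * k:3 * k]
    dens = sum(wi * norm.pdf(data, mi, np.exp(si))
               for wi, mi, si in zip(w, means, log_sd))
    data_bits = -np.sum(np.log2(dens + 1e-300))
    param_bits = 0.5 * (3 * k) * np.log2(n)    # coding cost of the parameters
    return data_bits + param_bits

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 1.0, 300)])

# A third component should cost more parameter bits than it saves in data
# bits, so k=2 is expected to attain the smallest total.
for k in (1, 2, 3):
    x0 = np.concatenate([np.ones(k), np.linspace(-3, 3, k), np.zeros(k)])
    res = minimize(total_bits, x0, args=(data, k), method="Nelder-Mead",
                   options={"maxiter": 5000})
    print(f"k={k}: total bits = {res.fun:.1f}")
```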
Related papers
- Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of the optimizers, parameterizations, learning rates, and model sizes studied.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z)
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- A Statistical Decision-Theoretical Perspective on the Two-Stage Approach to Parameter Estimation [7.599399338954307]
The Two-Stage (TS) approach can be applied to obtain reliable parametric estimates.
We show how to apply the TS approach on models for independent and identically distributed samples.
arXiv Detail & Related papers (2022-03-31T18:19:47Z)
- Learning Summary Statistics for Bayesian Inference with Autoencoders [58.720142291102135]
We use the inner dimension of deep neural network based Autoencoders as summary statistics.
To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information that has been used to generate the training data.
arXiv Detail & Related papers (2022-01-28T12:00:31Z)
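For intuition, here is a hedged sketch of the incentive the entry above describes: data are simulated as x = f(theta, eps), the encoder sees only x, and the decoder receives the bottleneck output together with the generative noise eps, so reconstruction succeeds only when the bottleneck carries the theta-related information. The toy simulator, layer sizes, and training budget are assumptions for illustration, not the paper's setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def simulate(n):
    """Toy simulator: 20 noisy observations of a scalar parameter theta."""
    theta = torch.rand(n, 1) * 4 - 2          # parameter of interest
    eps = torch.randn(n, 20)                  # generative noise
    return theta + 0.5 * eps, eps, theta      # x, noise, ground truth

enc = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 1))
dec = nn.Sequential(nn.Linear(1 + 20, 32), nn.ReLU(), nn.Linear(32, 20))
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)

for _ in range(3000):
    x, eps, _ = simulate(256)
    s = enc(x)                                # candidate summary statistic
    x_hat = dec(torch.cat([s, eps], dim=1))   # decoder also gets the noise
    loss = ((x - x_hat) ** 2).mean()          # plain reconstruction error
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    x, _, theta = simulate(1000)
    s = enc(x).squeeze()
    corr = torch.corrcoef(torch.stack([s, theta.squeeze()]))[0, 1]
print(f"correlation between learned summary and theta: {corr.item():.3f}")
```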
- Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning [52.624194343095304]
We argue that analyzing fine-tuning through the lens of intrinsic dimension provides us with empirical and theoretical intuitions.
We empirically show that common pre-trained models have a very low intrinsic dimension.
arXiv Detail & Related papers (2020-12-22T07:42:30Z)
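The measurement device behind such intrinsic-dimension claims (introduced by Li et al., 2018) can be sketched compactly: all weights are reached as theta = theta0 + P @ d with P a frozen random projection, and only the k-dimensional vector d is trained; the smallest k that recovers most of the full model's performance is the intrinsic dimension. The toy task, network sizes, and budget below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X = torch.randn(512, 2)
y = ((X[:, 0] * X[:, 1]) > 0).float()        # XOR-like 2-D task

shapes = [(2, 32), (32,), (32, 1), (1,)]     # W1, b1, W2, b2
D = sum(torch.Size(s).numel() for s in shapes)
theta0 = 0.1 * torch.randn(D)                # frozen starting weights
k = 20
P = torch.randn(D, k) / k ** 0.5             # frozen random projection
d = torch.zeros(k, requires_grad=True)       # the only trainable parameters

def forward(x, theta):
    """Unpack the flat parameter vector and run the two-layer MLP."""
    parts, i = [], 0
    for s in shapes:
        n = torch.Size(s).numel()
        parts.append(theta[i:i + n].view(s)); i += n
    W1, b1, W2, b2 = parts
    return (torch.tanh(x @ W1 + b1) @ W2 + b2).squeeze(-1)

opt = torch.optim.Adam([d], lr=0.05)
for _ in range(2000):
    theta = theta0 + P @ d                   # weights live on a k-dim slice
    loss = F.binary_cross_entropy_with_logits(forward(X, theta), y)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    acc = ((forward(X, theta0 + P @ d) > 0).float() == y).float().mean()
print(f"accuracy with k={k} trainable parameters (full model has {D}): {acc.item():.3f}")
```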
- A new method for parameter estimation in probabilistic models: Minimum probability flow [26.25482738732648]
We propose a new parameter fitting method, Minimum Probability Flow (MPF), which is applicable to any parametric model.
We demonstrate parameter estimation using MPF in two cases: a continuous state space model, and an Ising spin glass.
arXiv Detail & Related papers (2020-07-17T21:19:44Z)
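For intuition on the MPF objective in the entry above: with single-spin-flip connectivity the objective is K(J) = E_data[sum_i exp((E(x) - E(x^(i)))/2)], where x^(i) is x with spin i flipped; for the Ising energy E(x) = -1/2 x^T J x each term reduces to exp(-x_i (Jx)_i). Below is a hedged numpy sketch on a model small enough to sample exactly; the dimensions, step size, and iteration counts are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 2000

# Ground-truth couplings and exact Gibbs samples (brute force over 2^8 states).
J_true = rng.normal(scale=0.5, size=(d, d))
J_true = (J_true + J_true.T) / 2
np.fill_diagonal(J_true, 0.0)
states = np.array([[1.0 if b & (1 << i) else -1.0 for i in range(d)]
                   for b in range(2 ** d)])
logp = 0.5 * np.einsum('bi,ij,bj->b', states, J_true, states)
p = np.exp(logp - logp.max()); p /= p.sum()
data = states[rng.choice(len(states), size=n, p=p)]

# Gradient descent on the MPF objective K(J) = mean_x sum_i exp(-x_i (Jx)_i);
# dK/dJ_ab = -mean[exp(-x_a (Jx)_a) * x_a * x_b], symmetrized, zero diagonal.
J = np.zeros((d, d))
for _ in range(2000):
    field = data @ J                      # (Jx)_i for every sample
    terms = np.exp(-data * field)         # one MPF term per (sample, spin)
    grad = -(terms * data).T @ data / n
    grad = (grad + grad.T) / 2
    np.fill_diagonal(grad, 0.0)
    J -= 0.2 * grad
print("mean |J - J_true|:", np.abs(J - J_true).mean())
```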
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in High Dimensions [41.7567932118769]
Empirical Risk Minimization algorithms are widely used in a variety of estimation and prediction tasks.
In this paper, we characterize for the first time the fundamental limits on the statistical accuracy of convex ERM for inference.
arXiv Detail & Related papers (2020-06-16T04:27:38Z)
- SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
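The core device in SUMO is randomized truncation (a Russian-roulette estimator): an infinite telescoping series whose limit is the log marginal likelihood is cut at a random level K, and each kept term is re-weighted by 1/P(K >= k), which leaves the expectation equal to the full sum. The sketch below checks that unbiasedness on a toy series with a known limit; the series and the truncation distribution are stand-ins for the IWAE-based terms the paper actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy series: sum_{k>=1} r^k / k = -log(1 - r), a known limit.
r = 0.7
true_sum = -np.log(1 - r)

# Truncation level K ~ Geometric(q), so P(K >= k) = (1-q)^(k-1). q must be
# small enough that the re-weighted terms still decay (finite variance).
q = 0.2

def randomized_truncation_estimate():
    K = rng.geometric(q)
    ks = np.arange(1, K + 1)
    terms = r ** ks / ks
    tail_prob = (1 - q) ** (ks - 1)       # P(K >= k)
    return np.sum(terms / tail_prob)      # unbiased for the infinite sum

est = np.array([randomized_truncation_estimate() for _ in range(100_000)])
print(f"true sum {true_sum:.4f}  vs  estimator mean {est.mean():.4f}")
```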
- Optimal statistical inference in the presence of systematic uncertainties using neural network optimization based on binned Poisson likelihoods with nuisance parameters [0.0]
This work presents a novel strategy for constructing a dimensionality reduction with neural networks for feature engineering.
We discuss how this approach results in an estimate of the parameters of interest that is close to optimal.
arXiv Detail & Related papers (2020-03-16T13:27:18Z)