Parsimonious Inference
- URL: http://arxiv.org/abs/2103.02165v1
- Date: Wed, 3 Mar 2021 04:13:14 GMT
- Title: Parsimonious Inference
- Authors: Jed A. Duersch and Thomas A. Catanach
- Abstract summary: Parsimonious inference is an information-theoretic formulation of inference over arbitrary architectures.
Our approaches combine efficient encodings with prudent sampling strategies to construct predictive ensembles without cross-validation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian inference provides a uniquely rigorous approach to obtain principled
justification for uncertainty in predictions, yet it is difficult to articulate
suitably general prior belief in the machine learning context, where
computational architectures are pure abstractions subject to frequent
modifications by practitioners attempting to improve results. Parsimonious
inference is an information-theoretic formulation of inference over arbitrary
architectures that formalizes Occam's Razor; we prefer simple and sufficient
explanations. Our universal hyperprior assigns plausibility to prior
descriptions, encoded as sequences of symbols, by expanding on the core
relationships between program length, Kolmogorov complexity, and Solomonoff's
algorithmic probability. We then cast learning as information minimization over
our composite change in belief when an architecture is specified, training data
are observed, and model parameters are inferred. By distinguishing model
complexity from prediction information, our framework also quantifies the
phenomenon of memorization.
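For context, a minimal sketch of these relationships in illustrative notation (the symbols and the particular decomposition below are assumptions for exposition, not the paper's exact formulation):

```latex
% Illustrative notation only: \ell(M) is the length in bits of a prefix-free
% description of the architecture/prior M, U is a universal prefix machine,
% \theta are model parameters, and \mathcal{D} is the training data.
\begin{align*}
  p(M) &\propto 2^{-\ell(M)}
      && \text{description-length prior (Occam's Razor)} \\
  P_U(x) &= \sum_{p\,:\,U(p)=x} 2^{-|p|}
      && \text{Solomonoff's algorithmic probability} \\
  \mathcal{L}(M, q) &= \ell(M)\,\ln 2
      + D_{\mathrm{KL}}\!\big(q(\theta)\,\|\,p(\theta \mid M)\big)
      - \mathbb{E}_{q}\big[\ln p(\mathcal{D} \mid \theta, M)\big]
      && \text{information to be minimized}
\end{align*}
% The KL term measures model complexity; weighing it against the gain in data fit
% (prediction information) is one way to quantify memorization.
```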
Although our theory is general, it is most critical when datasets are
limited, e.g. small or skewed. We develop novel algorithms for polynomial
regression and random forests that are suitable for such data, as demonstrated
by our experiments. Our approaches combine efficient encodings with prudent
sampling strategies to construct predictive ensembles without cross-validation,
thus addressing a fundamental challenge in how to efficiently obtain
predictions from data.
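As a rough sketch of a cross-validation-free ensemble in this spirit (the function names, the BIC-style penalty, and the weighting scheme below are illustrative assumptions, not the paper's algorithm), candidate polynomial degrees can be scored by a complexity-penalized likelihood and their predictions averaged:

```python
import numpy as np

# Hypothetical sketch: weight polynomial models of different degree by an
# MDL/BIC-style score (fit minus complexity penalty) instead of cross-validation,
# then predict with the resulting ensemble. This is NOT the paper's algorithm.

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns coefficients and residual variance."""
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    sigma2 = max(np.mean(resid ** 2), 1e-12)
    return coeffs, sigma2

def parsimony_weights(x, y, degrees):
    """Assign each degree a weight proportional to exp(log-likelihood - penalty)."""
    n = len(x)
    scores = []
    for d in degrees:
        _, sigma2 = fit_polynomial(x, y, d)
        log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
        penalty = 0.5 * (d + 1) * np.log(n)      # BIC-style description cost
        scores.append(log_lik - penalty)
    scores = np.array(scores)
    w = np.exp(scores - scores.max())
    return w / w.sum()

def ensemble_predict(x, y, x_new, degrees=(0, 1, 2, 3, 4, 5)):
    """Average the candidate polynomials' predictions under the parsimony weights."""
    w = parsimony_weights(x, y, degrees)
    preds = np.stack([np.polyval(np.polyfit(x, y, d), x_new) for d in degrees])
    return w @ preds

# Example usage on a small synthetic dataset.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 - 2.0 * x + 0.5 * x ** 2 + 0.1 * rng.standard_normal(x.size)
print(ensemble_predict(x, y, np.array([-0.5, 0.0, 0.5])))
```

The log n per-parameter penalty here stands in for an explicit encoding cost; the paper develops more careful encodings and sampling strategies for both polynomial regression and random forests.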
Related papers
- Minimum Description Length and Generalization Guarantees for
Representation Learning [16.2444595840653]
This paper presents a framework that allows us to derive upper bounds on the generalization error of a representation learning algorithm.
Rather than the mutual information between the encoder's input and the representation, our new bounds involve the "multi-letter" relative entropy.
To the best of the authors' knowledge, the established generalization bounds are the first of their kind for Information Bottleneck (IB) type encoders and representation learning (for context, a classical single-hypothesis Occam bound is sketched after this list).
arXiv Detail & Related papers (2024-02-05T18:12:28Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or distributions that form a complex mixture.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the resulting tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Probabilistic Dataset Reconstruction from Interpretable Models [8.31111379034875]
We show that optimal interpretable models are often more compact and leak less information regarding their training data than greedily-built ones.
arXiv Detail & Related papers (2023-08-29T08:10:09Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors [58.340159346749964]
We propose a new neural-symbolic method to support end-to-end learning using complex queries with provable reasoning capability.
We develop a new dataset containing ten new types of queries with features that have never been considered.
Our method significantly outperforms previous methods on the new dataset and also surpasses them on the existing dataset.
arXiv Detail & Related papers (2023-04-14T11:35:35Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Hybrid Predictive Coding: Inferring, Fast and Slow [62.997667081978825]
We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner.
We demonstrate that our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense.
arXiv Detail & Related papers (2022-04-05T12:52:45Z) - A Simplicity Bubble Problem in Formal-Theoretic Learning Systems [1.7996150751268578]
We show that current approaches to machine learning can always be deceived, naturally or artificially, by sufficiently large datasets.
We discuss the framework and additional empirical conditions to be met in order to circumvent this deceptive phenomenon.
arXiv Detail & Related papers (2021-12-22T23:44:47Z) - Learning Output Embeddings in Structured Prediction [73.99064151691597]
A powerful and flexible approach to structured prediction consists in embedding the structured objects to be predicted into a feature space of possibly infinite dimension.
A prediction in the original space is computed by solving a pre-image problem.
In this work, we propose to jointly learn a finite approximation of the output embedding and the regression function into the new feature space.
arXiv Detail & Related papers (2020-07-29T09:32:53Z) - Random thoughts about Complexity, Data and Models [0.0]
Data science and machine learning have been growing strongly for the past decade.
We investigate the subtle relation between "data and models".
A key issue in appraising the relation between algorithmic complexity and algorithmic learning concerns the concepts of compressibility, determinism, and predictability.
arXiv Detail & Related papers (2020-04-16T14:27:22Z)
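For context on the description-length-to-generalization connection raised by the Minimum Description Length entry above, this is the classical single-hypothesis Occam bound (a standard textbook result, not the multi-letter bounds of that paper): with a prefix-free codelength ℓ(h) for each hypothesis h, a loss bounded in [0, 1], and n i.i.d. samples, with probability at least 1 − δ,

```latex
% Classical Occam / MDL generalization bound (textbook result, not the cited paper's bound).
% R(h): true risk, \hat{R}(h): empirical risk, \ell(h): prefix codelength of h, n: sample size.
\[
  \forall h:\qquad R(h) \;\le\; \hat{R}(h)
    \;+\; \sqrt{\frac{\ell(h)\,\ln 2 + \ln(1/\delta)}{2n}} .
\]
```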