Parsimonious Inference
- URL: http://arxiv.org/abs/2103.02165v1
- Date: Wed, 3 Mar 2021 04:13:14 GMT
- Title: Parsimonious Inference
- Authors: Jed A. Duersch and Thomas A. Catanach
- Abstract summary: Parsimonious inference is an information-theoretic formulation of inference over arbitrary architectures.
Our approaches combine efficient encodings with prudent sampling strategies to construct predictive ensembles without cross-validation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian inference provides a uniquely rigorous approach to obtain principled
justification for uncertainty in predictions, yet it is difficult to articulate
suitably general prior belief in the machine learning context, where
computational architectures are pure abstractions subject to frequent
modifications by practitioners attempting to improve results. Parsimonious
inference is an information-theoretic formulation of inference over arbitrary
architectures that formalizes Occam's Razor; we prefer simple and sufficient
explanations. Our universal hyperprior assigns plausibility to prior
descriptions, encoded as sequences of symbols, by expanding on the core
relationships between program length, Kolmogorov complexity, and Solomonoff's
algorithmic probability. We then cast learning as information minimization over
our composite change in belief when an architecture is specified, training data
are observed, and model parameters are inferred. By distinguishing model
complexity from prediction information, our framework also quantifies the
phenomenon of memorization.
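For context, a minimal sketch of these relationships in illustrative notation (the symbols and the particular decomposition below are assumptions for exposition, not the paper's exact formulation):

```latex
% Illustrative notation only: \ell(M) is the length in bits of a prefix-free
% description of the architecture/prior M, U is a universal prefix machine,
% \theta are model parameters, and \mathcal{D} is the training data.
\begin{align*}
  p(M) &\propto 2^{-\ell(M)}
      && \text{description-length prior (Occam's Razor)} \\
  P_U(x) &= \sum_{p\,:\,U(p)=x} 2^{-|p|}
      && \text{Solomonoff's algorithmic probability} \\
  \mathcal{L}(M, q) &= \ell(M)\,\ln 2
      + D_{\mathrm{KL}}\!\big(q(\theta)\,\|\,p(\theta \mid M)\big)
      - \mathbb{E}_{q}\big[\ln p(\mathcal{D} \mid \theta, M)\big]
      && \text{information to be minimized}
\end{align*}
% The KL term measures model complexity; weighing it against the gain in data fit
% (prediction information) is one way to quantify memorization.
```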
Although our theory is general, it is most critical when datasets are
limited, e.g. small or skewed. We develop novel algorithms for polynomial
regression and random forests that are suitable for such data, as demonstrated
by our experiments. Our approaches combine efficient encodings with prudent
sampling strategies to construct predictive ensembles without cross-validation,
thus addressing a fundamental challenge in how to efficiently obtain
predictions from data.
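As a rough sketch of a cross-validation-free ensemble in this spirit (the function names, the BIC-style penalty, and the weighting scheme below are illustrative assumptions, not the paper's algorithm), candidate polynomial degrees can be scored by a complexity-penalized likelihood and their predictions averaged:

```python
import numpy as np

# Hypothetical sketch: weight polynomial models of different degree by an
# MDL/BIC-style score (fit minus complexity penalty) instead of cross-validation,
# then predict with the resulting ensemble. This is NOT the paper's algorithm.

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns coefficients and residual variance."""
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    sigma2 = max(np.mean(resid ** 2), 1e-12)
    return coeffs, sigma2

def parsimony_weights(x, y, degrees):
    """Assign each degree a weight proportional to exp(log-likelihood - penalty)."""
    n = len(x)
    scores = []
    for d in degrees:
        _, sigma2 = fit_polynomial(x, y, d)
        log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
        penalty = 0.5 * (d + 1) * np.log(n)      # BIC-style description cost
        scores.append(log_lik - penalty)
    scores = np.array(scores)
    w = np.exp(scores - scores.max())
    return w / w.sum()

def ensemble_predict(x, y, x_new, degrees=(0, 1, 2, 3, 4, 5)):
    """Average the candidate polynomials' predictions under the parsimony weights."""
    w = parsimony_weights(x, y, degrees)
    preds = np.stack([np.polyval(np.polyfit(x, y, d), x_new) for d in degrees])
    return w @ preds

# Example usage on a small synthetic dataset.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 - 2.0 * x + 0.5 * x ** 2 + 0.1 * rng.standard_normal(x.size)
print(ensemble_predict(x, y, np.array([-0.5, 0.0, 0.5])))
```

The log n per-parameter penalty here stands in for an explicit encoding cost; the paper develops more careful encodings and sampling strategies for both polynomial regression and random forests.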
Related papers
- Minimum Description Length and Generalization Guarantees for
Representation Learning [16.2444595840653]
This paper presents a framework that allows us to derive upper bounds on the generalization error of a representation learning algorithm.
Rather than the mutual information between the encoder's input and the representation, our new bounds involve the "multi-letter" relative entropy.
To the best of the authors' knowledge, the established generalization bounds are the first of their kind for Information Bottleneck (IB) type encoders and representation learning (for context, a classical single-hypothesis Occam bound is sketched after this list).
arXiv Detail & Related papers (2024-02-05T18:12:28Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or distributions that form a complex mixture.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the resulting tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Probabilistic Dataset Reconstruction from Interpretable Models [8.31111379034875]
We show that optimal interpretable models are often more compact and leak less information regarding their training data than greedily-built ones.
arXiv Detail & Related papers (2023-08-29T08:10:09Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors [58.340159346749964]
We propose a new neural-symbolic method to support end-to-end learning using complex queries with provable reasoning capability.
We develop a new dataset containing ten new types of queries with features that have never been considered.
Our method significantly outperforms previous methods on the new dataset and also surpasses them on the existing dataset.
arXiv Detail & Related papers (2023-04-14T11:35:35Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Hybrid Predictive Coding: Inferring, Fast and Slow [62.997667081978825]
We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner.
We demonstrate that our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense.
arXiv Detail & Related papers (2022-04-05T12:52:45Z) - A Simplicity Bubble Problem in Formal-Theoretic Learning Systems [1.7996150751268578]
We show that current approaches to machine learning can always be deceived, naturally or artificially, by sufficiently large datasets.
We discuss the framework and additional empirical conditions to be met in order to circumvent this deceptive phenomenon.
arXiv Detail & Related papers (2021-12-22T23:44:47Z) - Learning Output Embeddings in Structured Prediction [73.99064151691597]
A powerful and flexible approach to structured prediction consists in embedding the structured objects to be predicted into a feature space of possibly infinite dimension.
A prediction in the original space is computed by solving a pre-image problem.
In this work, we propose to jointly learn a finite approximation of the output embedding and the regression function into the new feature space.
arXiv Detail & Related papers (2020-07-29T09:32:53Z) - Random thoughts about Complexity, Data and Models [0.0]
Data science and machine learning have been growing strongly for the past decade.
We investigate the subtle relation between "data and models".
A key issue in appraising the relation between algorithmic complexity and algorithmic learning concerns the concepts of compressibility, determinism, and predictability.
arXiv Detail & Related papers (2020-04-16T14:27:22Z)
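For context on the description-length-to-generalization connection raised by the Minimum Description Length entry above, this is the classical single-hypothesis Occam bound (a standard textbook result, not the multi-letter bounds of that paper): with a prefix-free codelength ℓ(h) for each hypothesis h, a loss bounded in [0, 1], and n i.i.d. samples, with probability at least 1 − δ,

```latex
% Classical Occam / MDL generalization bound (textbook result, not the cited paper's bound).
% R(h): true risk, \hat{R}(h): empirical risk, \ell(h): prefix codelength of h, n: sample size.
\[
  \forall h:\qquad R(h) \;\le\; \hat{R}(h)
    \;+\; \sqrt{\frac{\ell(h)\,\ln 2 + \ln(1/\delta)}{2n}} .
\]
```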