PSD Representations for Effective Probability Models
- URL: http://arxiv.org/abs/2106.16116v2
- Date: Thu, 1 Jul 2021 13:41:16 GMT
- Title: PSD Representations for Effective Probability Models
- Authors: Alessandro Rudi and Carlo Ciliberto
- Abstract summary: We show that a recently proposed class of positive semi-definite (PSD) models for non-negative functions is particularly suited to this end.
We characterize both approximation and generalization capabilities of PSD models, showing that they enjoy strong theoretical guarantees.
Our results open the way to applications of PSD models to density estimation, decision theory and inference.
- Score: 117.35298398434628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding a good way to model probability densities is key to probabilistic
inference. An ideal model should be able to concisely approximate any
probability, while also being compatible with two main operations:
multiplication of two models (product rule) and marginalization with respect
to a subset of the random variables (sum rule). In this work, we show that a
recently proposed class of positive semi-definite (PSD) models for non-negative
functions is particularly suited to this end. In particular, we characterize
both approximation and generalization capabilities of PSD models, showing that
they enjoy strong theoretical guarantees. Moreover, we show that both the sum
and product rules can be performed efficiently in closed form via matrix
operations, offering the same versatility as mixture models. Our results open the way to
applications of PSD models to density estimation, decision theory and
inference. Preliminary empirical evaluation supports our findings.
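As a concrete illustration of the closed-form product rule, the sketch below is a minimal numerical example, not code from the paper; the Gaussian kernel, anchor points, and dimensions are illustrative assumptions. A PSD model represents a non-negative function as f(x) = v(x)^T M v(x) with M positive semi-definite, and the product of two such models is again a PSD model whose coefficient matrix is the Kronecker product M ⊗ N:

```python
import numpy as np

rng = np.random.default_rng(0)

# Anchor points for the feature maps v(x)_i = k(x, x_i), w(x)_j = k(x, z_j).
X = rng.normal(size=(5, 1))   # anchors of model f (illustrative)
Z = rng.normal(size=(4, 1))   # anchors of model g (illustrative)

def k(x, anchors, bandwidth=1.0):
    # Gaussian kernel between a point x and a set of anchor points.
    d2 = np.sum((x - anchors) ** 2, axis=1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

# PSD coefficient matrices M = A A^T and N = B B^T (PSD by construction).
A = rng.normal(size=(5, 3)); M = A @ A.T
B = rng.normal(size=(4, 2)); N = B @ B.T

def f(x):  # f(x) = v(x)^T M v(x) >= 0 for every x, since M is PSD
    v = k(x, X)
    return v @ M @ v

def g(x):
    w = k(x, Z)
    return w @ N @ w

# Product rule: f(x) g(x) is again a PSD model, with feature map
# v(x) ⊗ w(x) and coefficient matrix M ⊗ N -- a pure matrix operation.
MN = np.kron(M, N)

def fg(x):
    u = np.kron(k(x, X), k(x, Z))
    return u @ MN @ u

x = np.array([0.3])
assert f(x) >= 0
assert np.isclose(f(x) * g(x), fg(x))
```

The identity behind the last assertion is the Kronecker mixed-product property: (v ⊗ w)^T (M ⊗ N) (v ⊗ w) = (v^T M v)(w^T N w).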
Related papers
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time, with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Out of Distribution Detection, Generalization, and Robustness Triangle with Maximum Probability Theorem [2.0654955576087084]
MPT uses the probability distribution that the models assume on random variables to provide an upper bound on the probability of the model.
We apply MPT to challenging out-of-distribution (OOD) detection problems in computer vision by incorporating MPT as a regularization scheme in training of CNNs and their energy based variants.
arXiv Detail & Related papers (2022-03-23T02:42:08Z)
- Optimal regularizations for data generation with probabilistic graphical models [0.0]
Empirically, well-chosen regularization schemes dramatically improve the quality of the inferred models.
We consider the particular case of L2 and L1 regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models.
arXiv Detail & Related papers (2021-12-02T14:45:16Z)
- Learning PSD-valued functions using kernel sums-of-squares [94.96262888797257]
We introduce a kernel sum-of-squares model for functions that take values in the PSD cone.
We show that it constitutes a universal approximator of PSD functions, and derive eigenvalue bounds in the case of subsampled equality constraints.
We then apply our results to modeling convex functions, by enforcing a kernel sum-of-squares representation of their Hessian.
arXiv Detail & Related papers (2021-11-22T16:07:50Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
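The paper presents its own sampling algorithm; as a generic sketch of the two-step approach (model a non-negative density, then sample from it), the snippet below uses plain rejection sampling from a small illustrative PSD-style model on [0, 1]. The kernel, anchors, and constants are assumptions for illustration, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Unnormalised non-negative target on [0, 1]: a small PSD-style model
# q(x) = v(x)^T M v(x) with Gaussian-kernel features (illustrative choice).
anchors = np.array([0.2, 0.5, 0.8])
A = rng.normal(size=(3, 2))
M = A @ A.T  # PSD by construction, so q(x) >= 0 everywhere

def q(x):
    v = np.exp(-(x - anchors) ** 2 / 0.02)
    return v @ M @ v

# Generic rejection sampler: propose uniformly on [0, 1], accept with
# probability q(x)/c, where c upper-bounds q (estimated here on a grid).
grid = np.linspace(0.0, 1.0, 512)
c = max(q(x) for x in grid) * 1.05

def sample(n):
    out = []
    while len(out) < n:
        x = rng.uniform()
        if rng.uniform() < q(x) / c:
            out.append(x)
    return np.array(out)

samples = sample(1000)
```

The accepted points are distributed according to q normalised to a density; the efficiency depends on how tight the envelope constant c is.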
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
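PVI covers a family of update schemes; a well-known instance of a kernel repulsion update is Stein variational gradient descent (SVGD). The sketch below is a generic SVGD-style step for a standard-normal target, not the specific method of this paper; the target, kernel, step size, and particle count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Target: standard normal, so the score is grad log p(x) = -x.
def score(x):
    return -x

def rbf(x, y, h=1.0):
    # Gaussian (RBF) kernel between two particles.
    return np.exp(-np.sum((x - y) ** 2) / (2 * h * h))

def svgd_step(particles, step=0.1, h=1.0):
    # Each particle is pulled toward high density by the kernel-weighted
    # score, and pushed away from its neighbours by the kernel gradient
    # (the repulsion term that keeps the ensemble diverse).
    n = len(particles)
    new = particles.copy()
    for i in range(n):
        phi = 0.0
        for j in range(n):
            kij = rbf(particles[j], particles[i], h)
            grad_k = (particles[i] - particles[j]) / (h * h) * kij  # ∇_{x_j} k(x_j, x_i)
            phi += kij * score(particles[j]) + grad_k
        new[i] = particles[i] + step * phi / n
    return new

# Start far from the target; the ensemble drifts toward it while the
# repulsion term prevents the particles from collapsing to a point.
particles = rng.normal(loc=5.0, scale=0.5, size=(20, 1))
for _ in range(200):
    particles = svgd_step(particles)
```

After the loop the particle cloud approximates the standard normal: its mean is near 0 and its spread stays strictly positive thanks to the repulsion term.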
arXiv Detail & Related papers (2021-06-09T12:13:51Z)
- Probabilistic Generating Circuits [50.98473654244851]
We propose probabilistic generating circuits (PGCs) for their efficient representation.
PGCs are not just a theoretical framework that unifies vastly different existing models, but also show huge potential in modeling realistic data.
We exhibit a simple class of PGCs that are not trivially subsumed by simple combinations of PCs and DPPs, and obtain competitive performance on a suite of density estimation benchmarks.
arXiv Detail & Related papers (2021-02-19T07:06:53Z)
- Referenced Thermodynamic Integration for Bayesian Model Selection: Application to COVID-19 Model Selection [1.9599274203282302]
We show how to compute the ratio of two models' normalising constants, known as the Bayes factor.
In this paper we apply a variation of the TI method, referred to as referenced TI, which computes a single model's normalising constant in an efficient way.
The approach is shown to be useful in practice when applied to a real problem - to perform model selection for a semi-mechanistic hierarchical Bayesian model of COVID-19 transmission in South Korea.
arXiv Detail & Related papers (2020-09-08T16:32:06Z)
- Probability Link Models with Symmetric Information Divergence [1.5749416770494706]
Two general classes of link models are proposed.
The first model links two survival functions and is applicable to models such as the proportional odds and change point.
The second model links two cumulative probability distribution functions.
arXiv Detail & Related papers (2020-08-10T19:49:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the listed information and is not responsible for any consequences of its use.