Likelihood Ratio Exponential Families
- URL: http://arxiv.org/abs/2012.15480v2
- Date: Fri, 15 Jan 2021 06:06:55 GMT
- Title: Likelihood Ratio Exponential Families
- Authors: Rob Brekelmans, Frank Nielsen, Alireza Makhzani, Aram Galstyan, Greg
Ver Steeg
- Abstract summary: We interpret the geometric mixture path as an exponential family of distributions to analyze the thermodynamic variational objective (TVO).
We extend these likelihood ratio exponential families to include solutions to rate-distortion (RD) optimization, the information bottleneck (IB) method, and recent rate-distortion-classification approaches.
- Score: 43.98796887171374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The exponential family is well known in machine learning and statistical
physics as the maximum entropy distribution subject to a set of observed
constraints, while the geometric mixture path is common in MCMC methods such as
annealed importance sampling. Linking these two ideas, recent work has
interpreted the geometric mixture path as an exponential family of
distributions to analyze the thermodynamic variational objective (TVO).
We extend these likelihood ratio exponential families to include solutions to
rate-distortion (RD) optimization, the information bottleneck (IB) method, and
recent rate-distortion-classification approaches which combine RD and IB. This
provides a common mathematical framework for understanding these methods via
the conjugate duality of exponential families and hypothesis testing. Further,
we collect existing results to provide a variational representation of
intermediate RD or TVO distributions as minimizing an expectation of KL
divergences. This solution also corresponds to a size-power tradeoff using the
likelihood ratio test and the Neyman-Pearson lemma. In thermodynamic
integration bounds such as the TVO, we identify the intermediate distribution
whose expected sufficient statistics match the log partition function.
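As a minimal sketch of the construction described in the abstract (the notation $\pi_0$, $\pi_1$, $\beta$, $T(x)$ is introduced here for illustration, not quoted from the paper), the geometric mixture path between a base distribution $\pi_0$ and a target $\pi_1$ can be read as a one-parameter exponential family whose sufficient statistic is the log likelihood ratio:

$$
\pi_\beta(x) \;\propto\; \pi_0(x)^{1-\beta}\,\pi_1(x)^{\beta}
\;=\; \pi_0(x)\,\exp\!\big(\beta\,T(x)\big),
\qquad T(x) = \log\frac{\pi_1(x)}{\pi_0(x)},\quad \beta\in[0,1],
$$

with log partition function $\psi(\beta)=\log\int \pi_0(x)^{1-\beta}\pi_1(x)^{\beta}\,dx$ and moment parameter $\psi'(\beta)=\mathbb{E}_{\pi_\beta}[T(x)]$. Under the same notation, the variational representation mentioned above amounts to

$$
\pi_\beta \;=\; \arg\min_{r}\;\big\{(1-\beta)\,\mathrm{KL}(r\,\|\,\pi_0) \;+\; \beta\,\mathrm{KL}(r\,\|\,\pi_1)\big\},
$$

and thermodynamic integration gives $\psi(1)-\psi(0)=\int_0^1 \mathbb{E}_{\pi_\beta}[T]\,d\beta$, so by the mean value theorem some intermediate $\beta^{*}$ satisfies $\mathbb{E}_{\pi_{\beta^{*}}}[T]=\psi(1)-\psi(0)$, matching the log partition difference identified in the abstract.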
Related papers
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Analytical Approximation of the ELBO Gradient in the Context of the Clutter Problem [0.0]
We propose an analytical solution for approximating the gradient of the Evidence Lower Bound (ELBO) in variational inference problems.
The proposed method demonstrates good accuracy and rate of convergence together with linear computational complexity.
arXiv Detail & Related papers (2024-04-16T13:19:46Z) - Statistical Mechanics of Dynamical System Identification [3.1484174280822845]
We develop a statistical mechanical approach to analyze sparse equation discovery algorithms.
In this framework, statistical mechanics offers tools to analyze the interplay between complexity and fitness.
arXiv Detail & Related papers (2024-03-04T04:32:28Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results in both the infinite-sample and finite-sample regimes.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Fast approximations of the Jeffreys divergence between univariate
Gaussian mixture models via exponential polynomial densities [16.069404547401373]
The Jeffreys divergence is a renowned symmetrization of the statistical Kullback-Leibler divergence which is often used in machine learning, signal processing, and information sciences.
We propose a simple yet fast heuristic to approximate the Jeffreys divergence between two GMMs with an arbitrary number of components.
arXiv Detail & Related papers (2021-07-13T07:58:01Z) - q-Paths: Generalizing the Geometric Annealing Path using Power Means [51.73925445218366]
We introduce $q$-paths, a family of paths which includes the geometric and arithmetic mixtures as special cases (a sketch of this family appears after this list).
We show that small deviations away from the geometric path yield empirical gains for Bayesian inference.
arXiv Detail & Related papers (2021-07-01T21:09:06Z) - Recovery of Joint Probability Distribution from one-way marginals: Low
rank Tensors and Random Projections [2.9929093132587763]
Joint probability mass function (PMF) estimation is a fundamental machine learning problem.
In this work, we link random projections of data to the problem of PMF estimation using ideas from tomography.
We provide a novel algorithm for recovering factors of the tensor from one-way marginals, test it across a variety of synthetic and real-world datasets, and also perform MAP inference on the estimated model for classification.
arXiv Detail & Related papers (2021-03-22T14:00:57Z) - All in the Exponential Family: Bregman Duality in Thermodynamic
Variational Inference [42.05882835476882]
We propose an exponential family interpretation of the geometric mixture curve underlying the Thermodynamic Variational Objective (TVO).
We propose to choose intermediate distributions using equal spacing in the moment parameters of our exponential family, which matches grid search performance and allows the schedule to adaptively update over the course of training.
arXiv Detail & Related papers (2020-07-01T17:46:49Z) - A maximum-entropy approach to off-policy evaluation in average-reward
MDPs [54.967872716145656]
This work focuses on off-policy evaluation (OPE) with function approximation in infinite-horizon undiscounted Markov decision processes (MDPs).
We provide the first finite-sample OPE error bound, extending existing results beyond the episodic and discounted cases.
We show that this results in an exponential-family distribution whose sufficient statistics are the features, paralleling maximum-entropy approaches in supervised learning.
arXiv Detail & Related papers (2020-06-17T18:13:37Z)
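As noted in the $q$-Paths entry above, a brief sketch of the power-mean family (notation assumed for illustration, not quoted from that abstract):

$$
\tilde{\pi}^{(q)}_{\beta}(x) \;=\; \Big[(1-\beta)\,\pi_0(x)^{1-q} \;+\; \beta\,\pi_1(x)^{1-q}\Big]^{\frac{1}{1-q}},
$$

which reduces to the arithmetic mixture at $q=0$ and recovers the geometric path $\pi_0(x)^{1-\beta}\pi_1(x)^{\beta}$ in the limit $q\to 1$.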
This list is automatically generated from the titles and abstracts of the papers in this site.