Fast approximations of the Jeffreys divergence between univariate
Gaussian mixture models via exponential polynomial densities
- URL: http://arxiv.org/abs/2107.05901v1
- Date: Tue, 13 Jul 2021 07:58:01 GMT
- Title: Fast approximations of the Jeffreys divergence between univariate
Gaussian mixture models via exponential polynomial densities
- Authors: Frank Nielsen
- Abstract summary: The Jeffreys divergence is a renowned symmetrization of the statistical Kullback-Leibler divergence which is often used in machine learning, signal processing, and information sciences.
We propose a simple yet fast heuristic to approximate the Jeffreys divergence between two GMMs with an arbitrary number of components.
- Score: 16.069404547401373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Jeffreys divergence is a renowned symmetrization of the statistical
Kullback-Leibler divergence which is often used in machine learning, signal
processing, and information sciences. Since the Jeffreys divergence between the
ubiquitous Gaussian Mixture Models (GMMs) is not available in closed form, many
techniques with various pros and cons have been proposed in the literature to
either (i) estimate, (ii) approximate, or (iii) lower and upper bound this
divergence. In this work, we propose a simple yet fast heuristic to approximate
the Jeffreys divergence between two GMMs with an arbitrary number of components.
The heuristic relies on converting the GMMs into pairs of dually parameterized
probability densities belonging to exponential families. In particular, we
consider Polynomial Exponential Densities (PEDs), and design a goodness-of-fit
criterion to measure the dissimilarity between a GMM and a PED which is a
generalization of the Hyvärinen divergence. This criterion allows one to
select the orders of the PEDs used to approximate the GMMs. We demonstrate
experimentally that the computational time of our heuristic improves over the
stochastic Monte Carlo estimation baseline by several orders of magnitude while
approximating the Jeffreys divergence reasonably well, especially when the
univariate mixtures have a small number of modes.
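For reference, the quantities named in the abstract can be written out as follows; these are the textbook definitions of the Jeffreys divergence and of a polynomial exponential density of order $D$, not a statement of the paper's exact parameterization:

$$
J(p, q) = \mathrm{KL}(p : q) + \mathrm{KL}(q : p),
\qquad
\mathrm{KL}(p : q) = \int p(x) \log \frac{p(x)}{q(x)}\, \mathrm{d}x,
$$
$$
q_\theta(x) = \exp\!\Big(\sum_{i=1}^{D} \theta_i x^i - F(\theta)\Big),
$$

where $F(\theta)$ denotes the log-normalizer of the exponential family.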
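Below is a minimal sketch of the stochastic Monte Carlo estimation baseline mentioned in the abstract, assuming standard NumPy/SciPy tooling; the function names, mixture parameters, and sample size are illustrative assumptions rather than the paper's implementation:

```python
# Hypothetical Monte Carlo baseline for the Jeffreys divergence between
# two univariate Gaussian mixture models (GMMs).
import numpy as np
from scipy import stats

def gmm_pdf(x, weights, means, stds):
    # Density of a univariate Gaussian mixture evaluated at points x.
    x = np.asarray(x, dtype=float)
    return sum(w * stats.norm.pdf(x, loc=m, scale=s)
               for w, m, s in zip(weights, means, stds))

def gmm_sample(n, weights, means, stds, rng):
    # Draw n i.i.d. samples: pick a component, then sample from its Gaussian.
    ks = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(np.asarray(means)[ks], np.asarray(stds)[ks])

def mc_kl(p_pdf, q_pdf, p_sampler, n, rng):
    # Monte Carlo estimate of KL(p:q) = E_p[log p(X) - log q(X)].
    x = p_sampler(n, rng)
    return float(np.mean(np.log(p_pdf(x)) - np.log(q_pdf(x))))

def mc_jeffreys(p_params, q_params, n=100_000, seed=0):
    # J(p, q) = KL(p:q) + KL(q:p), each term estimated by sampling.
    rng = np.random.default_rng(seed)
    p = lambda x: gmm_pdf(x, *p_params)
    q = lambda x: gmm_pdf(x, *q_params)
    sample_p = lambda m, r: gmm_sample(m, *p_params, r)
    sample_q = lambda m, r: gmm_sample(m, *q_params, r)
    return mc_kl(p, q, sample_p, n, rng) + mc_kl(q, p, sample_q, n, rng)

# Illustrative bimodal mixtures: (weights, means, standard deviations).
p_params = ([0.5, 0.5], [-1.0, 2.0], [0.7, 1.2])
q_params = ([0.3, 0.7], [0.0, 3.0], [1.0, 0.8])
print("MC estimate of the Jeffreys divergence:", mc_jeffreys(p_params, q_params))
```

Each KL term is estimated by averaging log-density ratios over samples drawn from the corresponding mixture; the standard error of such an estimate decreases only as O(1/sqrt(n)), which is why sampling-based baselines are slow compared with closed-form or heuristic approximations.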
Related papers
- Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z)
- Robust scalable initialization for Bayesian variational inference with multi-modal Laplace approximations [0.0]
Variational mixtures with full-covariance structures suffer from a quadratic growth in the number of variational parameters with the number of model parameters.
We propose a method for constructing an initial Gaussian model approximation that can be used to warm-start variational inference.
arXiv Detail & Related papers (2023-07-12T19:30:04Z)
- Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts [40.24720443257405]
We provide a convergence analysis for maximum likelihood estimation (MLE) in the Gaussian-gated MoE model.
Our findings reveal that the MLE has distinct behaviors under two complementary settings of the location parameters of the Gaussian gating functions.
Notably, these behaviors can be characterized by the solvability of two different systems of equations.
arXiv Detail & Related papers (2023-05-12T16:02:19Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- On the Kullback-Leibler divergence between pairwise isotropic Gaussian-Markov random fields [93.35534658875731]
We derive expressions for the Kullback-Leibler divergence between two pairwise isotropic Gaussian-Markov random fields.
The proposed equation allows the development of novel similarity measures in image processing and machine learning applications.
arXiv Detail & Related papers (2022-03-24T16:37:24Z)
- Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods [73.35353358543507]
Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent algorithms for solving min-max optimization and variational inequality problems (VIPs).
In this paper, we propose a unified convergence analysis that covers a large variety of descent-ascent methods.
We develop several new variants of SGDA such as a new variance-reduced method (L-SVRGDA), new distributed methods with compression (QSGDA, DIANA-SGDA, VR-DIANA-SGDA), and a new method with coordinate randomization (SEGA-SGDA).
arXiv Detail & Related papers (2022-02-15T09:17:39Z)
- Nonparametric mixture MLEs under Gaussian-smoothed optimal transport distance [0.39373541926236766]
We adapt the GOT framework instead of its unsmoothed counterpart to approximate the true data generating distribution.
A key step in our analysis is the establishment of a new Jackson-type approximation bound of Gaussian-convoluted Lipschitz functions.
This insight bridges existing techniques of analyzing the nonparametric MLEs and the new GOT framework.
arXiv Detail & Related papers (2021-12-04T20:05:58Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator in high dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z)
- Likelihood Ratio Exponential Families [43.98796887171374]
We use the geometric mixture path as an exponential family of distributions to analyze the thermodynamic variational objective (TVO).
We extend these likelihood ratio exponential families to include solutions to rate-distortion (RD) optimization, the information bottleneck (IB) method, and recent rate-distortion-classification approaches.
arXiv Detail & Related papers (2020-12-31T07:13:58Z)
- A similarity-based Bayesian mixture-of-experts model [0.5156484100374058]
We present a new non-parametric mixture-of-experts model for multivariate regression problems.
Using a conditionally specified model, predictions for out-of-sample inputs are based on similarities to each observed data point.
Posterior inference is performed on the parameters of the mixture as well as the distance metric.
arXiv Detail & Related papers (2020-12-03T18:08:30Z)