The Representation Jensen-Shannon Divergence
- URL: http://arxiv.org/abs/2305.16446v3
- Date: Mon, 2 Oct 2023 20:48:05 GMT
- Title: The Representation Jensen-Shannon Divergence
- Authors: Jhoan K. Hoyos-Osorio, Santiago Posso-Murillo, Luis G. Sanchez-Giraldo
- Abstract summary: Statistical divergences quantify the difference between probability distributions.
In this work, we propose a divergence inspired by the Jensen-Shannon divergence which avoids the estimation of the probability density functions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Statistical divergences quantify the difference between probability
distributions and are therefore widely used in machine learning. However,
a fundamental challenge of these quantities is their estimation from empirical
samples since the underlying distributions of the data are usually unknown. In
this work, we propose a divergence inspired by the Jensen-Shannon divergence
which avoids estimating the probability density functions. Our approach
embeds the data in a reproducing kernel Hilbert space (RKHS), where we
associate data distributions with uncentered covariance operators in this
representation space. Accordingly, we name this measure the representation
Jensen-Shannon divergence (RJSD). We provide an estimator from empirical
covariance matrices by explicitly mapping the data to an RKHS using Fourier
features. This estimator is flexible, scalable, differentiable, and suitable
for minibatch-based optimization problems. Additionally, we provide an
estimator based on kernel matrices without an explicit mapping to the RKHS. We
provide consistency and convergence results for the proposed estimators. Moreover,
we demonstrate that this quantity is a lower bound on the Jensen-Shannon
divergence, leading to a variational approach to estimate it with theoretical
guarantees. We leverage the proposed divergence to train generative networks,
where our method mitigates mode collapse and encourages sample diversity.
Additionally, RJSD surpasses other state-of-the-art techniques in multiple
two-sample testing problems, demonstrating superior performance and reliability
in discriminating between distributions.
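Based on the abstract above, a minimal numpy sketch of a Fourier-feature RJSD estimator: samples are mapped with random Fourier features approximating a Gaussian kernel, uncentered covariance matrices are trace-normalized, and the divergence is the Jensen-Shannon gap of von Neumann entropies. Function names, the kernel choice, and all parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rff(x, W, b):
    """Random Fourier features approximating a Gaussian (RBF) kernel."""
    d = W.shape[1]
    return np.sqrt(2.0 / d) * np.cos(x @ W + b)

def von_neumann_entropy(C, eps=1e-12):
    """Entropy of a trace-normalized PSD matrix from its eigenvalues."""
    lam = np.clip(np.linalg.eigvalsh(C), eps, None)
    lam = lam / lam.sum()
    return -np.sum(lam * np.log(lam))

def rjsd(x, y, num_features=128, gamma=1.0, seed=0):
    """Sketch of a representation JS divergence between sample sets x and y."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(x.shape[1], num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    phix, phiy = rff(x, W, b), rff(y, W, b)
    Cx = phix.T @ phix / len(x)   # uncentered covariance in feature space
    Cy = phiy.T @ phiy / len(y)
    Cx /= np.trace(Cx)            # unit-trace "density matrices"
    Cy /= np.trace(Cy)
    return von_neumann_entropy((Cx + Cy) / 2) - 0.5 * (
        von_neumann_entropy(Cx) + von_neumann_entropy(Cy))
```

Identical sample sets give (numerically) zero, while well-separated ones give a clearly positive value, matching the divergence properties claimed in the abstract.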
Related papers
- Discriminative Estimation of Total Variation Distance: A Fidelity Auditor for Generative Data [10.678533056953784]
We propose a discriminative approach to estimate the total variation (TV) distance between two distributions.
Our method quantitatively characterizes the relation between the Bayes risk in classifying two distributions and their TV distance.
We demonstrate that, with a specific choice of hypothesis class in classification, a fast convergence rate in estimating the TV distance can be achieved.
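The stated relation between Bayes risk and TV can be illustrated directly: for balanced samples, the best achievable classification accuracy equals (1 + TV)/2, so TV = 2·accuracy − 1. A hedged sketch using a hypothetical histogram classifier (not the paper's hypothesis class):

```python
import numpy as np

def tv_from_classifier(p_samples, q_samples, bins=30, seed=0):
    """Estimate TV(P, Q) from the held-out accuracy of a classifier.

    For balanced classes the Bayes-optimal accuracy is (1 + TV)/2,
    so TV = 2 * accuracy - 1; a simple histogram classifier stands in
    for the hypothesis class studied in the paper.
    """
    rng = np.random.default_rng(seed)
    def split(s):
        s = rng.permutation(s)
        h = len(s) // 2
        return s[:h], s[h:]
    p_tr, p_te = split(p_samples)
    q_tr, q_te = split(q_samples)
    lo = min(p_tr.min(), q_tr.min())
    hi = max(p_tr.max(), q_tr.max())
    edges = np.linspace(lo, hi, bins + 1)
    hp, _ = np.histogram(p_tr, bins=edges)
    hq, _ = np.histogram(q_tr, bins=edges)
    predict_p = hp >= hq                  # label "P" where P has more mass
    def says_p(s):
        idx = np.clip(np.digitize(s, edges) - 1, 0, bins - 1)
        return predict_p[idx]
    correct = np.concatenate([says_p(p_te), ~says_p(q_te)])
    return 2.0 * correct.mean() - 1.0
```

For N(0,1) vs N(3,1) the true TV is 2Φ(1.5) − 1 ≈ 0.87, and the estimate lands in that neighborhood; for identical distributions it is near zero.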
arXiv Detail & Related papers (2024-05-24T08:18:09Z) - Synthetic Tabular Data Validation: A Divergence-Based Approach [8.062368743143388]
Divergences quantify discrepancies between data distributions.
Traditional approaches calculate divergences independently for each feature.
We propose a novel approach that uses divergence estimation to overcome the limitations of marginal comparisons.
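The limitation of marginal comparisons is easy to demonstrate: two distributions can share every marginal yet differ jointly. A small numpy illustration, where histogram-based TV is a stand-in for the paper's divergence estimator (an assumption):

```python
import numpy as np

def hist_tv(p, q, edges):
    """Total variation between two sample sets via shared histograms."""
    hp, _ = np.histogramdd(p, bins=edges)
    hq, _ = np.histogramdd(q, bins=edges)
    hp = hp / hp.sum()
    hq = hq / hq.sum()
    return 0.5 * np.abs(hp - hq).sum()

rng = np.random.default_rng(0)
n = 20000
cov = [[1.0, 0.9], [0.9, 1.0]]          # strongly correlated features
p = rng.multivariate_normal([0, 0], cov, size=n)
q = rng.normal(0.0, 1.0, size=(n, 2))   # same marginals, but independent
edges = [np.linspace(-4, 4, 21)] * 2
marginal_tv = max(hist_tv(p[:, [0]], q[:, [0]], edges[:1]),
                  hist_tv(p[:, [1]], q[:, [1]], edges[:1]))
joint_tv = hist_tv(p, q, edges)
```

Per-feature divergences stay near zero while the joint divergence is large, which is exactly the failure mode of feature-wise validation that the abstract describes.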
arXiv Detail & Related papers (2024-05-13T15:07:52Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
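For reference, the basic (non-collaborative) inverse propensity score estimator can be sketched as follows; the simulation and the propensity model are illustrative assumptions, not the paper's heterogeneous-data setup:

```python
import numpy as np

def ipw_ate(y, t, e):
    """Inverse-propensity-weighted estimate of the average treatment effect.

    y: outcomes, t: binary treatment indicators, e: propensity scores P(T=1|x).
    """
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

# illustrative simulation: treatment assignment is confounded with x
rng = np.random.default_rng(0)
n = 50000
x = rng.uniform(0.0, 1.0, n)
e = 0.25 + 0.5 * x                          # true propensity P(T=1 | x)
t = (rng.uniform(size=n) < e).astype(float)
y = 1.0 + 3.0 * x + 2.0 * t + rng.normal(0.0, 1.0, n)
ate = ipw_ate(y, t, e)                      # true effect is 2.0
```

Weighting by the inverse propensity removes the confounding bias that a naive treated-vs-control mean difference would carry.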
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian
Mixture Models [59.331993845831946]
Diffusion models benefit from incorporating task-specific information into the score function to steer sample generation toward desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
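A toy illustration of guidance in a Gaussian mixture: scores of 1-D mixtures are available in closed form, and guidance extrapolates from the unconditional score toward a conditional one. The weighting scheme below is a generic classifier-free-guidance form, an assumption rather than the paper's exact setup.

```python
import numpy as np

def gmm_score(x, means, sigma=1.0):
    """Score (gradient of log density) of an equal-weight 1-D Gaussian mixture."""
    x = np.atleast_1d(x)[:, None]
    means = np.asarray(means, dtype=float)[None, :]
    logw = -0.5 * (x - means) ** 2 / sigma**2
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)        # posterior over components
    return (w * (means - x)).sum(axis=1) / sigma**2

def guided_score(x, cond_means, all_means, w=2.0, sigma=1.0):
    """Classifier-free-style guidance: extrapolate from the unconditional
    score toward the conditional score with strength w."""
    return ((1 + w) * gmm_score(x, cond_means, sigma)
            - w * gmm_score(x, all_means, sigma))
```

At the symmetry point of a two-mode mixture the unconditional score vanishes, while the guided score points firmly toward the conditioned mode, which is the steering effect the theory analyzes.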
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate [49.97755400231656]
We establish convergence guarantees for substantially larger classes of distributions under DT diffusion processes.
We then specialize our results to a number of interesting classes of distributions with explicit parameter dependencies.
We propose a novel accelerated sampler and show that it improves the convergence rates of the corresponding regular sampler by orders of magnitude with respect to all system parameters.
arXiv Detail & Related papers (2024-02-21T16:11:47Z) - Uncertainty Quantification via Stable Distribution Propagation [60.065272548502]
We propose a new approach for propagating stable probability distributions through neural networks.
Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity.
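Local linearization can be sketched in a few lines for the Gaussian special case: an affine layer propagates moments exactly, and the ReLU is replaced by its Jacobian at the input mean (a 0/1 diagonal mask). This is a simplified stand-in for the stable-distribution setting of the paper.

```python
import numpy as np

def propagate_affine(mu, Sigma, W, b):
    """Exact Gaussian moment propagation through an affine layer."""
    return W @ mu + b, W @ Sigma @ W.T

def propagate_relu(mu, Sigma):
    """ReLU handled by local linearization at the input mean: the
    nonlinearity is replaced by its Jacobian there (a diagonal mask)."""
    J = np.diag((mu > 0).astype(float))
    return np.maximum(mu, 0.0), J @ Sigma @ J.T

# usage: a one-layer network, mean and covariance propagated end to end
W = np.array([[1.0, -1.0], [0.5, 2.0]])
mu1, S1 = propagate_affine(np.array([1.0, -1.0]), np.eye(2), W, np.zeros(2))
mu2, S2 = propagate_relu(mu1, S1)
```

Dimensions whose pre-activation mean is negative get zero mean and zero propagated variance, which is exactly what the ReLU Jacobian at the mean dictates.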
arXiv Detail & Related papers (2024-02-13T09:40:19Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
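The consensus-ADMM pattern underlying such schemes can be illustrated on a deterministic stand-in: each agent solves a local least-squares subproblem instead of the paper's sampling step (an illustrative simplification, not the proposed MCMC algorithm).

```python
import numpy as np

def consensus_admm_lstsq(As, bs, rho=1.0, iters=300):
    """Consensus ADMM for sum_i (1/2)||A_i x - b_i||^2 across agents."""
    d = As[0].shape[1]
    m = len(As)
    xs = np.zeros((m, d))
    us = np.zeros((m, d))
    z = np.zeros(d)
    for _ in range(iters):
        for i in range(m):
            # local update: argmin (1/2)||A_i x - b_i||^2 + (rho/2)||x - z + u_i||^2
            xs[i] = np.linalg.solve(As[i].T @ As[i] + rho * np.eye(d),
                                    As[i].T @ bs[i] + rho * (z - us[i]))
        z = (xs + us).mean(axis=0)      # consensus (averaging) step
        us += xs - z                    # dual updates
    return z
```

Splitting a regression problem across agents and running the loop recovers the centralized least-squares solution, which is the consensus behavior the distributed scheme relies on.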
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Distributed Bayesian Estimation in Sensor Networks: Consensus on
Marginal Densities [15.038649101409804]
We derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables.
We leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents.
This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest.
arXiv Detail & Related papers (2023-12-02T21:10:06Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - The Representation Jensen-Rényi Divergence [0.0]
We introduce a measure between data distributions based on operators in reproducing kernel Hilbert spaces defined by infinitely divisible kernels.
The proposed measure of divergence avoids the estimation of the probability distribution underlying the data.
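This operator-based route can also be sketched without an explicit feature map: the nonzero eigenvalues of a trace-normalized Gram matrix coincide with those of the empirical covariance operator in the RKHS. A hedged numpy illustration (kernel choice and all names are assumptions, not the authors' estimator):

```python
import numpy as np

def gram(x, y, gamma=0.5):
    """Gaussian-kernel Gram matrix between sample sets x and y."""
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_entropy(K, eps=1e-12):
    """Von Neumann entropy of the trace-normalized kernel matrix, whose
    nonzero eigenvalues match those of the RKHS covariance operator."""
    lam = np.clip(np.linalg.eigvalsh(K / np.trace(K)), eps, None)
    lam /= lam.sum()
    return -np.sum(lam * np.log(lam))

def kernel_jsd(x, y, gamma=0.5):
    """Divergence sketch: entropy of the pooled Gram matrix minus the
    average entropy of the per-sample Gram matrices."""
    z = np.vstack([x, y])
    return kernel_entropy(gram(z, z, gamma)) - 0.5 * (
        kernel_entropy(gram(x, x, gamma)) + kernel_entropy(gram(y, y, gamma)))
```

Because entropy is concave over these operators, the quantity is nonnegative and vanishes when the two sample sets coincide.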
arXiv Detail & Related papers (2021-12-02T19:51:52Z) - Non-Asymptotic Performance Guarantees for Neural Estimation of
$\mathsf{f}$-Divergences [22.496696555768846]
Statistical distances quantify the dissimilarity between probability distributions.
A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it.
This paper explores the resulting tradeoff by means of non-asymptotic error bounds, focusing on three popular statistical distances (SDs).
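A common variational form behind such neural estimators is the Donsker-Varadhan bound KL(P‖Q) ≥ E_P[T] − log E_Q[e^T]. Below, the learned critic T is replaced by the known optimal log density ratio for two Gaussians, so no network is trained (an illustrative assumption standing in for the NN parametrization):

```python
import numpy as np

def dv_lower_bound(T, xs_p, xs_q):
    """Donsker-Varadhan bound: KL(P||Q) >= E_P[T] - log E_Q[exp(T)]."""
    return T(xs_p).mean() - np.log(np.exp(T(xs_q)).mean())

rng = np.random.default_rng(0)
xp = rng.normal(0.0, 1.0, 100000)   # samples from P = N(0, 1)
xq = rng.normal(1.0, 1.0, 100000)   # samples from Q = N(1, 1)

# the optimal critic is the true log density ratio log p(x)/q(x) = 0.5 - x;
# in the neural approach this function would be learned by the network
T_opt = lambda x: 0.5 - x
kl_est = dv_lower_bound(T_opt, xp, xq)   # true KL(P||Q) = 0.5
```

With the optimal critic the bound is tight, so the sample estimate concentrates around the analytic KL of 0.5; a suboptimal critic would only lower it, which is why the bound is maximized during training.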
arXiv Detail & Related papers (2021-03-11T19:47:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.