The Representation Jensen-Shannon Divergence
- URL: http://arxiv.org/abs/2305.16446v3
- Date: Mon, 2 Oct 2023 20:48:05 GMT
- Title: The Representation Jensen-Shannon Divergence
- Authors: Jhoan K. Hoyos-Osorio, Santiago Posso-Murillo, Luis G. Sanchez-Giraldo
- Abstract summary: Statistical divergences quantify the difference between probability distributions.
In this work, we propose a divergence inspired by the Jensen-Shannon divergence which avoids the estimation of the probability density functions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Statistical divergences quantify the difference between probability
distributions and are therefore widely used in machine learning. However,
a fundamental challenge of these quantities is their estimation from empirical
samples since the underlying distributions of the data are usually unknown. In
this work, we propose a divergence inspired by the Jensen-Shannon divergence
which avoids estimating the probability density functions. Our approach
embeds the data in a reproducing kernel Hilbert space (RKHS), where we
associate data distributions with uncentered covariance operators in this
representation space. Accordingly, we name this measure the representation
Jensen-Shannon divergence (RJSD). We provide an estimator from empirical
covariance matrices by explicitly mapping the data to an RKHS using Fourier
features. This estimator is flexible, scalable, differentiable, and suitable
for minibatch-based optimization problems. Additionally, we provide an
estimator based on kernel matrices without an explicit mapping to the RKHS. We
provide consistency and convergence results for the proposed estimators. Moreover,
we demonstrate that this quantity is a lower bound on the Jensen-Shannon
divergence, leading to a variational approach to estimate it with theoretical
guarantees. We leverage the proposed divergence to train generative networks,
where our method mitigates mode collapse and encourages sample diversity.
Additionally, RJSD surpasses other state-of-the-art techniques in multiple
two-sample testing problems, demonstrating superior performance and reliability
in discriminating between distributions.
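Based on the abstract above, a minimal numpy sketch of a Fourier-feature RJSD estimator: samples are mapped with random Fourier features approximating a Gaussian kernel, uncentered covariance matrices are trace-normalized, and the divergence is the Jensen-Shannon gap of von Neumann entropies. Function names, the kernel choice, and all parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rff(x, W, b):
    """Random Fourier features approximating a Gaussian (RBF) kernel."""
    d = W.shape[1]
    return np.sqrt(2.0 / d) * np.cos(x @ W + b)

def von_neumann_entropy(C, eps=1e-12):
    """Entropy of a trace-normalized PSD matrix from its eigenvalues."""
    lam = np.clip(np.linalg.eigvalsh(C), eps, None)
    lam = lam / lam.sum()
    return -np.sum(lam * np.log(lam))

def rjsd(x, y, num_features=128, gamma=1.0, seed=0):
    """Sketch of a representation JS divergence between sample sets x and y."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(x.shape[1], num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    phix, phiy = rff(x, W, b), rff(y, W, b)
    Cx = phix.T @ phix / len(x)   # uncentered covariance in feature space
    Cy = phiy.T @ phiy / len(y)
    Cx /= np.trace(Cx)            # unit-trace "density matrices"
    Cy /= np.trace(Cy)
    return von_neumann_entropy((Cx + Cy) / 2) - 0.5 * (
        von_neumann_entropy(Cx) + von_neumann_entropy(Cy))
```

Identical sample sets give (numerically) zero, while well-separated ones give a clearly positive value, matching the divergence properties claimed in the abstract.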
Related papers
- Discriminative Estimation of Total Variation Distance: A Fidelity Auditor for Generative Data [10.678533056953784]
We propose a discriminative approach to estimate the total variation (TV) distance between two distributions.
Our method quantitatively characterizes the relation between the Bayes risk in classifying two distributions and their TV distance.
We demonstrate that, with a specific choice of hypothesis class in classification, a fast convergence rate in estimating the TV distance can be achieved.
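The stated relation between Bayes risk and TV can be illustrated directly: for balanced samples, the best achievable classification accuracy equals (1 + TV)/2, so TV = 2·accuracy − 1. A hedged sketch using a hypothetical histogram classifier (not the paper's hypothesis class):

```python
import numpy as np

def tv_from_classifier(p_samples, q_samples, bins=30, seed=0):
    """Estimate TV(P, Q) from the held-out accuracy of a classifier.

    For balanced classes the Bayes-optimal accuracy is (1 + TV)/2,
    so TV = 2 * accuracy - 1; a simple histogram classifier stands in
    for the hypothesis class studied in the paper.
    """
    rng = np.random.default_rng(seed)
    def split(s):
        s = rng.permutation(s)
        h = len(s) // 2
        return s[:h], s[h:]
    p_tr, p_te = split(p_samples)
    q_tr, q_te = split(q_samples)
    lo = min(p_tr.min(), q_tr.min())
    hi = max(p_tr.max(), q_tr.max())
    edges = np.linspace(lo, hi, bins + 1)
    hp, _ = np.histogram(p_tr, bins=edges)
    hq, _ = np.histogram(q_tr, bins=edges)
    predict_p = hp >= hq                  # label "P" where P has more mass
    def says_p(s):
        idx = np.clip(np.digitize(s, edges) - 1, 0, bins - 1)
        return predict_p[idx]
    correct = np.concatenate([says_p(p_te), ~says_p(q_te)])
    return 2.0 * correct.mean() - 1.0
```

For N(0,1) vs N(3,1) the true TV is 2Φ(1.5) − 1 ≈ 0.87, and the estimate lands in that neighborhood; for identical distributions it is near zero.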
arXiv Detail & Related papers (2024-05-24T08:18:09Z) - Synthetic Tabular Data Validation: A Divergence-Based Approach [8.062368743143388]
Divergences quantify discrepancies between data distributions.
Traditional approaches calculate divergences independently for each feature.
We propose a novel approach that uses divergence estimation to overcome the limitations of marginal comparisons.
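The limitation of marginal comparisons is easy to demonstrate: two distributions can share every marginal yet differ jointly. A small numpy illustration, where histogram-based TV is a stand-in for the paper's divergence estimator (an assumption):

```python
import numpy as np

def hist_tv(p, q, edges):
    """Total variation between two sample sets via shared histograms."""
    hp, _ = np.histogramdd(p, bins=edges)
    hq, _ = np.histogramdd(q, bins=edges)
    hp = hp / hp.sum()
    hq = hq / hq.sum()
    return 0.5 * np.abs(hp - hq).sum()

rng = np.random.default_rng(0)
n = 20000
cov = [[1.0, 0.9], [0.9, 1.0]]          # strongly correlated features
p = rng.multivariate_normal([0, 0], cov, size=n)
q = rng.normal(0.0, 1.0, size=(n, 2))   # same marginals, but independent
edges = [np.linspace(-4, 4, 21)] * 2
marginal_tv = max(hist_tv(p[:, [0]], q[:, [0]], edges[:1]),
                  hist_tv(p[:, [1]], q[:, [1]], edges[:1]))
joint_tv = hist_tv(p, q, edges)
```

Per-feature divergences stay near zero while the joint divergence is large, which is exactly the failure mode of feature-wise validation that the abstract describes.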
arXiv Detail & Related papers (2024-05-13T15:07:52Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
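For reference, the basic (non-collaborative) inverse propensity score estimator can be sketched as follows; the simulation and the propensity model are illustrative assumptions, not the paper's heterogeneous-data setup:

```python
import numpy as np

def ipw_ate(y, t, e):
    """Inverse-propensity-weighted estimate of the average treatment effect.

    y: outcomes, t: binary treatment indicators, e: propensity scores P(T=1|x).
    """
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

# illustrative simulation: treatment assignment is confounded with x
rng = np.random.default_rng(0)
n = 50000
x = rng.uniform(0.0, 1.0, n)
e = 0.25 + 0.5 * x                          # true propensity P(T=1 | x)
t = (rng.uniform(size=n) < e).astype(float)
y = 1.0 + 3.0 * x + 2.0 * t + rng.normal(0.0, 1.0, n)
ate = ipw_ate(y, t, e)                      # true effect is 2.0
```

Weighting by the inverse propensity removes the confounding bias that a naive treated-vs-control mean difference would carry.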
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian
Mixture Models [59.331993845831946]
Diffusion models benefit from incorporating task-specific information into the score function to steer sample generation toward desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
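A toy illustration of guidance in a Gaussian mixture: scores of 1-D mixtures are available in closed form, and guidance extrapolates from the unconditional score toward a conditional one. The weighting scheme below is a generic classifier-free-guidance form, an assumption rather than the paper's exact setup.

```python
import numpy as np

def gmm_score(x, means, sigma=1.0):
    """Score (gradient of log density) of an equal-weight 1-D Gaussian mixture."""
    x = np.atleast_1d(x)[:, None]
    means = np.asarray(means, dtype=float)[None, :]
    logw = -0.5 * (x - means) ** 2 / sigma**2
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)        # posterior over components
    return (w * (means - x)).sum(axis=1) / sigma**2

def guided_score(x, cond_means, all_means, w=2.0, sigma=1.0):
    """Classifier-free-style guidance: extrapolate from the unconditional
    score toward the conditional score with strength w."""
    return ((1 + w) * gmm_score(x, cond_means, sigma)
            - w * gmm_score(x, all_means, sigma))
```

At the symmetry point of a two-mode mixture the unconditional score vanishes, while the guided score points firmly toward the conditioned mode, which is the steering effect the theory analyzes.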
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate [49.97755400231656]
We establish convergence guarantees for substantially larger classes of distributions under DT diffusion processes.
We then specialize our results to a number of interesting classes of distributions with explicit parameter dependencies.
We propose a novel accelerated sampler and show that it improves the convergence rates of the corresponding regular sampler by orders of magnitude with respect to all system parameters.
arXiv Detail & Related papers (2024-02-21T16:11:47Z) - Uncertainty Quantification via Stable Distribution Propagation [60.065272548502]
We propose a new approach for propagating stable probability distributions through neural networks.
Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity.
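Local linearization can be sketched in a few lines for the Gaussian special case: an affine layer propagates moments exactly, and the ReLU is replaced by its Jacobian at the input mean (a 0/1 diagonal mask). This is a simplified stand-in for the stable-distribution setting of the paper.

```python
import numpy as np

def propagate_affine(mu, Sigma, W, b):
    """Exact Gaussian moment propagation through an affine layer."""
    return W @ mu + b, W @ Sigma @ W.T

def propagate_relu(mu, Sigma):
    """ReLU handled by local linearization at the input mean: the
    nonlinearity is replaced by its Jacobian there (a diagonal mask)."""
    J = np.diag((mu > 0).astype(float))
    return np.maximum(mu, 0.0), J @ Sigma @ J.T

# usage: a one-layer network, mean and covariance propagated end to end
W = np.array([[1.0, -1.0], [0.5, 2.0]])
mu1, S1 = propagate_affine(np.array([1.0, -1.0]), np.eye(2), W, np.zeros(2))
mu2, S2 = propagate_relu(mu1, S1)
```

Dimensions whose pre-activation mean is negative get zero mean and zero propagated variance, which is exactly what the ReLU Jacobian at the mean dictates.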
arXiv Detail & Related papers (2024-02-13T09:40:19Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
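The consensus-ADMM pattern underlying such schemes can be illustrated on a deterministic stand-in: each agent solves a local least-squares subproblem instead of the paper's sampling step (an illustrative simplification, not the proposed MCMC algorithm).

```python
import numpy as np

def consensus_admm_lstsq(As, bs, rho=1.0, iters=300):
    """Consensus ADMM for sum_i (1/2)||A_i x - b_i||^2 across agents."""
    d = As[0].shape[1]
    m = len(As)
    xs = np.zeros((m, d))
    us = np.zeros((m, d))
    z = np.zeros(d)
    for _ in range(iters):
        for i in range(m):
            # local update: argmin (1/2)||A_i x - b_i||^2 + (rho/2)||x - z + u_i||^2
            xs[i] = np.linalg.solve(As[i].T @ As[i] + rho * np.eye(d),
                                    As[i].T @ bs[i] + rho * (z - us[i]))
        z = (xs + us).mean(axis=0)      # consensus (averaging) step
        us += xs - z                    # dual updates
    return z
```

Splitting a regression problem across agents and running the loop recovers the centralized least-squares solution, which is the consensus behavior the distributed scheme relies on.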
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Distributed Bayesian Estimation in Sensor Networks: Consensus on
Marginal Densities [15.038649101409804]
We derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables.
We leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents.
This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest.
arXiv Detail & Related papers (2023-12-02T21:10:06Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - The Representation Jensen-Rényi Divergence [0.0]
We introduce a measure between data distributions based on operators in reproducing kernel Hilbert spaces defined by infinitely divisible kernels.
The proposed measure of divergence avoids the estimation of the probability distribution underlying the data.
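This operator-based route can also be sketched without an explicit feature map: the nonzero eigenvalues of a trace-normalized Gram matrix coincide with those of the empirical covariance operator in the RKHS. A hedged numpy illustration (kernel choice and all names are assumptions, not the authors' estimator):

```python
import numpy as np

def gram(x, y, gamma=0.5):
    """Gaussian-kernel Gram matrix between sample sets x and y."""
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_entropy(K, eps=1e-12):
    """Von Neumann entropy of the trace-normalized kernel matrix, whose
    nonzero eigenvalues match those of the RKHS covariance operator."""
    lam = np.clip(np.linalg.eigvalsh(K / np.trace(K)), eps, None)
    lam /= lam.sum()
    return -np.sum(lam * np.log(lam))

def kernel_jsd(x, y, gamma=0.5):
    """Divergence sketch: entropy of the pooled Gram matrix minus the
    average entropy of the per-sample Gram matrices."""
    z = np.vstack([x, y])
    return kernel_entropy(gram(z, z, gamma)) - 0.5 * (
        kernel_entropy(gram(x, x, gamma)) + kernel_entropy(gram(y, y, gamma)))
```

Because entropy is concave over these operators, the quantity is nonnegative and vanishes when the two sample sets coincide.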
arXiv Detail & Related papers (2021-12-02T19:51:52Z) - Non-Asymptotic Performance Guarantees for Neural Estimation of
$\mathsf{f}$-Divergences [22.496696555768846]
Statistical distances quantify the dissimilarity between probability distributions.
A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it.
This paper explores the resulting tradeoff by means of non-asymptotic error bounds, focusing on three popular statistical distances (SDs).
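A common variational form behind such neural estimators is the Donsker-Varadhan bound KL(P‖Q) ≥ E_P[T] − log E_Q[e^T]. Below, the learned critic T is replaced by the known optimal log density ratio for two Gaussians, so no network is trained (an illustrative assumption standing in for the NN parametrization):

```python
import numpy as np

def dv_lower_bound(T, xs_p, xs_q):
    """Donsker-Varadhan bound: KL(P||Q) >= E_P[T] - log E_Q[exp(T)]."""
    return T(xs_p).mean() - np.log(np.exp(T(xs_q)).mean())

rng = np.random.default_rng(0)
xp = rng.normal(0.0, 1.0, 100000)   # samples from P = N(0, 1)
xq = rng.normal(1.0, 1.0, 100000)   # samples from Q = N(1, 1)

# the optimal critic is the true log density ratio log p(x)/q(x) = 0.5 - x;
# in the neural approach this function would be learned by the network
T_opt = lambda x: 0.5 - x
kl_est = dv_lower_bound(T_opt, xp, xq)   # true KL(P||Q) = 0.5
```

With the optimal critic the bound is tight, so the sample estimate concentrates around the analytic KL of 0.5; a suboptimal critic would only lower it, which is why the bound is maximized during training.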
arXiv Detail & Related papers (2021-03-11T19:47:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.