k-Sliced Mutual Information: A Quantitative Study of Scalability with
Dimension
- URL: http://arxiv.org/abs/2206.08526v1
- Date: Fri, 17 Jun 2022 03:19:55 GMT
- Title: k-Sliced Mutual Information: A Quantitative Study of Scalability with
Dimension
- Authors: Ziv Goldfeld, Kristjan Greenewald, Theshani Nuradha, Galen Reeves
- Abstract summary: We extend the original SMI definition to $k$-SMI, which considers projections to $k$-dimensional subspaces.
Using a new result on the continuity of differential entropy in the 2-Wasserstein metric, we derive sharp bounds on the error of Monte Carlo (MC)-based estimates of $k$-SMI.
We then combine the MC integrator with the neural estimation framework to provide an end-to-end $k$-SMI estimator.
- Score: 21.82863736290358
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sliced mutual information (SMI) is defined as an average of mutual
information (MI) terms between one-dimensional random projections of the random
variables. It serves as a surrogate measure of dependence to classic MI that
preserves many of its properties but is more scalable to high dimensions.
However, a quantitative characterization of how SMI itself and estimation rates
thereof depend on the ambient dimension, which is crucial to understanding
scalability, remains obscure. This work extends the original SMI definition
to $k$-SMI, which considers projections to $k$-dimensional subspaces, and
provides a multifaceted account on its dependence on dimension. Using a new
result on the continuity of differential entropy in the 2-Wasserstein metric,
we derive sharp bounds on the error of Monte Carlo (MC)-based estimates of
$k$-SMI, with explicit dependence on $k$ and the ambient dimension, revealing
their interplay with the number of samples. We then combine the MC integrator
with the neural estimation framework to provide an end-to-end $k$-SMI
estimator, for which optimal convergence rates are established. We also explore
asymptotics of the population $k$-SMI as dimension grows, providing Gaussian
approximation results with a residual that decays under appropriate moment
bounds. Our theory is validated with numerical experiments and is applied to
sliced InfoGAN, which altogether provide a comprehensive quantitative account
of the scalability question of $k$-SMI, including SMI as a special case when
$k=1$.
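The abstract describes estimating $k$-SMI by Monte Carlo: average the MI between $k$-dimensional random projections of the two variables. As a minimal sketch (not the paper's neural estimator), the idea can be illustrated for jointly Gaussian $(X, Y)$, where the MI of each projected pair has a closed form via log-determinants of the projected covariance blocks. The function names and the choice of a Gaussian test case are illustrative assumptions, not part of the paper.

```python
import numpy as np

def random_frame(d, k, rng):
    # Uniformly random k-frame in R^d (orthonormal columns) via QR of a Gaussian matrix.
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q  # shape (d, k)

def gaussian_mi(cov, k):
    # MI between the first k and last k coordinates of a jointly Gaussian vector:
    # I = 0.5 * (log det S_uu + log det S_vv - log det S).
    ld_uu = np.linalg.slogdet(cov[:k, :k])[1]
    ld_vv = np.linalg.slogdet(cov[k:, k:])[1]
    ld = np.linalg.slogdet(cov)[1]
    return 0.5 * (ld_uu + ld_vv - ld)

def mc_k_smi(sigma, dx, dy, k, n_proj, seed=0):
    # Monte Carlo estimate of k-SMI for jointly Gaussian (X, Y) with joint
    # covariance sigma (block structure: X first dx coords, Y last dy coords).
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(n_proj):
        a = random_frame(dx, k, rng)  # projection for X
        b = random_frame(dy, k, rng)  # projection for Y
        proj = np.block([
            [a.T @ sigma[:dx, :dx] @ a, a.T @ sigma[:dx, dx:] @ b],
            [b.T @ sigma[dx:, :dx] @ a, b.T @ sigma[dx:, dx:] @ b],
        ])
        vals.append(gaussian_mi(proj, k))
    return float(np.mean(vals))
```

For example, with $\Sigma = \begin{pmatrix} I_d & \rho I_d \\ \rho I_d & I_d \end{pmatrix}$, the estimate is nonnegative and bounded above by the full MI $-\tfrac{d}{2}\log(1-\rho^2)$, consistent with projections only discarding dependence. The paper's sharp error bounds quantify how the MC error scales with $k$, the ambient dimension, and the number of projections.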
Related papers
- Anticoncentration and state design of random tensor networks [0.0]
We investigate quantum random tensor network states where the bond dimensions scale with the system size, $N$.
For bond dimensions $\chi \sim \gamma N$, we determine the leading order of the associated overlaps probability distribution and demonstrate its convergence to the Porter-Thomas distribution.
We extend this analysis to two-dimensional systems using random Projected Entangled Pair States (PEPS).
arXiv Detail & Related papers (2024-09-19T18:00:28Z) - Approximating mutual information of high-dimensional variables using learned representations [1.4218223473363274]
Mutual information (MI) is a general measure of statistical dependence with widespread application across the sciences.
Existing techniques can reliably estimate MI in up to tens of dimensions, but fail in higher dimensions, where sufficient sample sizes are infeasible.
We develop latent MI (LMI) approximation, which applies a nonparametric MI estimator to low-dimensional representations learned by a simple, theoretically motivated model architecture.
arXiv Detail & Related papers (2024-09-03T16:36:42Z) - Max-Sliced Mutual Information [17.667315953598788]
Quantifying the dependence between high-dimensional random variables is central to statistical learning and inference.
Two classical methods are canonical correlation analysis (CCA), which identifies maximally correlated projected versions of the original variables, and Shannon's mutual information, which is a universal dependence measure.
This work proposes a middle ground in the form of a scalable information-theoretic generalization of CCA, termed max-sliced mutual information (mSMI).
arXiv Detail & Related papers (2023-09-28T06:49:25Z) - DCID: Deep Canonical Information Decomposition [84.59396326810085]
We consider the problem of identifying the signal shared between two one-dimensional target variables.
We propose ICM, an evaluation metric which can be used in the presence of ground-truth labels.
We also propose Deep Canonical Information Decomposition (DCID) - a simple, yet effective approach for learning the shared variables.
arXiv Detail & Related papers (2023-06-27T16:59:06Z) - Machine learning for structure-property relationships: Scalability and
limitations [3.664479980617018]
We present a scalable machine learning (ML) framework for predicting intensive properties and particularly classifying phases of many-body systems.
Based on a locality assumption, an ML model is developed for the prediction of intensive properties of a finite-size block.
We show that the applicability of this approach depends on whether the block-size of the ML model is greater than the characteristic length scale of the system.
arXiv Detail & Related papers (2023-04-11T21:17:28Z) - Mutual Wasserstein Discrepancy Minimization for Sequential
Recommendation [82.0801585843835]
We propose a novel self-supervised learning framework, based on Mutual Wasserstein discrepancy minimization (MStein), for sequential recommendation.
We also propose a novel contrastive learning loss based on Wasserstein Discrepancy Measurement.
arXiv Detail & Related papers (2023-01-28T13:38:48Z) - A robust estimator of mutual information for deep learning
interpretability [2.574652392763709]
We present GMM-MI, an algorithm that can be applied to both discrete and continuous settings.
We extensively validate GMM-MI on toy data for which the ground truth MI is known.
We then demonstrate the use of our MI estimator in the context of representation learning.
arXiv Detail & Related papers (2022-10-31T18:00:02Z) - Inverting brain grey matter models with likelihood-free inference: a
tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - LIFE: Learning Individual Features for Multivariate Time Series
Prediction with Missing Values [71.52335136040664]
We propose a Learning Individual Features (LIFE) framework, which provides a new paradigm for MTS prediction with missing values.
LIFE generates reliable features for prediction by using the correlated dimensions as auxiliary information and suppressing the interference from uncorrelated dimensions with missing values.
Experiments on three real-world data sets verify the superiority of LIFE to existing state-of-the-art models.
arXiv Detail & Related papers (2021-09-30T04:53:24Z) - Interpolation and Learning with Scale Dependent Kernels [91.41836461193488]
We study the learning properties of nonparametric ridge-less least squares.
We consider the common case of estimators defined by scale dependent kernels.
arXiv Detail & Related papers (2020-06-17T16:43:37Z) - Neural Methods for Point-wise Dependency Estimation [129.93860669802046]
We focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur.
We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) cross-modal retrieval task.
arXiv Detail & Related papers (2020-06-09T23:26:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.