Max-Sliced Mutual Information
- URL: http://arxiv.org/abs/2309.16200v1
- Date: Thu, 28 Sep 2023 06:49:25 GMT
- Title: Max-Sliced Mutual Information
- Authors: Dor Tsur, Ziv Goldfeld and Kristjan Greenewald
- Abstract summary: Quantifying the dependence between high-dimensional random variables is central to statistical learning and inference.
Two classical methods are canonical correlation analysis (CCA), which identifies maximally correlated projected versions of the original variables, and Shannon's mutual information, which is a universal dependence measure.
This work proposes a middle ground in the form of a scalable information-theoretic generalization of CCA, termed max-sliced mutual information (mSMI).
- Score: 17.667315953598788
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quantifying the dependence between high-dimensional random variables is
central to statistical learning and inference. Two classical methods are
canonical correlation analysis (CCA), which identifies maximally correlated
projected versions of the original variables, and Shannon's mutual information,
which is a universal dependence measure that also captures high-order
dependencies. However, CCA only accounts for linear dependence, which may be
insufficient for certain applications, while mutual information is often
infeasible to compute/estimate in high dimensions. This work proposes a middle
ground in the form of a scalable information-theoretic generalization of CCA,
termed max-sliced mutual information (mSMI). mSMI equals the maximal mutual
information between low-dimensional projections of the high-dimensional
variables, which reduces back to CCA in the Gaussian case. It enjoys the best
of both worlds: capturing intricate dependencies in the data while being
amenable to fast computation and scalable estimation from samples. We show that
mSMI retains favorable structural properties of Shannon's mutual information,
like variational forms and identification of independence. We then study
statistical estimation of mSMI, propose an efficiently computable neural
estimator, and couple it with formal non-asymptotic error bounds. We present
experiments that demonstrate the utility of mSMI for several tasks,
encompassing independence testing, multi-view representation learning,
algorithmic fairness, and generative modeling. We observe that mSMI
consistently outperforms competing methods with little-to-no computational
overhead.
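The abstract's recipe for the neural mSMI estimator is to search jointly over low-dimensional projections of $X$ and $Y$ and a variational lower bound on the mutual information between the projected pair, i.e. roughly $\mathrm{mSMI}(X;Y)=\sup_{A,B} I(A^\top X; B^\top Y)$ (up to constraints on the projections not spelled out in this summary). The snippet below is a minimal sketch of that idea using a Donsker-Varadhan (MINE-style) bound; the class name, architecture, and unconstrained linear projections are illustrative assumptions rather than the authors' reference implementation (the paper may, for instance, restrict the projections to orthonormal columns and use a different variational form).

```python
# Minimal sketch (not the authors' code): jointly learn projections A, B and a
# critic, and maximize a Donsker-Varadhan lower bound on I(A^T X; B^T Y).
import math
import torch
import torch.nn as nn

class MaxSlicedMI(nn.Module):  # hypothetical name
    def __init__(self, dx, dy, k=1, hidden=128):
        super().__init__()
        self.A = nn.Linear(dx, k, bias=False)   # projection of X to k dims
        self.B = nn.Linear(dy, k, bias=False)   # projection of Y to k dims
        self.critic = nn.Sequential(            # critic on the projected pair
            nn.Linear(2 * k, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x, y):
        u, v = self.A(x), self.B(y)
        joint = self.critic(torch.cat([u, v], dim=-1)).mean()
        v_perm = v[torch.randperm(v.shape[0])]          # product of marginals
        scores = self.critic(torch.cat([u, v_perm], dim=-1)).squeeze(-1)
        log_mean_exp = torch.logsumexp(scores, dim=0) - math.log(scores.shape[0])
        return joint - log_mean_exp   # DV lower bound on I(A^T X; B^T Y)

# Usage sketch: ascend the bound over projections and critic together.
# model = MaxSlicedMI(dx=64, dy=64, k=1)
# opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# loss = -model(x_batch, y_batch); loss.backward(); opt.step()
```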
Related papers
- Differentiable Information Bottleneck for Deterministic Multi-view Clustering [9.723389925212567]
We propose a new differentiable information bottleneck (DIB) method, which provides a deterministic and analytical multi-view clustering (MVC) solution.
Specifically, we first propose to directly fit the mutual information of high-dimensional spaces by leveraging a normalized kernel Gram matrix.
Then, based on the new mutual information measurement, a deterministic multi-view neural network with analytical gradients is explicitly trained to parameterize the IB principle.
arXiv Detail & Related papers (2024-03-23T02:13:22Z)
- Trade-off Between Dependence and Complexity for Nonparametric Learning -- an Empirical Process Approach [10.27974860479791]
In many applications where the data exhibit temporal dependencies, the corresponding empirical processes are much less understood.
We present a general bound on the expected supremum of empirical processes under standard $\beta/\rho$-mixing assumptions.
We show that even under long-range dependence, it is possible to attain the same rates as in the i.i.d. setting.
arXiv Detail & Related papers (2024-01-17T05:08:37Z)
- DCID: Deep Canonical Information Decomposition [84.59396326810085]
We consider the problem of identifying the signal shared between two one-dimensional target variables.
We propose ICM, an evaluation metric which can be used in the presence of ground-truth labels.
We also propose Deep Canonical Information Decomposition (DCID) - a simple, yet effective approach for learning the shared variables.
arXiv Detail & Related papers (2023-06-27T16:59:06Z)
- Diffeomorphic Information Neural Estimation [2.566492438263125]
Mutual Information (MI) and Conditional Mutual Information (CMI) are multi-purpose tools from information theory.
We introduce DINE (Diffeomorphic Information Neural Estimator) - a novel approach for estimating CMI of continuous random variables.
We show that the variables of interest can be replaced with appropriate surrogates that follow simpler distributions.
arXiv Detail & Related papers (2022-11-20T03:03:56Z)
- k-Sliced Mutual Information: A Quantitative Study of Scalability with Dimension [21.82863736290358]
We extend the original SMI definition to $k$-SMI, which considers projections to $k$-dimensional subspaces.
Using a new result on the continuity of differential entropy in the 2-Wasserstein metric, we derive sharp bounds on the error of Monte Carlo (MC)-based estimates of $k$-SMI.
We then combine the MC integrator with the neural estimation framework to provide an end-to-end $k$-SMI estimator (a minimal Monte Carlo sketch appears after this list).
arXiv Detail & Related papers (2022-06-17T03:19:55Z)
- Coupled Support Tensor Machine Classification for Multimodal Neuroimaging Data [28.705764174771936]
A Coupled Support Tensor Machine (C-STM) is built upon the latent factors estimated from Advanced Coupled Matrix Tensor Factorization (ACMTF).
C-STM combines individual and shared latent factors with multiple kernels and estimates a maximal-margin classifier for coupled matrix-tensor data.
The classification risk of C-STM is shown to converge to the optimal Bayes risk, making it a statistically consistent rule.
arXiv Detail & Related papers (2022-01-19T16:13:09Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable Models (GLLVMs) generalize factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
- Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso [102.84915019938413]
Magnetoencephalography (MEG) and electroencephalography (EEG) are promising non-invasive techniques for imaging brain activity.
However, the problem of source localization, or source imaging, poses a high-dimensional statistical inference challenge.
We propose an ensemble of desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
- Neural Methods for Point-wise Dependency Estimation [129.93860669802046]
We focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur.
We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) cross-modal retrieval task.
arXiv Detail & Related papers (2020-06-09T23:26:15Z)
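As promised in the $k$-SMI entry above, here is a minimal Monte Carlo sketch of the "average MI over random $k$-dimensional projections" recipe; the Gaussian plug-in MI used for each slice is a simplifying stand-in for the neural estimator the paper actually couples with the MC integrator, and the function names are hypothetical.

```python
# Minimal sketch of Monte Carlo k-SMI: draw random k-dimensional frames,
# project both variables, estimate MI per slice, and average the estimates.
import numpy as np

def gaussian_mi(u, v):
    """Gaussian plug-in MI between projected samples u, v of shape (n, k)."""
    k = u.shape[1]
    joint_cov = np.cov(np.hstack([u, v]).T)
    ld_u = np.linalg.slogdet(joint_cov[:k, :k])[1]
    ld_v = np.linalg.slogdet(joint_cov[k:, k:])[1]
    ld_uv = np.linalg.slogdet(joint_cov)[1]
    return 0.5 * (ld_u + ld_v - ld_uv)

def mc_ksmi(x, y, k=2, num_slices=64, seed=0):
    """Average MI over num_slices random k-dimensional projections."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(num_slices):
        a, _ = np.linalg.qr(rng.standard_normal((x.shape[1], k)))  # k-frame for X
        b, _ = np.linalg.qr(rng.standard_normal((y.shape[1], k)))  # k-frame for Y
        vals.append(gaussian_mi(x @ a, y @ b))
    return float(np.mean(vals))

# Usage sketch on correlated Gaussian data:
# rng = np.random.default_rng(1)
# z = rng.standard_normal((5000, 8))
# x = z + 0.1 * rng.standard_normal((5000, 8))
# print(mc_ksmi(x, z, k=2))
```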