DiME: Maximizing Mutual Information by a Difference of Matrix-Based
Entropies
- URL: http://arxiv.org/abs/2301.08164v3
- Date: Thu, 27 Jul 2023 18:26:45 GMT
- Title: DiME: Maximizing Mutual Information by a Difference of Matrix-Based
Entropies
- Authors: Oscar Skean, Jhoan Keider Hoyos Osorio, Austin J. Brockmeier, Luis
Gonzalo Sanchez Giraldo
- Abstract summary: We introduce an information-theoretic quantity with similar properties to mutual information that can be estimated from data.
We show that a difference of matrix-based entropies (DiME) is well suited for problems involving the maximization of mutual information between random variables.
We provide examples of use cases for DiME, such as latent factor disentanglement and a multiview representation learning problem.
- Score: 0.9053163124987534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce an information-theoretic quantity with similar properties to
mutual information that can be estimated from data without making explicit
assumptions on the underlying distribution. This quantity is based on a
recently proposed matrix-based entropy that uses the eigenvalues of a
normalized Gram matrix to compute an estimate of the eigenvalues of an
uncentered covariance operator in a reproducing kernel Hilbert space. We show
that a difference of matrix-based entropies (DiME) is well suited for problems
involving the maximization of mutual information between random variables.
While many methods for such tasks can lead to trivial solutions, DiME naturally
penalizes such outcomes. We compare DiME to several baseline estimators of
mutual information on a toy Gaussian dataset. We provide examples of use cases
for DiME, such as latent factor disentanglement and a multiview representation
learning problem where DiME is used to learn a shared representation among
views with high mutual information.
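The abstract describes computing a matrix-based entropy from the eigenvalues of a trace-normalized Gram matrix and taking a difference of such entropies as an MI-like quantity. The sketch below illustrates that recipe in NumPy; the RBF kernel, bandwidth, entropy order, and the particular entropy difference (joint Gram via a Hadamard product, analogous to I(X;Y) = H(X) + H(Y) - H(X,Y)) are illustrative assumptions, not the paper's exact DiME objective.

```python
import numpy as np

def rbf_gram(X, sigma=1.0):
    """RBF (Gaussian) kernel Gram matrix for the rows of X. Kernel choice
    and bandwidth are illustrative assumptions, not the paper's setup."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def matrix_entropy(K, alpha=1.01):
    """Matrix-based Renyi entropy of order alpha.

    K is normalized to unit trace; its eigenvalues estimate the spectrum of
    an uncentered covariance operator in the RKHS, as the abstract states.
    alpha close to 1 approximates the von Neumann (Shannon-like) entropy.
    """
    A = K / np.trace(K)
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]                    # drop numerical zeros
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 2))
Y_dep = X + 0.1 * rng.normal(size=(64, 2))    # strongly dependent view
Y_ind = rng.normal(size=(64, 2))              # independent view

Kx = rbf_gram(X)
mi_like = lambda Ky: (matrix_entropy(Kx) + matrix_entropy(Ky)
                      - matrix_entropy(Kx * Ky))  # Hadamard product ~ joint Gram

mi_dep = mi_like(rbf_gram(Y_dep))
mi_ind = mi_like(rbf_gram(Y_ind))
print(mi_dep, mi_ind)
```

Because the Hadamard product of PSD Gram matrices is again PSD (Schur product theorem), the joint entropy term is well defined; the dependent pair should yield a markedly larger entropy difference than the independent one.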
Related papers
- Discrete Bridges for Mutual Information Estimation [48.80678813569798]
We leverage the discrete state space formulation of bridge matching models to address the estimation of the mutual information between discrete random variables.
By neatly framing MI estimation as a domain transfer problem, we construct a Discrete Bridge Mutual Information (DBMI) estimator suitable for discrete data.
We showcase the performance of our estimator on two MI estimation settings: low-dimensional and image-based.
arXiv Detail & Related papers (2026-02-09T16:55:09Z)
- InfoBridge: Mutual Information estimation via Bridge Matching [64.11574776911542]
We show that by using the theory of diffusion bridges, one can construct an unbiased estimator for data posing difficulties for conventional MI estimators.
We showcase the performance of our estimator on a series of standard MI estimation benchmarks.
arXiv Detail & Related papers (2025-02-03T14:18:37Z)
- Weakly supervised covariance matrices alignment through Stiefel matrices estimation for MEG applications [64.20396555814513]
This paper introduces a novel domain adaptation technique for time series data, called Mixing model Stiefel Adaptation (MSA).
We exploit abundant unlabeled data in the target domain to ensure effective prediction by establishing pairwise correspondence with equivalent signal variances between domains.
MSA outperforms recent methods in brain-age regression with task variations using magnetoencephalography (MEG) signals from the Cam-CAN dataset.
arXiv Detail & Related papers (2024-01-24T19:04:49Z)
- DCID: Deep Canonical Information Decomposition [84.59396326810085]
We consider the problem of identifying the signal shared between two one-dimensional target variables.
We propose ICM, an evaluation metric which can be used in the presence of ground-truth labels.
We also propose Deep Canonical Information Decomposition (DCID) - a simple, yet effective approach for learning the shared variables.
arXiv Detail & Related papers (2023-06-27T16:59:06Z)
- Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z)
- Rank-one matrix estimation with groupwise heteroskedasticity [5.202966939338455]
We study the problem of estimating a rank-one matrix from Gaussian observations where different blocks of the matrix are observed under different noise levels.
We prove exact formulas for the minimum mean-squared error in estimating both the matrix and the latent variables.
We derive an approximate message passing algorithm and a gradient descent algorithm and show empirically that these algorithms achieve the information-theoretic limits in certain regimes.
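To make the rank-one estimation problem concrete, the sketch below recovers the latent factors of a noisy rank-one matrix with a simple spectral method (power iteration for the top singular pair). This is a standard baseline under a single, hypothetical noise level, not the paper's approximate message passing algorithm or its groupwise-noise setting.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# Ground-truth unit-norm latent factors (illustrative synthetic data)
u_true = rng.normal(size=n)
v_true = rng.normal(size=n)
u_true /= np.linalg.norm(u_true)
v_true /= np.linalg.norm(v_true)

# Observation: rank-one signal of strength 5 plus i.i.d. Gaussian noise
# (the paper instead observes different blocks under different noise levels)
M = 5.0 * np.outer(u_true, v_true) + 0.1 * rng.normal(size=(n, n))

# Power iteration for the leading singular pair of M
v = rng.normal(size=n)
v /= np.linalg.norm(v)
for _ in range(100):
    u = M @ v
    u /= np.linalg.norm(u)
    v = M.T @ u
    v /= np.linalg.norm(v)

# Alignment with the true left factor (sign of u is arbitrary, so take abs)
overlap = abs(u @ u_true)
print(overlap)
```

At this signal-to-noise ratio the estimated factor aligns almost perfectly with the truth; the information-theoretic limits discussed in the paper characterize when such recovery becomes impossible.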
arXiv Detail & Related papers (2021-06-22T17:48:36Z)
- Entropy Minimizing Matrix Factorization [102.26446204624885]
Nonnegative Matrix Factorization (NMF) is a widely-used data analysis technique, and has yielded impressive results in many real-world tasks.
In this study, an Entropy Minimizing Matrix Factorization framework (EMMF) is developed to tackle the sensitivity of standard NMF to outliers.
Considering that the outliers are usually much less than the normal samples, a new entropy loss function is established for matrix factorization.
arXiv Detail & Related papers (2021-03-24T21:08:43Z)
- One-shot Distributed Algorithm for Generalized Eigenvalue Problem [23.9525986377055]
Generalized eigenvalue problem (GEP) plays a vital role in a large family of high-dimensional statistical models.
Here we propose a general distributed framework with one-shot communication for GEP.
arXiv Detail & Related papers (2020-10-22T11:43:16Z)
- Information Theory Measures via Multidimensional Gaussianization [7.788961560607993]
Information theory is an outstanding framework to measure uncertainty, dependence and relevance in data and systems.
It has several desirable properties for real-world applications.
However, obtaining information from multidimensional data is a challenging problem due to the curse of dimensionality.
arXiv Detail & Related papers (2020-10-08T07:22:16Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Information-Theoretic Limits for the Matrix Tensor Product [8.206394018475708]
This paper studies a high-dimensional inference problem involving the matrix tensor product of random matrices.
On the technical side, this paper introduces some new techniques for the analysis of high-dimensional matrix-preserving signals.
arXiv Detail & Related papers (2020-05-22T17:03:48Z)
- Orthogonal Inductive Matrix Completion [25.03115399173275]
We propose an interpretable approach to matrix completion based on a sum of orthonormal side information terms.
We optimize the approach by a provably converging algorithm.
We analyse the performance of OMIC on several synthetic and real datasets.
arXiv Detail & Related papers (2020-04-03T16:21:23Z)
- Inverse Learning of Symmetries [71.62109774068064]
We learn the symmetry transformation with a model consisting of two latent subspaces.
Our approach is based on the deep information bottleneck in combination with a continuous mutual information regulariser.
Our model outperforms state-of-the-art methods on artificial and molecular datasets.
arXiv Detail & Related papers (2020-02-07T13:48:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.