Variational $f$-Divergence and Derangements for Discriminative Mutual
Information Estimation
- URL: http://arxiv.org/abs/2305.20025v1
- Date: Wed, 31 May 2023 16:54:25 GMT
- Title: Variational $f$-Divergence and Derangements for Discriminative Mutual
Information Estimation
- Authors: Nunzio A. Letizia, Nicola Novello, Andrea M. Tonello
- Abstract summary: We propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence.
Experiments on reference scenarios demonstrate that our approach outperforms state-of-the-art neural estimators both in terms of accuracy and complexity.
- Score: 4.114444605090134
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The accurate estimation of the mutual information is a crucial task in
various applications, including machine learning, communications, and biology,
since it enables the understanding of complex systems. High-dimensional data
render the task extremely challenging due to the amount of data to be processed
and the presence of convoluted patterns. Neural estimators based on variational
lower bounds of the mutual information have gained attention in recent years
but they are prone to either high bias or high variance as a consequence of the
partition function. We propose a novel class of discriminative mutual
information estimators based on the variational representation of the
$f$-divergence. We investigate the impact of the permutation function used to
obtain the marginal training samples and present a novel architectural solution
based on derangements. The proposed estimator is flexible as it exhibits an
excellent bias/variance trade-off. Experiments on reference scenarios
demonstrate that our approach outperforms state-of-the-art neural estimators
both in terms of accuracy and complexity.
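As a concrete illustration of the ingredients described above, here is a minimal PyTorch sketch (not the authors' released code): a network $T$ is trained on joint pairs $(x_i, y_i)$ and on marginal pairs $(x_i, y_{d(i)})$ obtained with a derangement $d$ (a permutation with no fixed points; a cyclic shift is the simplest), and the mutual information is read off a variational lower bound. The KL-based NWJ bound is used here for simplicity, whereas the paper studies a broader family of $f$-divergences; the names `Discriminator`, `derange`, and `mi_lower_bound` are illustrative.

```python
import math
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores a concatenated (x, y) pair with a small MLP."""
    def __init__(self, dim_x, dim_y, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_y, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

def derange(y):
    # Cyclic shift: sample i is re-paired with sample i+1 (mod N), so no
    # sample keeps its own partner -- a fixed-point-free permutation.
    return torch.roll(y, shifts=1, dims=0)

def mi_lower_bound(T, x, y):
    # NWJ variational bound for the KL divergence between the joint
    # distribution and the product of marginals:
    #   I(X;Y) >= E_{p(x,y)}[T] - e^{-1} E_{p(x)p(y)}[exp(T)]
    joint_term = T(x, y).mean()
    marginal_term = torch.exp(T(x, derange(y))).mean() / math.e
    return joint_term - marginal_term

# Toy usage: correlated Gaussians with known ground truth,
# I(X;Y) = -0.5 * dim * log(1 - rho^2) ~ 2.55 nats for rho = 0.8, dim = 5.
dim, rho = 5, 0.8
T = Discriminator(dim, dim)
opt = torch.optim.Adam(T.parameters(), lr=1e-4)
for step in range(5000):
    x = torch.randn(512, dim)
    y = rho * x + math.sqrt(1 - rho ** 2) * torch.randn(512, dim)
    loss = -mi_lower_bound(T, x, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"estimated MI: {-loss.item():.3f} nats")
```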
Related papers
- DiffusionCounterfactuals: Inferring High-dimensional Counterfactuals with Guidance of Causal Representations [18.973047393598346]
We propose a novel framework that incorporates causal mechanisms and diffusion models to generate high-quality counterfactual samples.
Our approach introduces a novel, theoretically grounded training and sampling process that enables the model to consistently generate accurate high-dimensional counterfactuals.
arXiv Detail & Related papers (2024-07-30T05:15:19Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
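The summary does not spell out the spectral method, but the following is a generic sketch of the standard eigengap heuristic behind such estimates (the kernel, k-NN construction, and function name are my assumptions, not the paper's exact procedure): build a nearest-neighbour affinity graph over the embeddings, form the normalized Laplacian, and take the position of the largest gap in its smallest eigenvalues as the class-count estimate.

```python
import numpy as np

def estimate_num_classes(Z, k=10, max_classes=20):
    # Pairwise squared distances and a Gaussian affinity with median bandwidth.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (np.median(d2) + 1e-12))
    # Keep each point's k nearest neighbours, then symmetrize.
    far = np.argsort(d2, axis=1)[:, k + 1:]
    np.put_along_axis(W, far, 0.0, axis=1)
    W = np.maximum(W, W.T)
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(1), 1e-12))
    L = np.eye(len(Z)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    evals = np.sort(np.linalg.eigvalsh(L))[:max_classes]
    # Eigengap heuristic: the number of near-zero eigenvalues before the
    # largest gap approximates the number of well-separated clusters.
    return int(np.argmax(np.diff(evals))) + 1
```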
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- REMEDI: Corrective Transformations for Improved Neural Entropy Estimation [0.7488108981865708]
We introduce $\texttt{REMEDI}$ for efficient and accurate estimation of differential entropy.
Our approach demonstrates improvement across a broad spectrum of estimation tasks.
It can be naturally extended to information theoretic supervised learning models.
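The summary does not detail the corrective transformations; for orientation, here is the standard identity such methods build on (my illustration, not REMEDI itself): for any density model $q$, $H(p) = -\mathbb{E}_p[\log q(X)] - D_{KL}(p\|q)$, so the cross-entropy under a simple base model upper-bounds the entropy, and a learned correction to the base tightens the bound.

```python
import numpy as np

def cross_entropy_upper_bound(x):
    """Upper-bound H(p) by the cross-entropy -E_p[log q(X)] under a
    Gaussian base q fit to the samples by moment matching."""
    mu, cov = x.mean(0), np.cov(x, rowvar=False)
    d = x.shape[1]
    _, logdet = np.linalg.slogdet(cov)
    inv = np.linalg.inv(cov)
    quad = np.einsum('ni,ij,nj->n', x - mu, inv, x - mu).mean()
    # -E_p[log q(X)] >= H(p), with equality iff p = q.
    return 0.5 * (d * np.log(2 * np.pi) + logdet + quad)

rng = np.random.default_rng(0)
samples = rng.normal(size=(10000, 3)) @ np.diag([1.0, 2.0, 0.5])
print(cross_entropy_upper_bound(samples))  # tight here, since p is Gaussian
```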
arXiv Detail & Related papers (2024-02-08T14:47:37Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning over structured spaces with a simple application of classical results from causal inference yields an effective and practical solution.
We demonstrate that, under mild assumptions, our model can handle more than one nuisance variable and enables the analysis of pooled scientific datasets in scenarios that would otherwise require discarding a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, from which we deduce a theoretical guarantee that the causality-inspired learning achieves reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
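A hedged sketch of the general recipe (architecture, names, and the choice of the Donsker-Varadhan bound are my stand-ins, not the CSAD implementation): a critic maximizes a neural MI lower bound between the encoder's features and a bias variable, while the encoder minimizes that same bound alongside its task loss (omitted here).

```python
import math
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 16))
critic = nn.Sequential(nn.Linear(16 + 1, 64), nn.ReLU(), nn.Linear(64, 1))
opt_enc = torch.optim.Adam(enc.parameters(), lr=1e-4)
opt_cri = torch.optim.Adam(critic.parameters(), lr=1e-4)

def mi_lower_bound(z, b):
    # Donsker-Varadhan bound: E_joint[T] - log E_marginal[exp(T)].
    joint = critic(torch.cat([z, b], dim=-1)).mean()
    shuffled = b[torch.randperm(len(b))]
    scores = critic(torch.cat([z, shuffled], dim=-1)).squeeze(-1)
    return joint - (torch.logsumexp(scores, dim=0) - math.log(len(b)))

x = torch.randn(256, 10)
bias = x[:, :1].clone()  # toy bias variable, correlated with the input
for _ in range(200):
    # Critic step: ascend the MI bound (encoder frozen via detach).
    mi = mi_lower_bound(enc(x).detach(), bias)
    opt_cri.zero_grad()
    (-mi).backward()
    opt_cri.step()
    # Encoder step: descend the same bound (add the task loss in practice).
    mi = mi_lower_bound(enc(x), bias)
    opt_enc.zero_grad()
    mi.backward()
    opt_enc.step()
```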
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Information Theory Measures via Multidimensional Gaussianization [7.788961560607993]
Information theory is an outstanding framework to measure uncertainty, dependence and relevance in data and systems.
It has several desirable properties for real world applications.
However, obtaining information from multidimensional data is a challenging problem due to the curse of dimensionality.
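The paper's multidimensional (iterative) Gaussianization is more elaborate, but a minimal relative of the idea can be sketched as follows (my illustration): map each marginal to a standard normal through its empirical CDF, after which mutual information follows in closed form from the correlation matrix of the Gaussianized data.

```python
import numpy as np
from scipy.stats import norm

def gaussianize_marginals(X):
    # Rank transform each column to (0, 1), then apply the inverse normal CDF.
    ranks = X.argsort(0).argsort(0)
    return norm.ppf((ranks + 0.5) / X.shape[0])

def gaussian_copula_mi(X, Y):
    Z = gaussianize_marginals(np.hstack([X, Y]))
    C = np.corrcoef(Z, rowvar=False)
    dx = X.shape[1]
    # For a joint Gaussian: I(X;Y) = 0.5 (logdet Cxx + logdet Cyy - logdet C).
    ld = lambda M: np.linalg.slogdet(M)[1]
    return 0.5 * (ld(C[:dx, :dx]) + ld(C[dx:, dx:]) - ld(C))

rng = np.random.default_rng(0)
x = rng.normal(size=(5000, 2))
y = 0.8 * x + 0.6 * rng.normal(size=(5000, 2))
print(gaussian_copula_mi(x, y))  # true value: -log(1 - 0.8**2) ~ 1.02 nats
```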
arXiv Detail & Related papers (2020-10-08T07:22:16Z)
- DEMI: Discriminative Estimator of Mutual Information [5.248805627195347]
Estimating mutual information between continuous random variables is often intractable and challenging for high-dimensional data.
Recent progress has leveraged neural networks to optimize variational lower bounds on mutual information.
Our approach is based on training a classifier that provides the probability that a data sample pair is drawn from the joint distribution.
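A minimal sketch of this classifier idea (illustrative, not the DEMI code): train a binary classifier to separate joint pairs from shuffled pairs; its optimal logit equals the pointwise log density ratio log p(x,y) - log p(x)p(y), whose average over joint samples is the mutual information.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 5
clf = nn.Sequential(nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.Adam(clf.parameters(), lr=1e-4)

def train_step(x, y):
    joint = torch.cat([x, y], dim=-1)                         # label 1
    marginal = torch.cat([x, y[torch.randperm(len(y))]], -1)  # label 0
    logits = clf(torch.cat([joint, marginal])).squeeze(-1)
    labels = torch.cat([torch.ones(len(x)), torch.zeros(len(x))])
    loss = F.binary_cross_entropy_with_logits(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # At the optimum, logit = log p(x,y) - log p(x)p(y); its mean over
    # joint samples estimates I(X;Y).
    return clf(joint).squeeze(-1).mean().item()

for _ in range(3000):
    x = torch.randn(512, dim)
    y = 0.8 * x + 0.6 * torch.randn(512, dim)
    mi_estimate = train_step(x, y)
print(f"estimated MI: {mi_estimate:.3f} nats")
```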
arXiv Detail & Related papers (2020-10-05T04:19:27Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
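Feature averaging, in the sense used above, can be as simple as the following sketch (illustrative names; the claimed benefit under convex losses follows from Jensen's inequality, since the loss of the averaged feature is at most the average loss):

```python
import torch

def averaged_features(encoder, x, transforms):
    """Average an encoder's features over a list of input transforms.

    With a convex loss applied on top of the features, Jensen's inequality
    gives loss(mean feature) <= mean loss, matching the summary's claim.
    """
    return torch.stack([encoder(t(x)) for t in transforms]).mean(dim=0)

# Hypothetical usage, averaging over identity and horizontal flip:
# feats = averaged_features(model.backbone, images,
#                           [lambda v: v, lambda v: torch.flip(v, dims=[-1])])
```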
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
- Precise Tradeoffs in Adversarial Training for Linear Regression [55.764306209771405]
We provide a precise and comprehensive understanding of the role of adversarial training in the context of linear regression with Gaussian features.
We precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach.
Our theory for adversarial training algorithms also facilitates the rigorous study of how a variety of factors (size and quality of training data, model overparametrization, etc.) affect the tradeoff between these two competing accuracies.
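For orientation, the inner maximization in this setting has a well-known closed form: for l2-bounded feature perturbations, max over ||delta|| <= eps of (y - (x + delta)^T theta)^2 equals (|y - x^T theta| + eps ||theta||_2)^2, so adversarial training needs no inner loop. A minimal sketch using this closed form (data and hyperparameters are illustrative):

```python
import torch

n, d, eps = 512, 20, 0.5
X = torch.randn(n, d)
theta_star = torch.randn(d) / d ** 0.5
y = X @ theta_star + 0.1 * torch.randn(n)

# Small random init avoids the non-differentiability of ||theta|| at zero.
theta = (0.01 * torch.randn(d)).requires_grad_()
opt = torch.optim.SGD([theta], lr=0.05)
for _ in range(500):
    resid = y - X @ theta
    # Closed-form worst-case squared loss under ||delta||_2 <= eps.
    adv_loss = ((resid.abs() + eps * theta.norm()) ** 2).mean()
    opt.zero_grad()
    adv_loss.backward()
    opt.step()
print(f"robust training loss: {adv_loss.item():.4f}")
```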
arXiv Detail & Related papers (2020-02-24T19:01:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.