Neural Mutual Information Estimation with Vector Copulas
- URL: http://arxiv.org/abs/2510.20968v1
- Date: Thu, 23 Oct 2025 19:54:56 GMT
- Title: Neural Mutual Information Estimation with Vector Copulas
- Authors: Yanzhi Chen, Zijing Ou, Adrian Weller, Michael U. Gutmann
- Abstract summary: Estimating mutual information (MI) is a fundamental task in data science and machine learning. We propose a principled interpolation between these two extremes to achieve a better trade-off between complexity and capacity.
- Score: 42.48277336237606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating mutual information (MI) is a fundamental task in data science and machine learning. Existing estimators mainly rely on either highly flexible models (e.g., neural networks), which require large amounts of data, or overly simplified models (e.g., Gaussian copula), which fail to capture complex distributions. Drawing upon recent vector copula theory, we propose a principled interpolation between these two extremes to achieve a better trade-off between complexity and capacity. Experiments on state-of-the-art synthetic benchmarks and real-world data with diverse modalities demonstrate the advantages of the proposed estimator.
Related papers
- Information-theoretic Quantification of High-order Feature Effects in Classification Problems [0.19791587637442676]
We present an information-theoretic extension of the High-order interactions for Feature importance (Hi-Fi) method. Our framework decomposes feature contributions into unique, synergistic, and redundant components. Results indicate that the proposed estimator accurately recovers theoretical and expected findings.
arXiv Detail & Related papers (2025-07-06T11:50:30Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Surprisal Driven $k$-NN for Robust and Interpretable Nonparametric Learning [1.4293924404819704]
We shed new light on the traditional nearest neighbors algorithm from the perspective of information theory.
We propose a robust and interpretable framework for tasks such as classification, regression, density estimation, and anomaly detection using a single model.
Our work showcases the architecture's versatility by achieving state-of-the-art results in classification and anomaly detection.
arXiv Detail & Related papers (2023-11-17T00:35:38Z)
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- Mutual Information Estimation via $f$-Divergence and Data Derangements [6.43826005042477]
We propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence.
The proposed estimator is flexible since it exhibits an excellent bias/variance trade-off.
arXiv Detail & Related papers (2023-05-31T16:54:25Z)
- h-analysis and data-parallel physics-informed neural networks [0.7614628596146599]
We explore the data-parallel acceleration of machine learning schemes with a focus on physics-informed neural networks (PINNs).
We detail a novel protocol based on $h$-analysis and data-parallel acceleration through the Horovod training framework.
We show that the acceleration is straightforward to implement, does not compromise training, and proves to be highly efficient and controllable.
arXiv Detail & Related papers (2023-02-17T12:15:18Z)
- DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- An Information-Theoretic Framework for Supervised Learning [22.280001450122175]
We propose a novel information-theoretic framework with its own notions of regret and sample complexity.
We study the sample complexity of learning from data generated by deep neural networks with ReLU activation units.
We conclude by corroborating our theoretical results with experimental analysis of random single-hidden-layer neural networks.
arXiv Detail & Related papers (2022-03-01T05:58:28Z)
- Extrapolatable Relational Reasoning With Comparators in Low-Dimensional Manifolds [7.769102711230249]
We propose a neuroscience-inspired inductive-biased module that can be readily amalgamated with current neural network architectures.
We show that neural nets with this inductive bias achieve considerably better o.o.d. generalisation performance for a range of relational reasoning tasks.
arXiv Detail & Related papers (2020-06-15T19:09:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.