InfoBridge: Mutual Information estimation via Bridge Matching
- URL: http://arxiv.org/abs/2502.01383v2
- Date: Mon, 26 May 2025 15:35:24 GMT
- Title: InfoBridge: Mutual Information estimation via Bridge Matching
- Authors: Sergei Kholkin, Ivan Butakov, Evgeny Burnaev, Nikita Gushchin, Alexander Korotin
- Abstract summary: We show that by using the theory of diffusion bridges, one can construct an unbiased estimator for data posing difficulties for conventional MI estimators. We showcase the performance of our estimator on two standard MI estimation benchmarks, i.e., low-dimensional and image-based, and on real-world data.
- Score: 64.11574776911542
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion bridge models have recently become a powerful tool in the field of generative modeling. In this work, we leverage their power to address another important problem in machine learning and information theory, the estimation of the mutual information (MI) between two random variables. We show that by using the theory of diffusion bridges, one can construct an unbiased estimator for data posing difficulties for conventional MI estimators. We showcase the performance of our estimator on two standard MI estimation benchmarks, i.e., low-dimensional and image-based, and on real-world data, i.e., protein language model embeddings.
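For reference, the quantity being estimated is the standard mutual information; this textbook definition is added here as background and is not notation taken from the paper itself:

    I(X; Y) = \mathbb{E}_{p(x, y)}\left[ \log \frac{p(x, y)}{p(x)\, p(y)} \right] = D_{\mathrm{KL}}\big( p(x, y) \,\|\, p(x)\, p(y) \big)

Conventional estimators of this quantity are known to degrade on exactly the kind of data the abstract mentions (e.g., high-dimensional or image-like inputs), which is the regime the bridge-matching construction targets.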
Related papers
- A Neural Difference-of-Entropies Estimator for Mutual Information [2.3020018305241337]
We propose a novel mutual information estimator based on parametrizing conditional densities using normalizing flows.
This estimator leverages a block autoregressive structure to achieve improved bias-variance trade-offs on standard benchmark tasks.
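The decomposition referenced in the title is the standard difference-of-entropies identity, stated here as background:

    I(X; Y) = h(Y) - h(Y \mid X)

so fitting one flow-based model of the marginal density of Y and another of the conditional density of Y given X yields an MI estimate as the difference of two entropy estimates.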
arXiv Detail & Related papers (2025-02-18T17:48:25Z)
- Mutual Information Multinomial Estimation [53.58005108981247]
Estimating mutual information (MI) is a fundamental yet challenging task in data science and machine learning.
Our main discovery is that a preliminary estimate of the data distribution can dramatically help the estimation of mutual information.
Experiments on diverse tasks including non-Gaussian synthetic problems with known ground-truth and real-world applications demonstrate the advantages of our method.
arXiv Detail & Related papers (2024-08-18T06:27:30Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
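For background, when the n observations are conditionally independent given the parameters \theta, the tall-data posterior factorizes as

    p(\theta \mid x_{1:n}) \;\propto\; p(\theta) \prod_{i=1}^{n} p(x_i \mid \theta) \;\propto\; p(\theta)^{1-n} \prod_{i=1}^{n} p(\theta \mid x_i),

which is the identity that aggregation schemes of this kind typically exploit to combine single-observation posterior approximations into a joint one.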
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Mutual Information Estimation via Normalizing Flows [39.58317527488534]
We propose a novel approach to the problem of mutual information estimation.
The estimator maps the original data to a target distribution for which MI is easier to estimate.
We additionally explore target distributions with known closed-form expressions for MI.
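The approach rests on a standard property, stated here as background: mutual information is invariant under invertible transformations applied separately to each variable,

    I(f(X); g(Y)) = I(X; Y) \quad \text{for diffeomorphisms } f, g,

so pushing the data through normalizing flows toward a tractable target distribution leaves the MI unchanged while making it easier to evaluate.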
arXiv Detail & Related papers (2024-03-04T16:28:04Z)
- A Two-Scale Complexity Measure for Deep Learning Models [2.512406961007489]
We introduce a novel capacity measure, 2sED, for statistical models based on the effective dimension. The new quantity provably bounds the generalization error under mild assumptions on the model. Simulations on standard data sets and popular model architectures show that 2sED correlates well with the training error.
arXiv Detail & Related papers (2024-01-17T12:50:50Z)
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- Time Series Anomaly Detection using Diffusion-based Models [5.896413260185387]
Diffusion models have been recently used for anomaly detection in images.
We investigate whether they can also be leveraged for anomaly detection on multivariate time series.
Our models outperform the baselines on synthetic datasets and are competitive on real-world datasets.
arXiv Detail & Related papers (2023-11-02T17:58:09Z)
- Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models [77.83923746319498]
We propose a framework called Diff-Instruct to instruct the training of arbitrary generative models.
We show that Diff-Instruct results in state-of-the-art single-step diffusion-based models.
Experiments on refining GAN models show that Diff-Instruct can consistently improve the pre-trained generators of GAN models.
arXiv Detail & Related papers (2023-05-29T04:22:57Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
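As a rough illustration of the mechanism (a hypothetical sketch, not the paper's code), per-dimension scalar codebooks are one way to organize a latent space: each latent coordinate snaps to its nearest entry in a small, dimension-specific codebook. All names and sizes below are assumptions:

    # Hypothetical sketch of per-dimension latent quantization; the codebook
    # shapes and data are illustrative assumptions, not the paper's setup.
    import numpy as np

    rng = np.random.default_rng(0)
    n_dims, codes_per_dim = 4, 5

    # One small scalar codebook per latent dimension (learned in practice).
    codebooks = rng.normal(size=(n_dims, codes_per_dim))

    z = rng.normal(size=n_dims)                          # continuous encoder output
    idx = np.abs(codebooks - z[:, None]).argmin(axis=1)  # nearest code per dimension
    z_q = codebooks[np.arange(n_dims), idx]              # quantized latent code
    print(np.round(z, 2), "->", np.round(z_q, 2))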
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models [3.9052860539161918]
We propose a simple method for measuring the scale of a model's reliance on any identified spurious feature.
We assess the robustness towards a large set of known and newly found prediction biases for various pre-trained models and debiasing methods in Question Answering (QA).
We find that while existing debiasing methods can mitigate reliance on a chosen spurious feature, the OOD performance gains of these methods cannot be explained by mitigated reliance on biased features.
arXiv Detail & Related papers (2023-05-11T14:35:00Z)
- Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute and performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
- Fitting a Directional Microstructure Model to Diffusion-Relaxation MRI Data with Self-Supervised Machine Learning [2.8167227950959206]
Self-supervised machine learning is emerging as an attractive alternative to supervised learning.
In this paper, we demonstrate self-supervised machine learning model fitting for a directional microstructural model.
Our approach shows clear improvements in parameter estimation and computational time, compared to standard non-linear least squares fitting.
arXiv Detail & Related papers (2022-10-05T15:51:39Z)
- Bottlenecks CLUB: Unifying Information-Theoretic Trade-offs Among Complexity, Leakage, and Utility [8.782250973555026]
Bottleneck problems are an important class of optimization problems that have recently gained increasing attention in the domain of machine learning and information theory.
We propose a general family of optimization problems, termed the complexity-leakage-utility bottleneck (CLUB) model.
We show that the CLUB model generalizes all these problems as well as most other information-theoretic privacy models.
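For orientation, the best-known member of this family is the classic information bottleneck Lagrangian, shown here as background rather than the CLUB paper's own formulation:

    \min_{p(z \mid x)} \; I(X; Z) - \beta\, I(Y; Z)

The CLUB family generalizes this style of trade-off to jointly capture complexity, leakage, and utility terms.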
arXiv Detail & Related papers (2022-07-11T14:07:48Z)
- ER: Equivariance Regularizer for Knowledge Graph Completion [107.51609402963072]
We propose a new regularizer, the Equivariance Regularizer (ER).
ER can enhance the generalization ability of the model by employing the semantic equivariance between the head and tail entities.
The experimental results indicate a clear and substantial improvement over the state-of-the-art relation prediction methods.
arXiv Detail & Related papers (2022-06-24T08:18:05Z)
- Calibrating Agent-based Models to Microdata with Graph Neural Networks [1.4911092205861822]
Calibrating agent-based models (ABMs) to data is among the most fundamental requirements to ensure the model fulfils its desired purpose.
We propose to learn parameter posteriors associated with granular microdata directly using temporal graph neural networks.
arXiv Detail & Related papers (2022-06-15T14:41:43Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence-Based Lower Bounds for ME-NODE and develop efficient training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
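A minimal numpy sketch of the general idea, under the assumption that a surrogate BNN supplies posterior predictive label probabilities for unlabeled test points (this is an illustration, not the ALT-MAS implementation):

    # Estimate a model's accuracy from BNN posterior predictive samples.
    # The synthetic 'probs' tensor stands in for a real BNN's output.
    import numpy as np

    rng = np.random.default_rng(0)
    n_points, n_samples, n_classes = 100, 50, 3

    # Posterior predictive label probabilities: (posterior samples, points, classes).
    probs = rng.dirichlet(np.ones(n_classes), size=(n_samples, n_points))

    # Predictions of the frozen model under test on the same unlabeled points.
    model_preds = rng.integers(0, n_classes, size=n_points)

    # Per posterior sample: expected probability the model's prediction is correct.
    per_sample_acc = probs[:, np.arange(n_points), model_preds].mean(axis=1)

    print(f"estimated accuracy: {per_sample_acc.mean():.3f} +/- {per_sample_acc.std():.3f}")

The spread across posterior samples gives an uncertainty estimate that an active-testing loop could use to decide which points to label next.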
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- The Conditional Entropy Bottleneck [8.797368310561058]
We characterize failures of robust generalization as failures of accuracy or related metrics on a held-out set.
We propose the Minimum Necessary Information (MNI) criterion for evaluating the quality of a model.
In order to train models that perform well with respect to the MNI criterion, we present a new objective function, the Conditional Entropy Bottleneck (CEB).
We experimentally test our hypothesis by comparing the performance of CEB models with deterministic models and Variational Information Bottleneck (VIB) models on a variety of different datasets.
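Stated compactly, the MNI point requires a representation Z of X to capture exactly the information X shares with Y (a background restatement, not the paper's notation):

    I(X; Z) = I(Y; Z) = I(X; Y)

For a representation computed from X alone, the residual I(X; Z \mid Y) = I(X; Z) - I(Y; Z) measures the excess information retained beyond this point, and CEB is designed to train models toward it.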
arXiv Detail & Related papers (2020-02-13T07:46:38Z)