INFO-SEDD: Continuous Time Markov Chains as Scalable Information Metrics Estimators
- URL: http://arxiv.org/abs/2502.19183v2
- Date: Thu, 27 Feb 2025 10:15:28 GMT
- Title: INFO-SEDD: Continuous Time Markov Chains as Scalable Information Metrics Estimators
- Authors: Alberto Foresti, Giulio Franzese, Pietro Michiardi,
- Abstract summary: INFO-SEDD is a novel method for estimating information-theoretic quantities of discrete data, including mutual information and entropy.<n>Our approach requires the training of a single parametric model, offering significant computational and memory advantages.<n>Our experiments demonstrate that INFO-SEDD is robust and outperforms neural competitors that rely on embedding techniques.
- Score: 7.399561232927219
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Information-theoretic quantities play a crucial role in understanding non-linear relationships between random variables and are widely used across scientific disciplines. However, estimating these quantities remains an open problem, particularly in the case of high-dimensional discrete distributions. Current approaches typically rely on embedding discrete data into a continuous space and applying neural estimators originally designed for continuous distributions, a process that may not fully capture the discrete nature of the underlying data. We consider Continuous-Time Markov Chains (CTMCs), stochastic processes on discrete state-spaces which have gained popularity due to their generative modeling applications. In this work, we introduce INFO-SEDD, a novel method for estimating information-theoretic quantities of discrete data, including mutual information and entropy. Our approach requires the training of a single parametric model, offering significant computational and memory advantages. Additionally, it seamlessly integrates with pretrained networks, allowing for efficient reuse of pretrained generative models. To evaluate our approach, we construct a challenging synthetic benchmark. Our experiments demonstrate that INFO-SEDD is robust and outperforms neural competitors that rely on embedding techniques. Moreover, we validate our method on a real-world task: estimating the entropy of an Ising model. Overall, INFO-SEDD outperforms competing methods and shows scalability to high-dimensional scenarios, paving the way for new applications where estimating MI between discrete distribution is the focus. The promising results in this complex, high-dimensional scenario highlight INFO-SEDD as a powerful new estimator in the toolkit for information-theoretical analysis.
Related papers
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - REMEDI: Corrective Transformations for Improved Neural Entropy Estimation [0.7488108981865708]
We introduce $textttREMEDI$ for efficient and accurate estimation of differential entropy.
Our approach demonstrates improvement across a broad spectrum of estimation tasks.
It can be naturally extended to information theoretic supervised learning models.
arXiv Detail & Related papers (2024-02-08T14:47:37Z) - Online Tensor Inference [0.0]
Traditional offline learning, involving the storage and utilization of all data in each computational iteration, becomes impractical for high-dimensional tensor data.
Existing low-rank tensor methods lack the capability for statistical inference in an online fashion.
Our approach employs Gradient Descent (SGD) to enable efficient real-time data processing without extensive memory requirements.
arXiv Detail & Related papers (2023-12-28T16:37:48Z) - Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series [45.76310830281876]
We propose Quantile Sub-Ensembles, a novel method to estimate uncertainty with ensemble of quantile-regression-based task networks.
Our method not only produces accurate imputations that is robust to high missing rates, but also is computationally efficient due to the fast training of its non-generative model.
arXiv Detail & Related papers (2023-12-03T05:52:30Z) - Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS)
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noises.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z) - Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computational heterogeneous data.
Proposed aggregation algorithms are extensively analyzed from a theoretical, and an experimental prospective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z) - STEERING: Stein Information Directed Exploration for Model-Based
Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal.
Based on KSD, we develop a novel algorithm algo: textbfSTEin information dirtextbfEcted exploration for model-based textbfReinforcement LearntextbfING.
arXiv Detail & Related papers (2023-01-28T00:49:28Z) - Infinite-Fidelity Coregionalization for Physical Simulation [22.524773932668023]
Multi-fidelity modeling and learning are important in physical simulation-related applications.
We propose Infinite Fidelity Coregionalization (IFC) to exploit rich information within continuous, infinite fidelities.
We show the advantage of our method in several benchmark tasks in computational physics.
arXiv Detail & Related papers (2022-07-01T23:01:10Z) - PCENet: High Dimensional Surrogate Modeling for Learning Uncertainty [15.781915567005251]
We present a novel surrogate model for representation learning and uncertainty quantification.
The proposed model combines a neural network approach for dimensionality reduction of the (potentially high-dimensional) data, with a surrogate model method for learning the data distribution.
Our model enables us to (a) learn a representation of the data, (b) estimate uncertainty in the high-dimensional data system, and (c) match high order moments of the output distribution.
arXiv Detail & Related papers (2022-02-10T14:42:51Z) - Deep Efficient Continuous Manifold Learning for Time Series Modeling [11.876985348588477]
A symmetric positive definite matrix is being studied in computer vision, signal processing, and medical image analysis.
In this paper, we propose a framework to exploit a diffeomorphism mapping between Riemannian manifold and a Cholesky space.
For dynamic modeling of time-series data, we devise a continuous manifold learning method by systematically integrating a manifold ordinary differential equation and a gated recurrent neural network.
arXiv Detail & Related papers (2021-12-03T01:38:38Z) - Efficient training of lightweight neural networks using Online
Self-Acquired Knowledge Distillation [51.66271681532262]
Online Self-Acquired Knowledge Distillation (OSAKD) is proposed, aiming to improve the performance of any deep neural model in an online manner.
We utilize k-nn non-parametric density estimation technique for estimating the unknown probability distributions of the data samples in the output feature space.
arXiv Detail & Related papers (2021-08-26T14:01:04Z) - Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z) - Towards Handling Uncertainty-at-Source in AI -- A Review and Next Steps
for Interval Regression [6.166295570030645]
This paper focuses on linear regression for interval-valued data as a recent growth area.
We conduct an in-depth analysis of state-of-the-art methods, elucidating their behaviour, advantages, and pitfalls when applied to datasets with different properties.
arXiv Detail & Related papers (2021-04-15T05:31:10Z) - Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.