The Dual Information Bottleneck
- URL: http://arxiv.org/abs/2006.04641v1
- Date: Mon, 8 Jun 2020 14:43:11 GMT
- Title: The Dual Information Bottleneck
- Authors: Zoe Piran, Ravid Shwartz-Ziv, Naftali Tishby
- Abstract summary: The Information Bottleneck (IB) framework is a general characterization of optimal representations obtained using a principled approach for balancing accuracy and complexity.
We present a new framework, the Dual Information Bottleneck (dualIB) which resolves some of the known drawbacks of the IB.
- Score: 1.6559345531428509
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Information Bottleneck (IB) framework is a general characterization of
optimal representations obtained using a principled approach for balancing
accuracy and complexity. Here we present a new framework, the Dual Information
Bottleneck (dualIB), which resolves some of the known drawbacks of the IB. We
provide a theoretical analysis of the dualIB framework: (i) solving for the
structure of its solutions; (ii) establishing its superiority in optimizing the
mean prediction error exponent; and (iii) demonstrating its ability to preserve
exponential forms of the original distribution. To approach large-scale
problems, we present a novel variational formulation of the dualIB for Deep
Neural Networks. In experiments on several datasets, we compare it to a
variational form of the IB. This exposes the superior Information Plane
properties of the dualIB and its potential for reducing the prediction error.
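For orientation, a minimal sketch of the two objectives: the IB Lagrangian below is standard, while the dualIB distortion follows the paper's idea of exchanging the arguments of the KL-divergence distortion; the exact notation here is an assumption, not the paper's.

```latex
% Classical IB: compress X into T while retaining information about Y.
\mathcal{L}_{\mathrm{IB}}\big[p(t \mid x)\big] = I(X;T) - \beta\, I(T;Y)
% Its effective distortion is a KL divergence between label predictors:
d_{\mathrm{IB}}(x,t) = D_{\mathrm{KL}}\big[\, p(y \mid x) \,\big\|\, p(y \mid t) \,\big]
% The dualIB exchanges the arguments of this KL divergence, which is what
% lets it preserve exponential forms of the original distribution:
d_{\mathrm{dualIB}}(x,t) = D_{\mathrm{KL}}\big[\, p(y \mid t) \,\big\|\, p(y \mid x) \,\big]
```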
Related papers
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct a generalization error analysis to reveal the limitations of the current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z) - Differentiable Information Bottleneck for Deterministic Multi-view Clustering [9.723389925212567]
We propose a new differentiable information bottleneck (DIB) method, which provides a deterministic and analytical MVC solution.
Specifically, we first propose to directly fit the mutual information of high-dimensional spaces by leveraging a normalized kernel Gram matrix.
Then, based on this new mutual information measurement, a deterministic multi-view neural network with analytical gradients is explicitly trained to parameterize the IB principle.
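A minimal sketch of the Gram-matrix route to mutual information the summary describes, in the matrix-based Rényi-entropy style; the Gaussian kernel, the choice alpha = 2, and all function names are assumptions for illustration, not the paper's code:

```python
import numpy as np

def normalized_gram(X, sigma=1.0):
    """Gaussian-kernel Gram matrix, normalized to unit trace."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))
    return K / np.trace(K)

def renyi_entropy(A, alpha=2.0):
    """Matrix-based Renyi alpha-entropy from the eigenvalues of A."""
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]  # drop numerical zeros
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def gram_mutual_information(X, Y, sigma=1.0, alpha=2.0):
    """Estimate I(X;Y) as H(A) + H(B) - H(A o B), o = Hadamard product."""
    A, B = normalized_gram(X, sigma), normalized_gram(Y, sigma)
    J = A * B
    J /= np.trace(J)  # re-normalize the joint Gram matrix
    return renyi_entropy(A, alpha) + renyi_entropy(B, alpha) - renyi_entropy(J, alpha)

# Example: two correlated random matrices give a clearly positive estimate.
rng = np.random.default_rng(0)
X = rng.normal(size=(128, 10))
Y = X @ rng.normal(size=(10, 5)) + 0.1 * rng.normal(size=(128, 5))
print(gram_mutual_information(X, Y))
```

Because every quantity here is a differentiable function of Gram-matrix eigenvalues, an estimator of this kind admits the analytical gradients the summary mentions.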
arXiv Detail & Related papers (2024-03-23T02:13:22Z) - Tighter Bounds on the Information Bottleneck with Application to Deep Learning [6.206127662604578]
Deep Neural Nets (DNNs) learn latent representations induced by their downstream task, objective function, and other parameters.
The Information Bottleneck (IB) provides a hypothetically optimal framework for data modeling, yet it is often intractable.
Recent efforts combined DNNs with the IB by applying VAE-inspired variational methods to approximate bounds on mutual information, resulting in improved robustness to adversarial attacks.
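The VAE-inspired bounds in question commonly take the form of the variational IB objective (Alemi et al.), shown here for context; theta and phi parameterize the encoder and decoder, and r(t) is a fixed variational prior:

```latex
% Bounding I(T;Y) from below via a decoder q_phi(y|t) and I(X;T) from
% above via a prior r(t) yields the tractable surrogate loss
\mathcal{L}_{\mathrm{VIB}}
  = \mathbb{E}_{p(x,y)}\, \mathbb{E}_{p_\theta(t \mid x)}\!\big[ -\log q_\phi(y \mid t) \big]
  + \beta\, \mathbb{E}_{p(x)}\, D_{\mathrm{KL}}\big[\, p_\theta(t \mid x) \,\big\|\, r(t) \,\big]
```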
arXiv Detail & Related papers (2024-02-12T13:24:32Z) - Elastic Information Bottleneck [34.90040361806197]
The information bottleneck is an information-theoretic principle of representation learning.
We propose an elastic information bottleneck (EIB) to interpolate between the IB and DIB regularizers.
Simulations and real-data experiments show that EIB can achieve better domain adaptation results than IB and DIB.
arXiv Detail & Related papers (2023-11-07T12:53:55Z) - Disentangled Representation Learning with Transmitted Information Bottleneck [57.22757813140418]
We present DisTIB (Transmitted Information Bottleneck for Disentangled representation learning), a novel objective that navigates the balance between information compression and preservation.
arXiv Detail & Related papers (2023-11-03T03:18:40Z) - Generalized Information Bottleneck for Gaussian Variables [6.700873164609009]
We derive an exact analytical IB solution for the case of Gaussian correlated variables.
We find that although solving the original, Rényi, and Jeffreys IB problems yields different representations in general, the structural transitions occur at the same critical tradeoff parameters.
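For context, the structural transitions referred to are those of the classical Gaussian IB solution (Chechik et al., 2005), summarized below under assumed notation:

```latex
% Gaussian IB: for jointly Gaussian (X, Y) the optimal encoder is a noisy
% linear projection T = A X + \xi with \xi \sim \mathcal{N}(0, I).
% Rows of A are left eigenvectors of \Sigma_{x|y} \Sigma_x^{-1}; the
% eigenvector with eigenvalue \lambda_i enters the solution only beyond
% a critical tradeoff value, producing discrete structural transitions:
\beta_i^{c} = \frac{1}{1 - \lambda_i}
```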
arXiv Detail & Related papers (2023-03-31T01:38:26Z) - Efficient Alternating Minimization Solvers for Wyner Multi-View Unsupervised Learning [0.0]
We propose two novel formulations that enable the development of computationally efficient solvers based on the alternating minimization principle.
The proposed solvers offer computational efficiency, theoretical convergence guarantees, complexity that scales with the number of views, and exceptional accuracy compared with state-of-the-art techniques.
arXiv Detail & Related papers (2023-03-28T10:17:51Z) - Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
Under rigorous theoretical guarantees, our approach enables IB to grasp the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z) - Handling Distribution Shifts on Graphs: An Invariance Perspective [78.31180235269035]
We formulate the OOD problem on graphs and develop a new invariant learning approach, Explore-to-Extrapolate Risk Minimization (EERM).
EERM resorts to multiple context explorers that are adversarially trained to maximize the variance of risks from multiple virtual environments.
We prove the validity of our method by theoretically showing its guarantee of a valid OOD solution.
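Read literally, the adversarial objective described above can be sketched as follows; the notation and the lambda weighting of the mean risk are assumptions, not taken from the paper:

```latex
% Context explorers g generate virtual environments e so as to maximize
% the variance of per-environment risks R_e(f); the predictor f then
% minimizes that worst-case variance together with the mean risk:
\min_{f} \max_{g}\;
  \operatorname{Var}_{e \sim g}\!\big[ R_{e}(f) \big]
  + \lambda\, \mathbb{E}_{e \sim g}\!\big[ R_{e}(f) \big]
```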
arXiv Detail & Related papers (2022-02-05T02:31:01Z) - BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
arXiv Detail & Related papers (2021-12-06T03:35:21Z) - On the Difference Between the Information Bottleneck and the Deep Information Bottleneck [81.89141311906552]
We revisit the Deep Variational Information Bottleneck and the assumptions needed for its derivation.
We show how to circumvent the resulting limitation by optimising a lower bound on $I(T;Y)$ for which only the latter of the two Markov chain assumptions has to be satisfied.
arXiv Detail & Related papers (2019-12-31T18:31:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.