On the Relevance-Complexity Region of Scalable Information Bottleneck
- URL: http://arxiv.org/abs/2011.01352v1
- Date: Mon, 2 Nov 2020 22:25:28 GMT
- Title: On the Relevance-Complexity Region of Scalable Information Bottleneck
- Authors: Mohammad Mahdi Mahvari, Mari Kobayashi, Abdellatif Zaidi
- Abstract summary: We study a variation of the problem, called scalable information bottleneck, where the encoder outputs multiple descriptions of the observation.
The problem at hand is motivated by application scenarios that require varying levels of accuracy depending on the allowed level of generalization.
- Score: 15.314757778110955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Information Bottleneck method is a learning technique that seeks the right
balance between accuracy and generalization capability through a suitable
tradeoff between compression complexity, measured by minimum description
length, and distortion, evaluated under a logarithmic loss measure. In this paper,
we study a variation of the problem, called scalable information bottleneck,
in which the encoder outputs multiple descriptions of the observation with
increasingly richer features. The problem is motivated by application
scenarios that require varying levels of accuracy depending on the allowed
level of generalization. First, we establish explicit (analytic)
characterizations of the relevance-complexity region for memoryless Gaussian
sources and memoryless binary sources. Then, we derive a Blahut-Arimoto-type
algorithm that allows us to compute (an approximation of) the region for
general discrete sources. Finally, an application example to pattern
classification is provided, along with numerical results.
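The paper's algorithmic contribution is a Blahut-Arimoto-type iteration for the scalable (multi-description) setting. As a point of reference, below is a minimal NumPy sketch of the classical single-description information bottleneck iteration that such algorithms generalize, minimizing the Lagrangian I(X;U) - beta * I(U;Y) for a discrete source; the function name and interface are illustrative, not the paper's.

```python
import numpy as np

def ib_blahut_arimoto(p_xy, n_u, beta, iters=300, eps=1e-12, seed=0):
    """Classical Blahut-Arimoto-type iteration for the (single-description)
    information bottleneck: minimize I(X;U) - beta * I(U;Y) over stochastic
    encoders q(u|x), given a joint pmf p_xy of shape (n_x, n_y)."""
    rng = np.random.default_rng(seed)
    p_x = p_xy.sum(axis=1)                          # source marginal p(x)
    p_y_x = p_xy / (p_x[:, None] + eps)             # conditional p(y|x)
    q_u_x = rng.random((p_xy.shape[0], n_u))
    q_u_x /= q_u_x.sum(axis=1, keepdims=True)       # random initial encoder
    for _ in range(iters):
        q_u = q_u_x.T @ p_x                         # marginal q(u)
        # decoder q(y|u) obtained from the encoder via Bayes' rule
        q_y_u = (q_u_x * p_x[:, None]).T @ p_y_x / (q_u[:, None] + eps)
        # KL( p(y|x) || q(y|u) ) for every (x, u) pair
        kl = (p_y_x[:, None, :]
              * np.log((p_y_x[:, None, :] + eps) / (q_y_u[None, :, :] + eps))
              ).sum(axis=-1)
        # self-consistent update: q(u|x) proportional to q(u) * exp(-beta * KL)
        q_u_x = q_u[None, :] * np.exp(-beta * kl)
        q_u_x /= q_u_x.sum(axis=1, keepdims=True)
    return q_u_x
```

Sweeping beta traces out one relevance-complexity pair (I(U;Y), I(X;U)) per value; the scalable variant studied in the paper couples several such encoders producing nested, increasingly rich descriptions.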
Related papers
- Optimal Algorithms for the Inhomogeneous Spiked Wigner Model [89.1371983413931]
We derive an approximate message-passing algorithm (AMP) for the inhomogeneous problem.
In particular, we identify a statistical-to-computational gap, where known algorithms require a signal-to-noise ratio larger than the information-theoretic threshold to perform better than chance (a generic AMP sketch follows this entry).
arXiv Detail & Related papers (2023-02-13T19:57:17Z)
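For context, here is a minimal sketch of the standard (homogeneous) AMP iteration for the rank-one spiked Wigner model Y = (lam/N) x x^T + W with x in {+-1}^N; the inhomogeneous variant of the paper adapts the denoiser and Onsager term to block-dependent noise levels. The setup below is the textbook one, not the paper's exact algorithm.

```python
import numpy as np

def amp_spiked_wigner(Y, lam, iters=50, seed=0):
    """Textbook AMP for Y = (lam / N) * x x^T + W, with W symmetric Gaussian
    noise of variance 1/N and a uniform +-1 prior on the entries of x."""
    N = Y.shape[0]
    rng = np.random.default_rng(seed)
    m = 0.1 * rng.standard_normal(N)   # weak random initialization
    m_old = np.zeros(N)
    b = 0.0                            # Onsager reaction coefficient
    for _ in range(iters):
        v = Y @ m - b * m_old          # linear step with Onsager correction
        m_old = m
        m = np.tanh(lam * v)           # Bayes-optimal denoiser for a +-1 prior
        b = lam * np.mean(1.0 - m**2)  # next coefficient: average of f'(v)
    return np.sign(m)                  # estimate of x, up to a global sign
```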
- Information Theoretic Lower Bounds for Information Theoretic Upper Bounds [14.268363583731848]
We examine the relationship between the mutual information between the output model and the empirical sample and the generalization of the algorithm, in the context of stochastic convex optimization.
Our study reveals that, for true risk minimization, mutual information is necessary.
Existing information-theoretic generalization bounds fall short in capturing the capabilities of algorithms like SGD and regularized ERM, which have dimension-independent sample complexity.
arXiv Detail & Related papers (2023-02-09T20:42:36Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiable) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification benchmarks (a hypothetical mixture-normalization sketch follows this entry).
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
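The entry above describes normalizing features with a Gaussian mixture rather than a single batch statistic. Below is a hypothetical NumPy sketch of that idea: each sample is soft-assigned to K diagonal-Gaussian components and normalized by its responsibility-weighted statistics. The parameterization, shapes, and blending rule are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def compound_batch_norm(x, mus, sigmas, pis, eps=1e-5):
    """Hypothetical mixture normalization: x (N, D) features;
    mus, sigmas (K, D) component means/stds; pis (K,) mixture weights."""
    # log-responsibility of each diagonal-Gaussian component for each sample
    log_p = (np.log(pis)[None, :]
             - 0.5 * ((((x[:, None, :] - mus[None]) / sigmas[None]) ** 2)
                      + 2.0 * np.log(sigmas[None])).sum(-1))
    r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)            # soft assignments (N, K)
    mu_eff = r @ mus                             # per-sample effective mean
    sig_eff = np.sqrt(r @ sigmas**2 + eps)       # per-sample effective std
    return (x - mu_eff) / sig_eff
```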
- Large Graph Signal Denoising with Application to Differential Privacy [2.867517731896504]
We consider the case of signal denoising on graphs via a data-driven wavelet tight frame methodology.
We make it scalable to large graphs using Chebyshev-Jackson approximations.
A comprehensive performance analysis is carried out on graphs of varying size, using both real and simulated data (a Chebyshev-filter sketch follows this entry).
arXiv Detail & Related papers (2022-09-05T16:32:54Z)
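A Chebyshev-Jackson approximation replaces an exact spectral filter h(L), which would require a full eigendecomposition of the graph Laplacian, with a degree-K polynomial in L, so only (sparse) matrix-vector products are needed; Jackson damping suppresses the Gibbs oscillations of the truncated expansion. Here is a minimal dense-NumPy sketch with a generic filter h and illustrative parameters; the paper applies this machinery inside a wavelet tight-frame construction.

```python
import numpy as np

def cheby_jackson_apply(L, x, h, K=30, lam_max=2.0, n_quad=256):
    """Approximate y = h(L) @ x with a degree-K Chebyshev expansion of h
    on [0, lam_max], damped by Jackson coefficients."""
    n = L.shape[0]
    k = np.arange(K + 1)
    # Chebyshev coefficients of h via Gauss-Chebyshev quadrature
    theta = np.pi * (np.arange(n_quad) + 0.5) / n_quad
    samples = h(lam_max * (np.cos(theta) + 1.0) / 2.0)
    c = (2.0 / n_quad) * np.cos(np.outer(k, theta)) @ samples
    c[0] /= 2.0
    # Jackson damping factors (one common form from the KPM literature)
    q = np.pi / (K + 2)
    g = ((K + 2 - k) * np.sin(q) * np.cos(k * q)
         + np.cos(q) * np.sin(k * q)) / ((K + 2) * np.sin(q))
    # three-term Chebyshev recurrence on the rescaled operator
    A = (2.0 / lam_max) * L - np.eye(n)   # spectrum mapped into [-1, 1]
    t_prev, t_cur = x, A @ x
    y = g[0] * c[0] * t_prev + g[1] * c[1] * t_cur
    for kk in range(2, K + 1):
        t_prev, t_cur = t_cur, 2.0 * (A @ t_cur) - t_prev
        y = y + g[kk] * c[kk] * t_cur
    return y
```

For example, h = lambda lam: np.exp(-5.0 * lam) gives a heat-kernel smoother; on genuinely large graphs L would be a scipy.sparse matrix and the dense identity replaced accordingly.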
- Learning Optical Flow from a Few Matches [67.83633948984954]
We show that the dense correlation volume representation is redundant and that accurate flow estimation can be achieved with only a fraction of its elements.
Experiments show that our method reduces computational cost and memory use significantly, while maintaining high accuracy (a top-k sparsification sketch follows this entry).
arXiv Detail & Related papers (2021-04-05T21:44:00Z)
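To illustrate the "fraction of elements" idea, here is a toy NumPy sketch that keeps only the top-k correlations per source pixel. For clarity it materializes the dense volume first; the point of the paper is precisely to avoid doing that (e.g., via nearest-neighbour search), so treat this as an illustration of the retained data structure, not the method.

```python
import numpy as np

def topk_sparse_correlation(f1, f2, k=8):
    """f1, f2: (HW, C) flattened feature maps of two frames.
    Returns, per source pixel, the indices and values of its k strongest
    correlations -- a sparse stand-in for the full HW x HW volume."""
    corr = (f1 @ f2.T) / np.sqrt(f1.shape[1])        # dense volume (toy only)
    idx = np.argpartition(corr, -k, axis=1)[:, -k:]  # top-k target indices
    vals = np.take_along_axis(corr, idx, axis=1)     # their correlation values
    return idx, vals
```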
- Scalable Vector Gaussian Information Bottleneck [19.21005180893519]
We study a variation of the problem, called scalable information bottleneck, in which the encoder outputs multiple descriptions of the observation.
We derive a variational-inference-type algorithm for general sources with unknown distribution, and show how to parametrize it using neural networks (a minimal variational-IB sketch follows this entry).
arXiv Detail & Related papers (2021-02-15T12:51:26Z)
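A minimal PyTorch sketch of the variational IB objective this line of work builds on (the deep variational IB of Alemi et al.): a Gaussian encoder q(z|x), a decoder predicting the label from a sampled z, and a KL rate term to a standard-normal prior that upper-bounds the complexity I(X;Z). The encoder/decoder callables and their interfaces are assumptions for illustration; the scalable version trains several such stages with nested descriptions.

```python
import torch
import torch.nn.functional as F

def vib_loss(encoder, decoder, x, y, beta):
    """One step of the variational IB objective:
    distortion -E[log q(y|z)] + beta * KL( q(z|x) || N(0, I) )."""
    mu, logvar = encoder(x)                        # parameters of q(z|x)
    std = torch.exp(0.5 * logvar)
    z = mu + std * torch.randn_like(std)           # reparameterized sample
    distortion = F.cross_entropy(decoder(z), y)    # relevance (log-loss) term
    rate = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=1).mean()
    return distortion + beta * rate                # complexity-penalized loss
```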
- Fundamental Limits and Tradeoffs in Invariant Representation Learning [99.2368462915979]
Many machine learning applications involve learning representations that achieve two competing goals.
A minimax game-theoretic formulation captures the fundamental tradeoff between accuracy and invariance.
We provide an information-theoretic analysis of this general and important problem under both classification and regression settings.
arXiv Detail & Related papers (2020-12-19T15:24:04Z)
- Information Theoretic Meta Learning with Gaussian Processes [74.54485310507336]
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck.
By making use of variational approximations to the mutual information, we derive a general and tractable framework for meta learning.
arXiv Detail & Related papers (2020-09-07T16:47:30Z)
- Information-theoretic analysis for transfer learning [5.081241420920605]
We give an information-theoretic analysis on the generalization error and the excess risk of transfer learning algorithms.
Our results suggest, perhaps as expected, that the Kullback-Leibler divergence $D(\mu \| \mu')$ plays an important role in characterizing the generalization error (a representative bound is sketched after this entry).
arXiv Detail & Related papers (2020-05-18T13:23:20Z)
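For reference, this is the classical input-output mutual-information bound (Xu and Raginsky, 2017) that such analyses start from, stated for a sigma-sub-Gaussian loss and an n-sample training set S. In the transfer setting the mismatch between the source distribution mu' and the target mu contributes an additional term driven by $D(\mu \| \mu')$, whose exact form we leave to the paper.

```latex
% Generalization bound from input-output mutual information (Xu & Raginsky, 2017):
% W is the learned hypothesis, S the n-sample training set, loss sigma-sub-Gaussian.
\left| \mathbb{E}\!\left[ L_\mu(W) - \hat{L}_S(W) \right] \right|
  \;\le\; \sqrt{ \frac{2\sigma^2}{n}\, I(W; S) }
```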
- On the Information Bottleneck Problems: Models, Connections, Applications and Information Theoretic Views [39.49498500593645]
This tutorial paper surveys variants of the bottleneck problem from an information-theoretic perspective.
It discusses practical methods for solving them, as well as their connections to coding and learning.
arXiv Detail & Related papers (2020-01-31T15:23:19Z)