On the Information Bottleneck Problems: Models, Connections,
Applications and Information Theoretic Views
- URL: http://arxiv.org/abs/2002.00008v1
- Date: Fri, 31 Jan 2020 15:23:19 GMT
- Title: On the Information Bottleneck Problems: Models, Connections,
Applications and Information Theoretic Views
- Authors: Abdellatif Zaidi, Inaki Estella Aguerri and Shlomo Shamai (Shitz)
- Abstract summary: This tutorial paper focuses on variants of the bottleneck problem from an information-theoretic perspective.
It discusses practical methods to solve them, as well as their connections to coding and learning aspects.
- Score: 39.49498500593645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This tutorial paper focuses on variants of the bottleneck problem from
an information-theoretic perspective and discusses practical methods to solve
them, as well as their connections to coding and learning aspects. The intimate
connections of this setting to remote source coding under the logarithmic-loss
distortion measure, information combining, common reconstruction, the
Wyner-Ahlswede-Korner problem, the efficiency of investment information, as
well as generalization, variational inference, representation learning,
autoencoders, and others are highlighted. We discuss its extension to the
distributed information bottleneck problem with emphasis on the Gaussian model
and highlight the basic connections to uplink Cloud Radio Access Networks
(CRAN) with oblivious processing. For this model, the optimal trade-offs
between relevance (i.e., information) and complexity (i.e., rates) in the
discrete and vector Gaussian frameworks are determined. In the concluding
outlook, some interesting open problems are mentioned, such as the
characterization of the optimal input ("feature") distributions under power
limitations that maximize the "relevance" for the Gaussian information
bottleneck under "complexity" constraints.
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
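A minimal sketch of the fixed-attention idea in the entry above, as we read it: attention weights are set to localized Gaussians of pairwise point distances instead of being learned. The function name, the bandwidth sigma, and the NumPy setup are our illustration, not the authors' code.

```python
import numpy as np

def gaussian_attention(points: np.ndarray, sigma: float = 0.1) -> np.ndarray:
    """points: (N, 3) point cloud -> fixed (N, N) row-stochastic attention."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    logits = -d2 / (2.0 * sigma ** 2)              # localized Gaussian kernel
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)        # softmax-style normalization

# Usage: mix per-point features V with the fixed attention map.
pts = np.random.rand(128, 3)
V = np.random.rand(128, 16)
out = gaussian_attention(pts) @ V                  # (128, 16) attended features
```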
- Analysis and Optimization of Wireless Federated Learning with Data Heterogeneity [72.85248553787538]
This paper focuses on performance analysis and optimization for wireless FL under data heterogeneity, combined with wireless resource allocation.
We formulate the loss function minimization problem, under constraints on long-term energy consumption and latency, and jointly optimize client scheduling, resource allocation, and the number of local training epochs (collectively, CRE).
Experiments on real-world datasets demonstrate that the proposed algorithm outperforms other benchmarks in terms of learning accuracy and energy consumption.
arXiv Detail & Related papers (2023-08-04T04:18:01Z)
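A schematic of the constrained problem described in the entry above, written in our own notation (the paper's exact formulation, variables, and constants may differ): the client schedule a, resource allocation b, and local epoch count E are chosen to minimize the trained model's loss under long-term budgets.

```latex
\min_{a,\; b,\; E} \; F\big(w^{\ast}(a, b, E)\big)
\quad \text{s.t.} \quad
\bar{E}_{\mathrm{energy}}(a, b, E) \le E_{\max},
\qquad
\bar{T}_{\mathrm{latency}}(a, b, E) \le T_{\max}
```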
- Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant; challenges emerge when the training data stream is non-stationary.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights need to be fine-tuned and, as a result, they forget information about previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z)
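A toy sketch of the discrete key-value bottleneck described above, as we understand the mechanism (all names and sizes are ours, not the authors' code): a frozen encoder's feature is snapped to its nearest learnable key, and only the paired learnable value is passed on, localizing task-specific updates.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_PAIRS, KEY_DIM, VALUE_DIM = 512, 64, 32
keys = rng.normal(size=(NUM_PAIRS, KEY_DIM))      # codes used for nearest-neighbor lookup
values = rng.normal(size=(NUM_PAIRS, VALUE_DIM))  # task-tunable payloads paired with keys

def bottleneck(z: np.ndarray) -> np.ndarray:
    """z: (batch, KEY_DIM) frozen-encoder features -> (batch, VALUE_DIM)."""
    d2 = ((z[:, None, :] - keys[None, :, :]) ** 2).sum(-1)  # distances to all keys
    idx = d2.argmin(axis=1)        # discrete step: select the nearest key
    return values[idx]             # only selected values feed the task head

z = rng.normal(size=(8, KEY_DIM))  # stand-in for encoder output
out = bottleneck(z)                # during tuning, gradients would touch `values` only
```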
- Scalable Vector Gaussian Information Bottleneck [19.21005180893519]
We study a variation of the problem, called the scalable information bottleneck, in which the encoder outputs multiple descriptions of the observation.
We derive a variational-inference-type algorithm for general sources with unknown distribution and show how to parametrize it using neural networks.
arXiv Detail & Related papers (2021-02-15T12:51:26Z)
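A sketch of a variational IB-style loss in the spirit this line of work builds on (the deep variational IB bound of Alemi et al.); it is our simplified stand-in, not the paper's algorithm. The encoder outputs a Gaussian q(t|x); complexity is the analytic KL to a standard normal prior, relevance a decoder log-likelihood.

```python
import numpy as np

def vib_loss(mu, logvar, log_likelihood, beta=1e-3):
    """mu, logvar: (batch, d) Gaussian encoder parameters;
    log_likelihood: (batch,) decoder log q(y|t) at sampled t."""
    # Per-sample KL( N(mu, diag(exp(logvar))) || N(0, I) ), in closed form.
    kl = 0.5 * (np.exp(logvar) + mu ** 2 - 1.0 - logvar).sum(axis=1)
    # Minimize: -relevance + beta * complexity.
    return (-log_likelihood + beta * kl).mean()

# A reparameterized sample t = mu + exp(0.5 * logvar) * eps would feed the decoder.
mu, logvar = np.zeros((4, 8)), np.zeros((4, 8))
print(vib_loss(mu, logvar, log_likelihood=np.full(4, -2.0)))
```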
- Bottleneck Problems: Information and Estimation-Theoretic View [2.7793394375935088]
Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems.
We show how to evaluate bottleneck problems in closed form by equivalently expressing them in terms of the lower or upper envelopes of certain functions.
arXiv Detail & Related papers (2020-11-12T05:16:44Z)
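For context on why IB and PF are described above as closely related, here are the two standard extremal problems in our notation, with T -- X -- Y a Markov chain; the envelope characterization evaluates such curves in closed form:

```latex
\mathrm{IB}(r) \;=\; \max_{p(t \mid x) \,:\; I(X;T) \,\le\, r} \; I(Y;T),
\qquad
\mathrm{PF}(r) \;=\; \min_{p(t \mid x) \,:\; I(X;T) \,\ge\, r} \; I(Y;T)
```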
- On the Relevance-Complexity Region of Scalable Information Bottleneck [15.314757778110955]
We study a variation of the problem, called the scalable information bottleneck, where the encoder outputs multiple descriptions of the observation.
The problem at hand is motivated by application scenarios that require varying levels of accuracy depending on the allowed level of generalization.
arXiv Detail & Related papers (2020-11-02T22:25:28Z)
- Focus of Attention Improves Information Transfer in Visual Features [80.22965663534556]
This paper focuses on unsupervised learning for transferring visual information in a truly online setting.
The entropy terms are computed by a temporal process that yields their online estimation.
In order to better structure the input probability distribution, we use a human-like focus of attention model.
arXiv Detail & Related papers (2020-06-16T15:07:25Z)
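A toy stand-in for the online entropy estimation mentioned above (the paper's temporal process is more elaborate; the histogram density, decay constant, and class below are our illustration): keep a running density estimate and an exponential moving average of the surprise -log p(x).

```python
import numpy as np

class StreamingEntropy:
    def __init__(self, bins: int = 32, decay: float = 0.01):
        self.counts = np.ones(bins)  # Laplace-smoothed histogram over [0, 1)
        self.h = 0.0                 # running differential-entropy estimate (nats)
        self.decay = decay

    def update(self, x: float) -> float:
        b = min(int(x * len(self.counts)), len(self.counts) - 1)
        self.counts[b] += 1
        p = self.counts[b] / self.counts.sum() * len(self.counts)  # density at x
        self.h += self.decay * (-np.log(p) - self.h)  # EMA of the surprise
        return self.h

est = StreamingEntropy()
for x in np.random.rand(5000):  # uniform stream: entropy should approach 0 nats
    h = est.update(x)
print(round(h, 3))
```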
- Scaling-up Distributed Processing of Data Streams for Machine Learning [10.581140430698103]
This paper reviews recently developed methods that focus on large-scale distributed optimization in the compute- and bandwidth-limited regime.
It focuses on methods that solve: (i) distributed convex problems, and (ii) distributed principal component analysis, a non-convex problem with geometric structure that permits global convergence.
arXiv Detail & Related papers (2020-05-18T16:28:54Z)
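As an illustration of the basic primitive behind many of the distributed convex methods such surveys cover, here is a minimal averaged-gradient step for least squares split across workers; the setup and constants are our toy example, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
w_true = rng.normal(size=5)
workers = []
for _ in range(4):                       # four workers, each with local data
    A = rng.normal(size=(50, 5))
    workers.append((A, A @ w_true + 0.01 * rng.normal(size=50)))

w = np.zeros(5)
for _ in range(200):
    grads = [A.T @ (A @ w - b) / len(b) for A, b in workers]  # local gradients
    w -= 0.1 * np.mean(grads, axis=0)    # server averages gradients and steps
print(np.round(w - w_true, 3))           # near zero: workers agree on the solution
```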
- Distributed Learning in the Non-Convex World: From Batch to Streaming Data, and Beyond [73.03743482037378]
Distributed learning has become a critical research direction for the massively connected world envisioned by many.
This article discusses four key elements of scalable distributed processing and real-time data computation problems.
Practical issues and future research directions are also discussed.
arXiv Detail & Related papers (2020-01-14T14:11:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.