Information Plane Analysis Visualization in Deep Learning via Transfer Entropy
- URL: http://arxiv.org/abs/2404.01364v1
- Date: Mon, 1 Apr 2024 17:34:18 GMT
- Title: Information Plane Analysis Visualization in Deep Learning via Transfer Entropy
- Authors: Adrian Moldovan, Angel Cataron, Razvan Andonie
- Abstract summary: In a feedforward network, Transfer Entropy can be used to measure the influence that one layer has on another.
In contrast to mutual information, TE can capture temporal relationships between variables.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In a feedforward network, Transfer Entropy (TE) can be used to measure the influence that one layer has on another by quantifying the information transfer between them during training. According to the Information Bottleneck principle, a neural model's internal representation should compress the input data as much as possible while still retaining sufficient information about the output. Information Plane analysis is a visualization technique used to understand the trade-off between compression and information preservation in the context of the Information Bottleneck method by plotting the amount of information in the input data against the compressed representation. The claim that there is a causal link between information-theoretic compression and generalization, measured by mutual information, is plausible, but results from different studies are conflicting. In contrast to mutual information, TE can capture temporal relationships between variables. To explore such links, in our novel approach we use TE to quantify information transfer between neural layers and perform Information Plane analysis. We obtained encouraging experimental results, opening the possibility for further investigations.
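To make the setting concrete, the following is a minimal sketch, not the authors' implementation, of how Transfer Entropy between two layers could be estimated from activations recorded during training. It assumes per-training-step scalar summaries of each layer (e.g. the mean activation), a simple histogram plug-in estimator, and an illustrative bin count; the function name transfer_entropy and the synthetic traces are this sketch's own assumptions.

    # Minimal sketch, assuming per-training-step scalar summaries of two layers.
    # Histogram plug-in estimator of
    #   TE(source -> target) = sum p(y_t, y_{t-1}, x_{t-1})
    #                          * log2[ p(y_t | y_{t-1}, x_{t-1}) / p(y_t | y_{t-1}) ]
    import numpy as np

    def transfer_entropy(source, target, n_bins=8):
        """TE from `source` to `target` (1-D arrays of equal length), in bits."""
        # Discretize both series into n_bins equal-width bins.
        s = np.digitize(source, np.histogram_bin_edges(source, n_bins)[1:-1])
        t = np.digitize(target, np.histogram_bin_edges(target, n_bins)[1:-1])
        y_t, y_prev, x_prev = t[1:], t[:-1], s[:-1]
        # Joint distribution p(y_t, y_{t-1}, x_{t-1}) from counts.
        joint = np.zeros((n_bins, n_bins, n_bins))
        np.add.at(joint, (y_t, y_prev, x_prev), 1.0)
        joint /= joint.sum()
        p_yprev_xprev = joint.sum(axis=0)      # p(y_{t-1}, x_{t-1})
        p_yt_yprev = joint.sum(axis=2)         # p(y_t, y_{t-1})
        p_yprev = joint.sum(axis=(0, 2))       # p(y_{t-1})
        te = 0.0
        for a in range(n_bins):
            for b in range(n_bins):
                for c in range(n_bins):
                    p = joint[a, b, c]
                    if p > 0.0:
                        te += p * np.log2(p * p_yprev[b]
                                          / (p_yt_yprev[a, b] * p_yprev_xprev[b, c]))
        return te

    # Toy usage: layer_b's activity depends on layer_a's activity one step earlier,
    # so TE(layer_a -> layer_b) should clearly exceed TE(layer_b -> layer_a).
    rng = np.random.default_rng(0)
    layer_a = rng.standard_normal(5000)
    layer_b = 0.7 * np.roll(layer_a, 1) + 0.3 * rng.standard_normal(5000)
    print(transfer_entropy(layer_a, layer_b), transfer_entropy(layer_b, layer_a))

An Information Plane style view would then track, for each hidden layer over training, a pair of such quantities (for example, TE from the input layer into the hidden layer versus TE from the hidden layer toward the output) in place of the mutual-information pair I(X;T) versus I(T;Y); the exact construction of these coordinates should be taken from the paper itself rather than from this sketch.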
Related papers
- Enhancing Neural Network Interpretability Through Conductance-Based Information Plane Analysis [0.0]
The Information Plane is a conceptual framework used to analyze the flow of information in neural networks.
This paper introduces a new approach that uses layer conductance, a measure of sensitivity to input features, to enhance the Information Plane analysis.
arXiv Detail & Related papers (2024-08-26T23:10:42Z) - TexShape: Information Theoretic Sentence Embedding for Language Models [5.265661844206274]
This paper addresses the challenge of encoding sentences into optimized representations through the lens of information theory.
We use empirical estimates of mutual information based on the Donsker-Varadhan representation of the Kullback-Leibler divergence (restated after the related papers list).
Our experiments demonstrate significant advancements in preserving maximal targeted information and minimal sensitive information under adverse compression ratios.
arXiv Detail & Related papers (2024-02-05T22:48:28Z) - Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z) - An Information-Theoretic Approach to Semi-supervised Transfer Learning [33.89602092349131]
Transfer learning allows propagating information from one "source dataset" to another "target dataset".
However, discrepancies between the underlying distributions of the source and target data are commonplace.
We suggest novel information-theoretic approaches for the analysis of the performance of deep neural networks in the context of transfer learning.
arXiv Detail & Related papers (2023-06-11T17:45:46Z) - Decomposing neural networks as mappings of correlation functions [57.52754806616669]
We study the mapping between probability distributions implemented by a deep feed-forward network.
We identify essential statistics in the data, as well as different information representations that can be used by neural networks.
arXiv Detail & Related papers (2022-02-10T09:30:31Z) - A Bayesian Framework for Information-Theoretic Probing [51.98576673620385]
Prior work argued that probing should be seen as approximating a mutual information.
This led to the rather unintuitive conclusion that representations encode exactly the same information about a target task as the original sentences.
This paper proposes a new framework to measure what we term Bayesian mutual information.
arXiv Detail & Related papers (2021-09-08T18:08:36Z) - Learning in Feedforward Neural Networks Accelerated by Transfer Entropy [0.0]
Transfer entropy (TE) was initially introduced as an information transfer measure used to quantify the statistical coherence between events (time series).
Our contribution is an information-theoretical method for analyzing information transfer between the nodes of feedforward neural networks.
We introduce a backpropagation type training algorithm that uses TE feedback connections to improve its performance.
arXiv Detail & Related papers (2021-04-29T19:07:07Z) - Focus of Attention Improves Information Transfer in Visual Features [80.22965663534556]
This paper focuses on unsupervised learning for transferring visual information in a truly online setting.
The computation of the entropy terms is carried out by a temporal process which yields online estimation of the entropy terms.
In order to better structure the input probability distribution, we use a human-like focus of attention model.
arXiv Detail & Related papers (2020-06-16T15:07:25Z) - On Information Plane Analyses of Neural Network Classifiers -- A Review [7.804994311050265]
We show that compression visualized in information planes is not necessarily information-theoretic.
We argue that even in feed-forward neural networks the data processing inequality need not hold for estimates of mutual information.
arXiv Detail & Related papers (2020-03-21T14:43:45Z) - Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations [143.3053365553897]
We describe a procedure for removing dependency on a cohort of training data from a trained deep network.
We introduce a new bound on how much information can be extracted per query about the forgotten cohort.
We exploit the connections between the activation and weight dynamics of a DNN inspired by Neural Tangent Kernels to compute the information in the activations.
arXiv Detail & Related papers (2020-03-05T23:17:35Z) - A Theory of Usable Information Under Computational Constraints [103.5901638681034]
We propose a new framework for reasoning about information in complex systems.
Our foundation is based on a variational extension of Shannon's information theory.
We show that by incorporating computational constraints, $\mathcal{V}$-information can be reliably estimated from data.
arXiv Detail & Related papers (2020-02-25T06:09:30Z)
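For reference, the Donsker-Varadhan representation invoked in the TexShape entry above is the standard variational form of the Kullback-Leibler divergence, stated here in generic notation rather than in that paper's own notation:

$$ D_{\mathrm{KL}}(P \,\|\, Q) \;=\; \sup_{T} \Big( \mathbb{E}_{P}[T] \;-\; \log \mathbb{E}_{Q}\big[e^{T}\big] \Big). $$

Applied to $I(X;Z) = D_{\mathrm{KL}}(P_{XZ} \,\|\, P_X \otimes P_Z)$, any choice of the test function $T$ yields a lower bound on the mutual information, so empirical estimates can be obtained by parameterizing $T$ with a neural network and maximizing the right-hand side over its parameters.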