Mutual information estimation for graph convolutional neural networks
- URL: http://arxiv.org/abs/2203.16887v1
- Date: Thu, 31 Mar 2022 08:30:04 GMT
- Title: Mutual information estimation for graph convolutional neural networks
- Authors: Marius C. Landverk and Signe Riemer-Sørensen
- Abstract summary: We present an architecture-agnostic method for tracking a network's internal representations during training, which are then used to create a mutual information plane.
We compare how the inductive bias introduced in graph-based architectures changes the mutual information plane relative to a fully connected neural network.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Measuring model performance is a key issue for deep learning practitioners.
However, we often lack the ability to explain why a specific architecture
attains superior predictive accuracy for a given data set. Often, validation
accuracy is used as a performance heuristic quantifying how well a network
generalizes to unseen data, but it does not capture anything about the
information flow in the model. Mutual information can be used as a measure of
the quality of internal representations in deep learning models, and the
information plane may provide insights into whether the model exploits the
available information in the data. The information plane has previously been
explored for fully connected neural networks and convolutional architectures.
We present an architecture-agnostic method for tracking a network's internal
representations during training, which are then used to create the mutual
information plane. The method is exemplified for graph-based neural networks
fitted on citation data. We compare how the inductive bias introduced in
graph-based architectures changes the mutual information plane relative to a
fully connected neural network.
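The abstract does not spell out the estimator, but information planes are conventionally built from a binning estimate of the mutual information between the inputs X, a layer's internal representation T, and the labels Y. Below is a minimal, hypothetical sketch (not the authors' code) of such a pipeline: PyTorch forward hooks make the representation capture architecture-agnostic, and a histogram estimator yields one (I(X;T), I(T;Y)) point per layer. All function names and the bin count are illustrative assumptions.

```python
# Sketch of an architecture-agnostic information-plane computation,
# assuming the classic binning estimator (Shwartz-Ziv & Tishby style).
# For a deterministic network with distinct inputs, I(X;T) = H(T).
import numpy as np
import torch
import torch.nn as nn

def entropy(symbols):
    """Shannon entropy (nats) of a 1-D array of discrete symbols."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def information_plane_point(acts, labels, n_bins=30):
    """Estimate (I(X;T), I(T;Y)) for one layer's activations."""
    # Discretise activations into equal-width bins, then hash each
    # binned activation vector to a single discrete symbol.
    lo, hi = acts.min(), acts.max()
    binned = np.digitize(acts, np.linspace(lo, hi, n_bins))
    symbols = np.array([hash(row.tobytes()) for row in binned])
    h_t = entropy(symbols)
    # H(T|Y): label-weighted average entropy of T within each class.
    h_t_given_y = sum(
        entropy(symbols[labels == y]) * np.mean(labels == y)
        for y in np.unique(labels)
    )
    return h_t, h_t - h_t_given_y  # (I(X;T), I(T;Y))

def capture_activations(model, x):
    """Record layer outputs via forward hooks, independent of architecture."""
    acts, handles = {}, []
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):  # choose which layers to track
            handles.append(module.register_forward_hook(
                lambda m, inp, out, name=name:
                    acts.__setitem__(name, out.detach().cpu().numpy())))
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return acts
```

Repeating this at several training epochs and plotting the per-layer points traces the layer trajectories that make up the information plane; the simplification I(X;T) = H(T) used here holds for a deterministic network evaluated on distinct inputs.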
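For the comparison the abstract mentions, the inductive bias of a graph convolution relative to a fully connected layer can be made concrete with a small sketch. The Kipf-and-Welling-style normalisation shown here is an assumption, since the abstract does not name the specific graph architecture.

```python
# A single graph-convolution layer vs. its fully connected counterpart.
# The only difference is the multiplication by the normalised adjacency,
# which mixes each node's features with its neighbours' before the
# shared linear map -- this is the graph inductive bias.
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution: H' = relu(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^-1/2
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalisation
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU activation

def fc_layer(H, W):
    """Fully connected counterpart: no neighbourhood aggregation."""
    return np.maximum(H @ W, 0.0)
```

On citation data, A would be the citation graph's adjacency matrix and H the node (paper) feature matrix; the paper's comparison amounts to running the same mutual-information tracking on both kinds of layer.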
Related papers
- Enhancing Neural Network Interpretability Through Conductance-Based Information Plane Analysis [0.0]
The Information Plane is a conceptual framework used to analyze the flow of information in neural networks.
This paper introduces a new approach that uses layer conductance, a measure of sensitivity to input features, to enhance the Information Plane analysis.
arXiv Detail & Related papers (2024-08-26T23:10:42Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- Decomposing neural networks as mappings of correlation functions [57.52754806616669]
We study the mapping between probability distributions implemented by a deep feed-forward network.
We identify essential statistics in the data, as well as different information representations that can be used by neural networks.
arXiv Detail & Related papers (2022-02-10T09:30:31Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Mutual Information Scaling for Tensor Network Machine Learning [0.0]
We show how a related correlation analysis can be applied to tensor network machine learning.
We explore whether classical data possess correlation scaling patterns similar to those found in quantum states.
We characterize the scaling patterns in the MNIST and Tiny Images datasets, and find clear evidence of boundary-law scaling in the latter.
arXiv Detail & Related papers (2021-02-27T02:17:51Z)
- Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View [0.0]
We study how homeomorphism affects the learned representation of a malware traffic dataset.
Our results suggest that although the details of the learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same.
arXiv Detail & Related papers (2020-09-16T15:37:44Z)
- Measuring Information Transfer in Neural Networks [46.37969746096677]
Quantifying the information content in a neural network model is essentially estimating the model's Kolmogorov complexity.
We propose a measure of the generalizable information in a neural network model based on prequential coding.
We show that $L_{IT}$ is consistently correlated with generalizable information and can be used as a measure of patterns or "knowledge" in a model or a dataset.
arXiv Detail & Related papers (2020-09-16T12:06:42Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- A Heterogeneous Graph with Factual, Temporal and Logical Knowledge for Question Answering Over Dynamic Contexts [81.4757750425247]
We study question answering over a dynamic textual environment.
We develop a graph neural network over the constructed graph, and train the model in an end-to-end manner.
arXiv Detail & Related papers (2020-04-25T04:53:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.