Mutual information estimation for graph convolutional neural networks
- URL: http://arxiv.org/abs/2203.16887v1
- Date: Thu, 31 Mar 2022 08:30:04 GMT
- Title: Mutual information estimation for graph convolutional neural networks
- Authors: Marius C. Landverk and Signe Riemer-Sørensen
- Abstract summary: We present an architecture-agnostic method for tracking a network's internal representations during training, which are then used to create a mutual information plane.
We compare how the inductive bias introduced in graph-based architectures changes the mutual information plane relative to a fully connected neural network.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Measuring model performance is a key issue for deep learning practitioners.
However, we often lack the ability to explain why a specific architecture
attains superior predictive accuracy for a given data set. Often, validation
accuracy is used as a performance heuristic quantifying how well a network
generalizes to unseen data, but it does not capture anything about the
information flow in the model. Mutual information can be used as a measure of
the quality of internal representations in deep learning models, and the
information plane may provide insights into whether the model exploits the
available information in the data. The information plane has previously been
explored for fully connected neural networks and convolutional architectures.
We present an architecture-agnostic method for tracking a network's internal
representations during training, which are then used to create the mutual
information plane. The method is exemplified for graph-based neural networks
fitted on citation data. We compare how the inductive bias introduced in
graph-based architectures changes the mutual information plane relative to a
fully connected neural network.
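The abstract does not spell out the estimator, but information planes are conventionally built from a binning estimate of the mutual information between the inputs X, a layer's internal representation T, and the labels Y. Below is a minimal, hypothetical sketch (not the authors' code) of such a pipeline: PyTorch forward hooks make the representation capture architecture-agnostic, and a histogram estimator yields one (I(X;T), I(T;Y)) point per layer. All function names and the bin count are illustrative assumptions.

```python
# Sketch of an architecture-agnostic information-plane computation,
# assuming the classic binning estimator (Shwartz-Ziv & Tishby style).
# For a deterministic network with distinct inputs, I(X;T) = H(T).
import numpy as np
import torch
import torch.nn as nn

def entropy(symbols):
    """Shannon entropy (nats) of a 1-D array of discrete symbols."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def information_plane_point(acts, labels, n_bins=30):
    """Estimate (I(X;T), I(T;Y)) for one layer's activations."""
    # Discretise activations into equal-width bins, then hash each
    # binned activation vector to a single discrete symbol.
    lo, hi = acts.min(), acts.max()
    binned = np.digitize(acts, np.linspace(lo, hi, n_bins))
    symbols = np.array([hash(row.tobytes()) for row in binned])
    h_t = entropy(symbols)
    # H(T|Y): label-weighted average entropy of T within each class.
    h_t_given_y = sum(
        entropy(symbols[labels == y]) * np.mean(labels == y)
        for y in np.unique(labels)
    )
    return h_t, h_t - h_t_given_y  # (I(X;T), I(T;Y))

def capture_activations(model, x):
    """Record layer outputs via forward hooks, independent of architecture."""
    acts, handles = {}, []
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):  # choose which layers to track
            handles.append(module.register_forward_hook(
                lambda m, inp, out, name=name:
                    acts.__setitem__(name, out.detach().cpu().numpy())))
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return acts
```

Repeating this at several training epochs and plotting the per-layer points traces the layer trajectories that make up the information plane; the simplification I(X;T) = H(T) used here holds for a deterministic network evaluated on distinct inputs.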
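For the comparison the abstract mentions, the inductive bias of a graph convolution relative to a fully connected layer can be made concrete with a small sketch. The Kipf-and-Welling-style normalisation shown here is an assumption, since the abstract does not name the specific graph architecture.

```python
# A single graph-convolution layer vs. its fully connected counterpart.
# The only difference is the multiplication by the normalised adjacency,
# which mixes each node's features with its neighbours' before the
# shared linear map -- this is the graph inductive bias.
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution: H' = relu(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^-1/2
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalisation
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU activation

def fc_layer(H, W):
    """Fully connected counterpart: no neighbourhood aggregation."""
    return np.maximum(H @ W, 0.0)
```

On citation data, A would be the citation graph's adjacency matrix and H the node (paper) feature matrix; the paper's comparison amounts to running the same mutual-information tracking on both kinds of layer.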
Related papers
- Enhancing Neural Network Interpretability Through Conductance-Based Information Plane Analysis [0.0]
The Information Plane is a conceptual framework used to analyze the flow of information in neural networks.
This paper introduces a new approach that uses layer conductance, a measure of sensitivity to input features, to enhance the Information Plane analysis.
arXiv Detail & Related papers (2024-08-26T23:10:42Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- Decomposing neural networks as mappings of correlation functions [57.52754806616669]
We study the mapping between probability distributions implemented by a deep feed-forward network.
We identify essential statistics in the data, as well as different information representations that can be used by neural networks.
arXiv Detail & Related papers (2022-02-10T09:30:31Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Mutual Information Scaling for Tensor Network Machine Learning [0.0]
We show how a related correlation analysis can be applied to tensor network machine learning.
We explore whether classical data possess correlation scaling patterns similar to those found in quantum states.
We characterize the scaling patterns in the MNIST and Tiny Images datasets, and find clear evidence of boundary-law scaling in the latter.
arXiv Detail & Related papers (2021-02-27T02:17:51Z)
- Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View [0.0]
We study how homeomorphism affects the learned representation of a malware traffic dataset.
Our results suggest that although the details of the learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same.
arXiv Detail & Related papers (2020-09-16T15:37:44Z)
- Measuring Information Transfer in Neural Networks [46.37969746096677]
Quantifying the information content in a neural network model is essentially estimating the model's Kolmogorov complexity.
We propose a measure of the generalizable information in a neural network model based on prequential coding.
We show that $L_{IT}$ is consistently correlated with generalizable information and can be used as a measure of patterns or "knowledge" in a model or a dataset.
arXiv Detail & Related papers (2020-09-16T12:06:42Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- A Heterogeneous Graph with Factual, Temporal and Logical Knowledge for Question Answering Over Dynamic Contexts [81.4757750425247]
We study question answering over a dynamic textual environment.
We develop a graph neural network over the constructed graph, and train the model in an end-to-end manner.
arXiv Detail & Related papers (2020-04-25T04:53:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.