Predicting the generalization gap in neural networks using topological
data analysis
- URL: http://arxiv.org/abs/2203.12330v2
- Date: Sat, 12 Aug 2023 09:23:59 GMT
- Title: Predicting the generalization gap in neural networks using topological
data analysis
- Authors: Rubén Ballester, Xavier Arnal Clemente, Carles Casacuberta, Meysam
Madadi, Ciprian A. Corneanu, Sergio Escalera
- Abstract summary: We study the generalization gap of neural networks using methods from topological data analysis.
We compute homological persistence diagrams of weighted graphs constructed from neuron activation correlations after a training phase.
We compare the usefulness of different numerical summaries from persistence diagrams and show that a combination of some of them can accurately predict and partially explain the generalization gap without the need for a test set.
- Score: 33.511371257571504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding how neural networks generalize on unseen data is crucial for
designing more robust and reliable models. In this paper, we study the
generalization gap of neural networks using methods from topological data
analysis. For this purpose, we compute homological persistence diagrams of
weighted graphs constructed from neuron activation correlations after a
training phase, aiming to capture patterns that are linked to the
generalization capacity of the network. We compare the usefulness of different
numerical summaries from persistence diagrams and show that a combination of
some of them can accurately predict and partially explain the generalization
gap without the need for a test set. Evaluation on two computer vision
recognition tasks (CIFAR10 and SVHN) shows competitive generalization gap
prediction when compared against state-of-the-art methods.
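The pipeline described in the abstract (activation correlations, then a weighted graph, then a persistence diagram, then numerical summaries) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the distance `1 - |correlation|`, the restriction to 0-dimensional homology, and the particular summaries (mean, standard deviation, and maximum of the death times) are all assumptions made here, and the function names are hypothetical. The sketch relies on the standard fact that, for a Vietoris-Rips filtration, every 0-dimensional class is born at 0 and the finite death times are the edge weights of a minimum spanning tree.

```python
import numpy as np


def correlation_distance(activations):
    """Turn activations of shape (n_samples, n_neurons) into a neuron-by-neuron
    distance matrix. Assumption for this sketch: distance = 1 - |Pearson correlation|."""
    corr = np.corrcoef(activations, rowvar=False)
    corr = np.nan_to_num(corr)  # constant neurons produce NaN correlations
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    return dist


def h0_death_times(dist):
    """Finite death times of the 0-dimensional persistence diagram of the
    Vietoris-Rips filtration on `dist`. All births are 0; deaths are the
    minimum spanning tree edge weights (Kruskal with union-find)."""
    n = dist.shape[0]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(dist[rows, cols])
    deaths = []
    for idx in order:
        a, b = find(int(rows[idx])), find(int(cols[idx]))
        if a != b:  # this edge merges two components: one class dies here
            parent[a] = b
            deaths.append(float(dist[rows[idx], cols[idx]]))
            if len(deaths) == n - 1:
                break
    return np.array(deaths)


def summarize(deaths):
    """A few illustrative numerical summaries of the diagram (hypothetical choice)."""
    return {
        "mean_death": float(np.mean(deaths)),
        "std_death": float(np.std(deaths)),
        "max_death": float(np.max(deaths)),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for activations recorded on training data: 500 samples, 32 neurons.
    activations = rng.normal(size=(500, 32))
    deaths = h0_death_times(correlation_distance(activations))
    print(summarize(deaths))
```

In a setting like the one the abstract describes, summaries of this kind would be computed per trained network and fed to a regressor that predicts the generalization gap; libraries such as Ripser or GUDHI would be the usual choice for higher-dimensional homology.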
Related papers
- Generalization Error of Graph Neural Networks in the Mean-field Regime [10.35214360391282]
We explore two widely utilized types of graph neural networks: graph convolutional neural networks and message passing graph neural networks.
Our novel approach involves deriving upper bounds within the mean-field regime for evaluating the generalization error of these graph neural networks.
arXiv Detail & Related papers (2024-02-10T19:12:31Z)
- On Discprecncies between Perturbation Evaluations of Graph Neural Network Attributions [49.8110352174327]
We assess attribution methods from a perspective not previously explored in the graph domain: retraining.
The core idea is to retrain the network on important (or not important) relationships as identified by the attributions.
We run our analysis on four state-of-the-art GNN attribution methods and five synthetic and real-world graph classification datasets.
arXiv Detail & Related papers (2024-01-01T02:03:35Z)
- Neural Tangent Kernels Motivate Graph Neural Networks with Cross-Covariance Graphs [94.44374472696272]
We investigate NTKs and alignment in the context of graph neural networks (GNNs).
Our results establish the theoretical guarantees on the optimality of the alignment for a two-layer GNN.
These guarantees are characterized by the graph shift operator being a function of the cross-covariance between the input and the output data.
arXiv Detail & Related papers (2023-10-16T19:54:21Z)
- Generalization bound for estimating causal effects from observational network data [25.055822137402746]
We derive a generalization bound for causal effect estimation in network scenarios by exploiting 1) the reweighting schema based on the joint propensity score and 2) the representation learning schema based on the Integral Probability Metric (IPM).
Motivated by the analysis of the bound, we propose a weighting regression method based on the joint propensity score augmented with representation learning.
arXiv Detail & Related papers (2023-08-08T03:14:34Z)
- Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising the test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z)
- Homophily modulates double descent generalization in graph convolution networks [33.703222768801574]
We show how risk is shaped by the interplay between the graph noise, feature noise, and the number of training labels.
We use our analytic insights to improve performance of state-of-the-art graph convolution networks on heterophilic datasets.
arXiv Detail & Related papers (2022-12-26T09:57:09Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- With Greater Distance Comes Worse Performance: On the Perspective of Layer Utilization and Model Generalization [3.6321778403619285]
Generalization of deep neural networks remains one of the main open problems in machine learning.
Early layers generally learn representations relevant to performance on both training data and testing data.
Deeper layers only minimize training risks and fail to generalize well with testing or mislabeled data.
arXiv Detail & Related papers (2022-01-28T05:26:32Z)
- Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set [0.0]
We suggest studying the training of neural networks with Algebraic Topology, specifically Persistent Homology.
Using simplicial complex representations of neural networks, we study how the PH diagram distance evolves during the neural network learning process.
Results show that the PH diagram distance between consecutive neural network states correlates with the validation accuracy.
arXiv Detail & Related papers (2021-05-31T09:17:31Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)