The Effects of Randomness on the Stability of Node Embeddings
- URL: http://arxiv.org/abs/2005.10039v1
- Date: Wed, 20 May 2020 13:36:09 GMT
- Title: The Effects of Randomness on the Stability of Node Embeddings
- Authors: Tobias Schumacher, Hinrikus Wolf, Martin Ritzert, Florian Lemmerich,
Jan Bachmann, Florian Frantzen, Max Klabunde, Martin Grohe, Markus Strohmaier
- Abstract summary: We apply five node embedding algorithms to graphs and evaluate their stability under randomness.
We find significant instabilities in the geometry of embedding spaces independent of the centrality of a node.
This suggests that instability effects need to be taken into account when working with node embeddings.
- Score: 5.126380982787905
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We systematically evaluate the (in-)stability of state-of-the-art node
embedding algorithms due to randomness, i.e., the random variation of their
outcomes given identical algorithms and graphs. We apply five node embedding
algorithms---HOPE, LINE, node2vec, SDNE, and GraphSAGE---to synthetic and
empirical graphs and assess their stability under randomness with respect to
(i) the geometry of embedding spaces as well as (ii) their performance in
downstream tasks. We find significant instabilities in the geometry of
embedding spaces independent of the centrality of a node. In the evaluation of
downstream tasks, we find that the accuracy of node classification seems to be
unaffected by random seeding while the actual classification of nodes can vary
significantly. This suggests that instability effects need to be taken into
account when working with node embeddings. Our work is relevant for researchers
and engineers interested in the effectiveness, reliability, and reproducibility
of node embedding approaches.
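The geometric instability the abstract describes can only be measured after the two embedding spaces are aligned, since embeddings from different seeds may differ by an arbitrary rotation. Below is a minimal sketch of such a comparison using orthogonal Procrustes alignment; the matrices are synthetic stand-ins for two randomized runs, and the helper names (`procrustes_align`, `instability`) are hypothetical, not from the paper.

```python
import numpy as np

def procrustes_align(X, Y):
    # Orthogonal Q minimizing ||X - Y @ Q||_F (closed form via SVD).
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return Y @ (U @ Vt)

def instability(X, Y):
    # Mean per-node distance between two embeddings after alignment.
    return np.linalg.norm(X - procrustes_align(X, Y), axis=1).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))                  # "run 1" embedding (synthetic)
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # random orthogonal transform
Y = X @ Q + 0.01 * rng.normal(size=X.shape)     # "run 2": rotated + small noise

print(instability(X, Y))                     # small: runs agree up to rotation
print(np.linalg.norm(X - Y, axis=1).mean())  # much larger without alignment
```

Without the alignment step, even two runs that agree perfectly up to rotation would look maximally unstable, so a residual distance after alignment is what signals genuine geometric instability.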
Related papers
- SimCalib: Graph Neural Network Calibration based on Similarity between
Nodes [60.92081159963772]
Graph neural networks (GNNs) have exhibited impressive performance in modeling graph data as exemplified in various applications.
We shed light on the relationship between GNN calibration and nodewise similarity via theoretical analysis.
A novel calibration framework, named SimCalib, is accordingly proposed to consider similarity between nodes at global and local levels.
arXiv Detail & Related papers (2023-12-19T04:58:37Z) - Uncertainty in GNN Learning Evaluations: A Comparison Between Measures
for Quantifying Randomness in GNN Community Detection [4.358468367889626]
Real-world benchmarks are perplexing due to the multitude of decisions influencing GNN evaluations.
The $W$ randomness coefficient, based on the Wasserstein distance, is identified as providing the most robust assessment of randomness.
arXiv Detail & Related papers (2023-12-14T15:06:29Z) - A Topological Perspective on Demystifying GNN-Based Link Prediction
Performance [72.06314265776683]
Topological Concentration (TC) is based on the intersection of the local subgraph of each node with the ones of its neighbors.
We show that TC has a higher correlation with LP performance than other node-level topological metrics like degree and subgraph density.
We propose Approximated Topological Concentration (ATC) and theoretically/empirically justify its efficacy in approximating TC and reducing the complexity.
arXiv Detail & Related papers (2023-10-06T22:07:49Z) - NodeFormer: A Scalable Graph Structure Learning Transformer for Node
Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z) - A Systematic Evaluation of Node Embedding Robustness [77.29026280120277]
We assess the empirical robustness of node embedding models to random and adversarial poisoning attacks.
We compare edge addition, deletion and rewiring strategies computed using network properties as well as node labels.
We find that node classification suffers higher performance degradation than network reconstruction.
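The perturbation strategies this entry compares (edge addition, deletion, and rewiring) can be sketched in a few lines; `perturb_edges` below is a hypothetical helper operating on an undirected edge list, not the paper's implementation.

```python
import random

def perturb_edges(edges, n_nodes, frac=0.1, mode="delete", seed=0):
    # Apply one poisoning strategy to an undirected edge list.
    rng = random.Random(seed)
    edge_set = set(map(tuple, edges))
    k = max(1, int(frac * len(edge_set)))
    if mode == "delete":
        for e in rng.sample(sorted(edge_set), k):
            edge_set.discard(e)
    elif mode == "add":
        while k > 0:
            u, v = rng.randrange(n_nodes), rng.randrange(n_nodes)
            if u != v and (u, v) not in edge_set and (v, u) not in edge_set:
                edge_set.add((u, v))
                k -= 1
    elif mode == "rewire":
        for u, v in rng.sample(sorted(edge_set), k):
            # Keep endpoint u, replace v with a fresh non-neighbor.
            edge_set.discard((u, v))
            w = rng.randrange(n_nodes)
            while w == u or (u, w) in edge_set or (w, u) in edge_set:
                w = rng.randrange(n_nodes)
            edge_set.add((u, w))
    return sorted(edge_set)

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(perturb_edges(edges, 4, frac=0.2, mode="delete"))  # one edge removed
print(perturb_edges(edges, 10, frac=0.4, mode="add"))    # two edges added
```

A robustness study would re-embed the perturbed graph and compare downstream scores against the clean baseline.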
arXiv Detail & Related papers (2022-09-16T17:20:23Z) - Stability of Aggregation Graph Neural Networks [153.70485149740608]
We study the stability properties of aggregation graph neural networks (Agg-GNNs) considering perturbations of the underlying graph.
We prove that the stability bounds are defined by the properties of the filters in the first layer of the CNN that acts on each node.
We also conclude that in Agg-GNNs the selectivity of the mapping operators is tied to the properties of the filters only in the first layer of the CNN stage.
arXiv Detail & Related papers (2022-07-08T03:54:52Z) - On the Prediction Instability of Graph Neural Networks [2.3605348648054463]
Instability of trained models can affect reproducibility, reliability, and trust in machine learning systems.
We systematically assess the prediction instability of node classification with state-of-the-art Graph Neural Networks (GNNs)
We find that up to one third of the incorrectly classified nodes differ across algorithm runs.
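The "up to one third of incorrectly classified nodes differ across runs" finding suggests a simple disagreement metric over two prediction vectors. The sketch below is a hypothetical formalization in that spirit, not the paper's exact measure.

```python
def disagreement_among_errors(pred_a, pred_b, labels):
    # Among nodes misclassified in at least one run, the fraction whose
    # predicted label differs between the two runs.
    wrong = [i for i, y in enumerate(labels)
             if pred_a[i] != y or pred_b[i] != y]
    if not wrong:
        return 0.0
    return sum(pred_a[i] != pred_b[i] for i in wrong) / len(wrong)

labels = [0, 0, 1, 1]
run_a  = [0, 1, 1, 0]   # misclassifies nodes 1 and 3
run_b  = [0, 0, 1, 0]   # misclassifies node 3 only
print(disagreement_among_errors(run_a, run_b, labels))  # 0.5
```

Note that aggregate accuracy can be identical across runs while this disagreement is large, which is exactly the instability pattern the main paper also reports for node embeddings.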
arXiv Detail & Related papers (2022-05-20T10:32:59Z) - Ergodic Limits, Relaxations, and Geometric Properties of Random Walk
Node Embeddings [11.549910517683085]
Random walk based node embedding algorithms learn vector representations of nodes by optimizing an objective function of node embedding vectors and skip-bigram statistics computed from random walks on the network.
This paper studies properties of random walk based node embeddings in the unsupervised setting of discovering hidden block structure in the network.
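The skip-bigram statistics mentioned above can be computed directly: sample random walks over the graph, then count (center, context) co-occurrences within a sliding window. A minimal sketch on a toy adjacency list, with hypothetical helper names:

```python
import random
from collections import Counter

def random_walks(adj, num_walks, walk_len, seed=0):
    # Uniform random walks: `num_walks` walks of length `walk_len` per node.
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks

def skip_bigrams(walks, window=2):
    # Count (center, context) pairs within the window -- the statistics
    # that DeepWalk/node2vec-style objectives implicitly factorize.
    counts = Counter()
    for walk in walks:
        for i, u in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if j != i:
                    counts[(u, walk[j])] += 1
    return counts

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # triangle graph
walks = random_walks(adj, num_walks=2, walk_len=5, seed=1)
counts = skip_bigrams(walks, window=2)
print(len(walks), sum(counts.values()))
```

Because the walks are themselves random, these counts fluctuate from run to run, which is one concrete source of the embedding randomness studied in the main paper.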
arXiv Detail & Related papers (2021-09-09T19:24:35Z) - Unveiling Anomalous Edges and Nominal Connectivity of Attributed
Networks [53.56901624204265]
The present work deals with uncovering anomalous edges in attributed graphs using two distinct formulations with complementary strengths.
The first relies on decomposing the graph data matrix into low-rank plus sparse components to markedly improve performance.
The second broadens the scope of the first by performing robust recovery of the unperturbed graph, which enhances the anomaly identification performance.
arXiv Detail & Related papers (2021-04-17T20:00:40Z) - Consistency of random-walk based network embedding algorithms [13.214230533788932]
We study the node2vec and DeepWalk algorithms through the perspective of matrix factorization.
Our results indicate a subtle interplay between the sparsity of the observed networks, the window sizes of the random walks, and the convergence rates of the node2vec/DeepWalk embedding.
arXiv Detail & Related papers (2021-01-18T22:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.