Classic Graph Structural Features Outperform Factorization-Based Graph
Embedding Methods on Community Labeling
- URL: http://arxiv.org/abs/2201.08481v1
- Date: Thu, 20 Jan 2022 22:43:37 GMT
- Title: Classic Graph Structural Features Outperform Factorization-Based Graph
Embedding Methods on Community Labeling
- Authors: Andrew Stolman and Caleb Levy and C. Seshadhri and Aneesh Sharma
- Abstract summary: Graph representation learning (also called graph embeddings) is a popular technique for incorporating network structure into machine learning models.
We provide an empirical and theoretical analysis for the performance of a class of embeddings on the common task of pairwise community labeling.
We prove that popular low-dimensional factorization methods either cannot produce community structure, or can only produce unstable" communities.
- Score: 8.931361895613646
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph representation learning (also called graph embeddings) is a popular
technique for incorporating network structure into machine learning models.
Unsupervised graph embedding methods aim to capture graph structure by learning
a low-dimensional vector representation (the embedding) for each node. Despite
the widespread use of these embeddings for a variety of downstream transductive
machine learning tasks, there is little principled analysis of the
effectiveness of this approach for common tasks. In this work, we provide an
empirical and theoretical analysis for the performance of a class of embeddings
on the common task of pairwise community labeling. This is a binary variant of
the classic community detection problem, which seeks to build a classifier to
determine whether a pair of vertices participate in a community. In line with
our goal of foundational understanding, we focus on a popular class of
unsupervised embedding techniques that learn low rank factorizations of a
vertex proximity matrix (this class includes methods like GraRep, DeepWalk,
node2vec, NetMF). We perform detailed empirical analysis for community labeling
over a variety of real and synthetic graphs with ground truth. In all cases we
studied, the models trained from embedding features perform poorly on community
labeling. In constrast, a simple logistic model with classic graph structural
features handily outperforms the embedding models. For a more principled
understanding, we provide a theoretical analysis for the (in)effectiveness of
these embeddings in capturing the community structure. We formally prove that
popular low-dimensional factorization methods either cannot produce community
structure, or can only produce ``unstable" communities. These communities are
inherently unstable under small perturbations.
Related papers
- Curvature-based Clustering on Graphs [14.136746708037402]
We study algorithms, which exploit the geometry of the graph to identify densely connected substructures, which form clusters or communities.
Our method implements discrete Ricci curvatures and their associated geometric flows, under which the edge weights of the graph evolve to reveal its community structure.
arXiv Detail & Related papers (2023-07-19T17:35:08Z) - Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions.
By finding a mean in this embedding space, we can recover a mean graph that preserves structural information.
We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
arXiv Detail & Related papers (2023-05-31T11:04:53Z) - State of the Art and Potentialities of Graph-level Learning [54.68482109186052]
Graph-level learning has been applied to many tasks including comparison, regression, classification, and more.
Traditional approaches to learning a set of graphs rely on hand-crafted features, such as substructures.
Deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations.
arXiv Detail & Related papers (2023-01-14T09:15:49Z) - Graph-wise Common Latent Factor Extraction for Unsupervised Graph
Representation Learning [40.70562886682939]
We propose a new principle for unsupervised graph representation learning: Graph-wise Common latent Factor EXtraction (GCFX)
GCFX explicitly extract common latent factors from an input graph and achieve improved results on downstream tasks to the current state-of-the-art.
Through extensive experiments and analysis, we demonstrate that GCFX is beneficial for graph-level tasks to alleviate distractions caused by local variations of individual nodes or local neighbourhoods.
arXiv Detail & Related papers (2021-12-16T12:22:49Z) - A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates the fake neighbor nodes as the enhanced negative samples from the implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z) - CommPOOL: An Interpretable Graph Pooling Framework for Hierarchical
Graph Representation Learning [74.90535111881358]
We propose a new interpretable graph pooling framework - CommPOOL.
It can capture and preserve the hierarchical community structure of graphs in the graph representation learning process.
CommPOOL is a general and flexible framework for hierarchical graph representation learning.
arXiv Detail & Related papers (2020-12-10T21:14:18Z) - On the Impact of Communities on Semi-supervised Classification Using
Graph Neural Networks [0.5872014229110213]
We systematically study the impact of community structure on the performance of GNNs in semi-supervised node classification on graphs.
Our results suggest that communities typically have a major impact on the learning process and classification performance.
arXiv Detail & Related papers (2020-10-30T13:17:38Z) - Multilayer Clustered Graph Learning [66.94201299553336]
We use contrastive loss as a data fidelity term, in order to properly aggregate the observed layers into a representative graph.
Experiments show that our method leads to a clustered clusters w.r.t.
We learn a clustering algorithm for solving clustering problems.
arXiv Detail & Related papers (2020-10-29T09:58:02Z) - Structured Graph Learning for Clustering and Semi-supervised
Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain condition.
arXiv Detail & Related papers (2020-08-31T08:41:20Z) - Goal-directed graph construction using reinforcement learning [3.291429094499946]
We formulate the construction of a graph as a decision-making process in which a central agent creates topologies by trial and error.
We propose an algorithm based on reinforcement learning and graph neural networks to learn graph construction and improvement strategies.
arXiv Detail & Related papers (2020-01-30T12:11:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.