GraphGen: A Scalable Approach to Domain-agnostic Labeled Graph Generation
- URL: http://arxiv.org/abs/2001.08184v2
- Date: Wed, 8 Apr 2020 13:18:05 GMT
- Title: GraphGen: A Scalable Approach to Domain-agnostic Labeled Graph Generation
- Authors: Nikhil Goyal, Harsh Vardhan Jain, Sayan Ranu
- Abstract summary: Graph generative models have been extensively studied in the data mining literature.
Recent techniques have shifted towards learning the generative distribution directly from the data.
In this work, we develop a domain-agnostic technique called GraphGen to overcome the limitations of existing approaches.
- Score: 5.560715621814096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph generative models have been extensively studied in the data mining
literature. While traditional techniques are based on generating structures
that adhere to a pre-decided distribution, recent techniques have shifted
towards learning this distribution directly from the data. While learning-based
approaches have brought significant improvements in quality, some limitations
remain to be addressed. First, learning graph distributions introduces
additional computational overhead, which limits their scalability to large
graph databases. Second, many techniques only learn the structure and do not
address the need to also learn node and edge labels, which encode important
semantic information and influence the structure itself. Third, existing
techniques often incorporate domain-specific rules and lack generalizability.
Fourth, the experimental evaluation of existing techniques is not
comprehensive, either using weak evaluation metrics or focusing primarily on synthetic
or small datasets. In this work, we develop a domain-agnostic technique called
GraphGen to overcome all of these limitations. GraphGen converts graphs to
sequences using minimum DFS codes. Minimum DFS codes are canonical labels and
capture the graph structure precisely along with the label information. The
complex joint distributions between structure and semantic labels are learned
through a novel LSTM architecture. Extensive experiments on million-sized, real
graph datasets show GraphGen to be 4 times faster on average than
state-of-the-art techniques while being significantly better in quality across
a comprehensive set of 11 different metrics. Our code is released at
https://github.com/idea-iitd/graphgen.
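To make the two key ideas concrete, here is a minimal, hedged sketch of the graph-to-sequence step. It is not the paper's implementation: the true minimum DFS code (as in gSpan) minimizes lexicographically over all valid DFS orderings, whereas this sketch orders neighbors greedily by label and minimizes only over start vertices; all names are illustrative.

```python
# Hypothetical sketch: labeled graph -> DFS-code edge sequence.
# Each tuple is (t_u, t_v, L_u, L_uv, L_v), where t_* are DFS discovery
# times and L_* are node/edge labels, in the style of gSpan DFS codes.

def dfs_code(adj, nlab, elab, start):
    """One DFS code from a given start vertex (tree and back edges)."""
    disc = {start: 0}  # vertex -> discovery time
    seen = set()       # edges already emitted
    code = []

    def visit(u):
        # Greedy, deterministic neighbor order by (edge label, node label);
        # the exact minimum DFS code would branch over all orders instead.
        for v in sorted(adj[u], key=lambda w: (elab[frozenset((u, w))], nlab[w])):
            e = frozenset((u, v))
            if e in seen:
                continue
            seen.add(e)
            if v in disc:  # back edge to an already-discovered vertex
                code.append((disc[u], disc[v], nlab[u], elab[e], nlab[v]))
            else:          # tree edge: discover v, then recurse
                disc[v] = len(disc)
                code.append((disc[u], disc[v], nlab[u], elab[e], nlab[v]))
                visit(v)

    visit(start)
    return code

def approx_min_dfs_code(adj, nlab, elab):
    """Lexicographic minimum over start vertices only: a cheap
    approximation of the canonical minimum DFS code."""
    return min(dfs_code(adj, nlab, elab, s) for s in adj)

# Toy labeled triangle: nodes A/B/C, edge labels x/y/z.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
nlab = {0: "A", 1: "B", 2: "C"}
elab = {frozenset((0, 1)): "x",
        frozenset((1, 2)): "y",
        frozenset((0, 2)): "z"}
print(approx_min_dfs_code(adj, nlab, elab))
# -> [(0, 1, 'A', 'x', 'B'), (1, 2, 'B', 'y', 'C'), (2, 0, 'C', 'z', 'A')]
```

Once graphs are such tuple sequences, generation reduces to sequence modeling. The sketch below shows one plausible shape for an autoregressive LSTM over the five tuple components (the layer sizes and per-component heads are assumptions; GraphGen's actual architecture differs in its details):

```python
import torch
import torch.nn as nn

class DFSCodeLSTM(nn.Module):
    """Hypothetical autoregressive model over DFS-code tuples."""
    def __init__(self, vocab_sizes, emb=64, hid=128):
        super().__init__()
        # One embedding and one output head per tuple component
        # (t_u, t_v, L_u, L_uv, L_v).
        self.embs = nn.ModuleList(nn.Embedding(v, emb) for v in vocab_sizes)
        self.lstm = nn.LSTM(emb * len(vocab_sizes), hid, batch_first=True)
        self.heads = nn.ModuleList(nn.Linear(hid, v) for v in vocab_sizes)

    def forward(self, seq):
        # seq: (batch, time, 5) integer-encoded tuple components.
        x = torch.cat([e(seq[..., i]) for i, e in enumerate(self.embs)], dim=-1)
        h, _ = self.lstm(x)
        # Logits for each component of the next tuple.
        return [head(h) for head in self.heads]
```

Training such a model with per-component cross-entropy factorizes the joint distribution over structure and labels one tuple component at a time, which is the essence of the approach described in the abstract.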
Related papers
- GraphCLIP: Enhancing Transferability in Graph Foundation Models for Text-Attributed Graphs [27.169892145194638]
GraphCLIP is a framework to learn graph foundation models with strong cross-domain zero/few-shot transferability.
We generate and curate large-scale graph-summary pair data with the assistance of LLMs.
For few-shot learning, we propose a novel graph prompt tuning technique aligned with our pretraining objective.
arXiv Detail & Related papers (2024-10-14T09:40:52Z)
- Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding [51.75091298017941]
This paper proposes a novel Deep Manifold (Variational) Graph Auto-Encoder (DMVGAE/DMGAE) for attributed graph data.
The proposed method surpasses state-of-the-art baseline algorithms by a significant margin on different downstream tasks across popular datasets.
arXiv Detail & Related papers (2024-01-12T17:57:07Z)
- From Cluster Assumption to Graph Convolution: Graph-based Semi-Supervised Learning Revisited [51.24526202984846]
Graph-based semi-supervised learning (GSSL) has long been a hot research topic.
Graph convolutional networks (GCNs) have become the predominant technique owing to their promising performance.
arXiv Detail & Related papers (2023-09-24T10:10:21Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations [94.41860307845812]
Self-supervision has recently surged to a new frontier: graph learning.
GraphCL uses a prefabricated prior reflected by the ad-hoc manual selection of graph data augmentations.
We have extended the prefabricated discrete prior in the augmentation set, to a learnable continuous prior in the parameter space of graph generators.
We have leveraged both principles of information minimization (InfoMin) and information bottleneck (InfoBN) to regularize the learned priors.
arXiv Detail & Related papers (2022-01-04T15:49:18Z)
- GraphGen-Redux: a Fast and Lightweight Recurrent Model for labeled Graph Generation [13.956691231452336]
We present a novel graph preprocessing approach for labeled graph generation that processes the labeling information of nodes and edges jointly.
The corresponding model, which we term GraphGen-Redux, improves upon the generative performances of GraphGen in a wide range of datasets.
arXiv Detail & Related papers (2021-07-18T09:26:10Z)
- Multi-Level Graph Contrastive Learning [38.022118893733804]
We propose a Multi-Level Graph Contrastive Learning (MLGCL) framework for learning robust representation of graph data by contrasting space views of graphs.
The original graph is a first-order approximation structure and contains uncertainty or error, while the $k$NN graph generated from encoded features preserves high-order proximity (a construction sketched below).
Extensive experiments indicate MLGCL achieves promising results compared with the existing state-of-the-art graph representation learning methods on seven datasets.
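As an illustration of the second view, here is a minimal sketch of building a $k$NN graph from encoded node features; the cosine similarity and symmetrization choices are assumptions for illustration, not necessarily the paper's exact construction:

```python
import numpy as np

def knn_graph(features, k=5):
    """Adjacency matrix connecting each node to its k most similar
    neighbors (requires k < number of nodes)."""
    # Cosine similarity between all pairs of encoded node features.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops
    n = features.shape[0]
    adj = np.zeros((n, n))
    # Indices of the k largest similarities per row.
    idx = np.argpartition(sim, -k, axis=1)[:, -k:]
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # symmetrize

# Usage: 10 nodes with 16-dimensional encoded features.
A = knn_graph(np.random.rand(10, 16), k=3)
```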
arXiv Detail & Related papers (2021-07-06T14:24:43Z)
- Co-embedding of Nodes and Edges with Graph Neural Networks [13.020745622327894]
Graph embedding is a way to transform and encode data structures that live in a high-dimensional, non-Euclidean feature space.
CensNet is a general graph embedding framework, which embeds both nodes and edges to a latent feature space.
Our approach achieves or matches the state-of-the-art performance in four graph learning tasks.
arXiv Detail & Related papers (2020-10-25T22:39:31Z)
- Graph Contrastive Learning with Augmentations [109.23158429991298]
We propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data.
We show that our framework can produce graph representations of similar or better generalizability, transferability, and robustness compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-10-22T20:13:43Z)
- Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points and consistently outperforms the state of the art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)
- Adaptive-Step Graph Meta-Learner for Few-Shot Graph Classification [25.883839335786025]
We propose a novel framework consisting of a graph meta-learner, which uses GNN-based modules for fast adaptation on graph data.
Our framework achieves state-of-the-art results on several few-shot graph classification tasks compared to baselines.
arXiv Detail & Related papers (2020-03-18T14:38:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.