AutoG: Towards automatic graph construction from tabular data
- URL: http://arxiv.org/abs/2501.15282v3
- Date: Wed, 05 Mar 2025 03:38:57 GMT
- Title: AutoG: Towards automatic graph construction from tabular data
- Authors: Zhikai Chen, Han Xie, Jian Zhang, Xiang Song, Jiliang Tang, Huzefa Rangwala, George Karypis
- Abstract summary: We aim to formalize the graph construction problem and propose an effective solution. Existing automatic construction methods can only be applied to some specific cases. We first present a set of datasets to formalize and evaluate graph construction methods. Second, we propose an LLM-based solution, AutoG, which automatically generates high-quality graph schemas.
- Score: 60.877867570524884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have witnessed significant advancements in graph machine learning (GML), with its applications spanning numerous domains. However, the focus of GML has predominantly been on developing powerful models, often overlooking a crucial initial step: constructing suitable graphs from common data formats, such as tabular data. This construction process is fundamental to applying graph-based models, yet it remains largely understudied and lacks formalization. Our research aims to address this gap by formalizing the graph construction problem and proposing an effective solution. We identify two critical challenges to achieving this goal: 1. the absence of dedicated datasets to formalize and evaluate the effectiveness of graph construction methods, and 2. existing automatic construction methods can only be applied to some specific cases, while tedious human engineering is required to generate high-quality graphs. To tackle these challenges, we present a two-fold contribution. First, we introduce a set of datasets to formalize and evaluate graph construction methods. Second, we propose an LLM-based solution, AutoG, which automatically generates high-quality graph schemas without human intervention. The experimental results demonstrate that the quality of constructed graphs is critical to downstream task performance, and that AutoG can generate high-quality graphs that rival those produced by human experts. Our code can be accessed at https://github.com/amazon-science/Automatic-Table-to-Graph-Generation.
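To make the setting concrete, below is a minimal, hypothetical sketch of the graph construction problem the abstract describes: relational tables become a heterogeneous graph by mapping rows to typed nodes and foreign-key relations to typed edges. The table contents, the `schema` format, and the `build_graph` helper are illustrative assumptions for this sketch, not AutoG's actual interface.

```python
# Minimal, hypothetical sketch of table-to-graph construction:
# rows become typed nodes, foreign-key columns become typed edges.
# This illustrates the general problem setting, not AutoG itself.

# Hypothetical tables: "paper" and "author", linked through a "writes"
# table whose columns are foreign keys into the other two tables.
papers = [{"paper_id": 0, "title": "GNNs"}, {"paper_id": 1, "title": "LLMs"}]
authors = [{"author_id": 0, "name": "Ada"}, {"author_id": 1, "name": "Bob"}]
writes = [{"author_id": 0, "paper_id": 0}, {"author_id": 1, "paper_id": 0},
          {"author_id": 1, "paper_id": 1}]

# A hand-written graph schema: which tables become node types and which
# key columns define edges between them.
schema = {
    "node_tables": {"paper": ("paper_id", papers), "author": ("author_id", authors)},
    "edge_tables": {"writes": ("author", "author_id", "paper", "paper_id", writes)},
}

def build_graph(schema):
    """Materialize a heterogeneous graph as node and edge collections keyed by type."""
    nodes = {ntype: {row[key]: row for row in rows}
             for ntype, (key, rows) in schema["node_tables"].items()}
    edges = {(src, etype, dst): [(row[src_key], row[dst_key]) for row in rows]
             for etype, (src, src_key, dst, dst_key, rows) in schema["edge_tables"].items()}
    return nodes, edges

nodes, edges = build_graph(schema)
print(edges[("author", "writes", "paper")])  # [(0, 0), (1, 0), (1, 1)]
```

Under this framing, the step the paper aims to automate is choosing the `schema` itself: deciding which tables become node types, which become edge types, and which keys connect them, a choice AutoG delegates to an LLM rather than a human expert.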
Related papers
- Graph Generative Pre-trained Transformer [25.611007241470645]
This work revisits an alternative approach that represents a graph as a sequence of its node set followed by its edge set (a minimal sketch of this serialization appears after this list).
We introduce the Graph Generative Pre-trained Transformer (G2PT), an auto-regressive model that learns graph structures via next-token prediction.
G2PT achieves superior generative performance on both generic graph and molecule datasets.
arXiv Detail & Related papers (2025-01-02T05:44:11Z)
- Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs). This framework provides a standardized setting to evaluate GNNs across diverse datasets. We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z)
- An Automatic Graph Construction Framework based on Large Language Models for Recommendation [49.51799417575638]
We introduce AutoGraph, an automatic graph construction framework based on large language models for recommendation.
LLMs infer user preferences and item knowledge, which are encoded as semantic vectors.
Latent factors are incorporated as extra nodes to link the user/item nodes, resulting in a graph with in-depth global-view semantics.
arXiv Detail & Related papers (2024-12-24T07:51:29Z)
- Towards Data-centric Machine Learning on Directed Graphs: a Survey [23.498557237805414]
We introduce a novel taxonomy for existing studies of directed graph learning. We re-examine these methods from the data-centric perspective, with an emphasis on understanding and improving data representation. We identify key opportunities and challenges within the field, offering insights that can guide future research and development in directed graph learning.
arXiv Detail & Related papers (2024-11-28T06:09:12Z)
- OpenGraph: Towards Open Graph Foundation Models [20.401374302429627]
Graph Neural Networks (GNNs) have emerged as promising techniques for encoding structural information.
A key challenge remains: the difficulty of generalizing to unseen graph data with different properties.
We propose a novel graph foundation model, called OpenGraph, to address this challenge.
arXiv Detail & Related papers (2024-03-02T08:05:03Z)
- Let There Be Order: Rethinking Ordering in Autoregressive Graph Generation [6.422073551199993]
Conditional graph generation tasks involve training a model to generate a graph given a set of input conditions.
Many previous studies employ autoregressive models to incrementally generate graph components such as nodes and edges.
As graphs typically lack a natural ordering among their components, converting a graph into a sequence of tokens is not straightforward.
arXiv Detail & Related papers (2023-05-24T20:52:34Z)
- Graph Generative Model for Benchmarking Graph Neural Networks [73.11514658000547]
We introduce a novel graph generative model that learns and reproduces the distribution of real-world graphs in a privacy-controlled way.
Our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.
arXiv Detail & Related papers (2022-07-10T06:42:02Z)
- GraphMAE: Self-Supervised Masked Graph Autoencoders [52.06140191214428]
We present GraphMAE, a masked graph autoencoder that mitigates the issues of generative self-supervised graph learning.
We conduct extensive experiments on 21 public datasets for three different graph learning tasks.
The results show that GraphMAE, a simple graph autoencoder with careful designs, can consistently outperform both contrastive and generative state-of-the-art baselines.
arXiv Detail & Related papers (2022-05-22T11:57:08Z)
- Data Augmentation for Deep Graph Learning: A Survey [66.04015540536027]
We first propose a taxonomy for graph data augmentation and then provide a structured review by categorizing the related work based on the augmented information modalities.
Focusing on two challenging problems in deep graph learning (DGL), namely optimal graph learning and low-resource graph learning, we also discuss and review the existing learning paradigms that are based on graph data augmentation.
arXiv Detail & Related papers (2022-02-16T18:30:33Z)
- Structural Information Preserving for Graph-to-Text Generation [59.00642847499138]
The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs.
We propose to tackle this problem by leveraging richer training signals that can guide our model to preserve input information.
Experiments on two benchmarks for graph-to-text generation show the effectiveness of our approach over a state-of-the-art baseline.
arXiv Detail & Related papers (2021-02-12T20:09:01Z)
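For the node-set/edge-set serialization referenced in the G2PT entry above, and the ordering ambiguity studied in "Let There Be Order", the following minimal sketch shows the general idea under an assumed token format; it is not either paper's exact scheme.

```python
# Minimal sketch (hypothetical token format): serialize a graph as its node
# set followed by its edge set, so an autoregressive model can be trained
# with plain next-token prediction. Not the exact scheme used by G2PT.

def graph_to_tokens(nodes, edges):
    """Flatten a graph into a token sequence: nodes first, then edges."""
    tokens = ["<graph>"]
    for n in nodes:                       # node set
        tokens += ["<node>", str(n)]
    for u, v in edges:                    # edge set
        tokens += ["<edge>", str(u), str(v)]
    tokens.append("</graph>")
    return tokens

# A 3-node path graph.
nodes, edges = [0, 1, 2], [(0, 1), (1, 2)]
print(graph_to_tokens(nodes, edges))
# ['<graph>', '<node>', '0', '<node>', '1', '<node>', '2',
#  '<edge>', '0', '1', '<edge>', '1', '2', '</graph>']

# The same graph under a different node/edge ordering yields a different
# token sequence, which is the ordering ambiguity that autoregressive
# graph generators must contend with.
print(graph_to_tokens([2, 1, 0], [(1, 2), (0, 1)]))
```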