AutoG: Towards automatic graph construction from tabular data
- URL: http://arxiv.org/abs/2501.15282v1
- Date: Sat, 25 Jan 2025 17:31:56 GMT
- Title: AutoG: Towards automatic graph construction from tabular data
- Authors: Zhikai Chen, Han Xie, Jian Zhang, Xiang Song, Jiliang Tang, Huzefa Rangwala, George Karypis
- Abstract summary: We introduce a set of datasets to formalize and evaluate graph construction methods.
We propose an LLM-based solution, AutoG, which automatically generates high-quality graph schemas without human intervention.
- Score: 60.877867570524884
- Abstract: Recent years have witnessed significant advancements in graph machine learning (GML), with its applications spanning numerous domains. However, the focus of GML has predominantly been on developing powerful models, often overlooking a crucial initial step: constructing suitable graphs from common data formats, such as tabular data. This construction process is fundamental to applying graph-based models, yet it remains largely understudied and lacks formalization. Our research aims to address this gap by formalizing the graph construction problem and proposing an effective solution. We identify two critical challenges to achieving this goal: (1) the absence of dedicated datasets to formalize and evaluate the effectiveness of graph construction methods, and (2) the fact that existing automatic construction methods apply only to specific cases, while tedious human engineering is required to generate high-quality graphs. To tackle these challenges, we present a two-fold contribution. First, we introduce a set of datasets to formalize and evaluate graph construction methods. Second, we propose an LLM-based solution, AutoG, which automatically generates high-quality graph schemas without human intervention. The experimental results demonstrate that the quality of constructed graphs is critical to downstream task performance, and that AutoG can generate high-quality graphs that rival those produced by human experts.
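To make the graph-construction step concrete, the following is a minimal illustrative sketch (not the AutoG method itself, which uses an LLM to propose schemas): given tabular data where one table references another through a foreign-key column, a simple "graph schema" can declare which column pairs define edges, and a builder can materialize nodes and edges from it. The table names, columns, and schema format here are hypothetical examples.

```python
# Hypothetical example tables: papers reference venues via a foreign key.
papers = [
    {"paper_id": 0, "title": "GNN survey", "venue_id": 10},
    {"paper_id": 1, "title": "Graph autoencoders", "venue_id": 11},
]
venues = [
    {"venue_id": 10, "name": "ICLR"},
    {"venue_id": 11, "name": "KDD"},
]

# A minimal "graph schema": (source table, foreign-key column, target table)
# triples that declare which column pairs define edges.
schema = [("paper", "venue_id", "venue")]

def build_graph(tables, schema):
    """Build typed node and edge sets from tables plus an edge schema.

    Nodes are (table_name, primary_key) pairs; edges connect a source
    row to the row its foreign-key column points at.
    """
    nodes = {(t, row[f"{t}_id"]) for t, rows in tables.items() for row in rows}
    edges = []
    for src, fk_col, dst in schema:
        for row in tables[src]:
            edges.append(((src, row[f"{src}_id"]), (dst, row[fk_col])))
    return nodes, edges

nodes, edges = build_graph({"paper": papers, "venue": venues}, schema)
```

In this toy setting the schema is written by hand; the paper's point is that choosing such a schema well (which tables become node types, which columns become edges) materially affects downstream GML performance, and AutoG automates that choice.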
Related papers
- Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs)
This framework provides a standardized setting to evaluate GNNs across diverse datasets.
We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z) - Towards Data-centric Machine Learning on Directed Graphs: a Survey [23.498557237805414]
We introduce a novel taxonomy for existing studies of directed graph learning.
We re-examine these methods from the data-centric perspective, with an emphasis on understanding and improving data representation.
We identify key opportunities and challenges within the field, offering insights that can guide future research and development in directed graph learning.
arXiv Detail & Related papers (2024-11-28T06:09:12Z) - Towards Data-centric Graph Machine Learning: Review and Outlook [120.64417630324378]
We introduce a systematic framework, Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of the graph data lifecycle.
A thorough taxonomy of each stage is presented to answer three critical graph-centric questions.
We pinpoint the future prospects of the DC-GML domain, providing insights to navigate its advancements and applications.
arXiv Detail & Related papers (2023-09-20T00:40:13Z) - Let There Be Order: Rethinking Ordering in Autoregressive Graph Generation [6.422073551199993]
Conditional graph generation tasks involve training a model to generate a graph given a set of input conditions.
Many previous studies employ autoregressive models to incrementally generate graph components such as nodes and edges.
As graphs typically lack a natural ordering among their components, converting a graph into a sequence of tokens is not straightforward.
arXiv Detail & Related papers (2023-05-24T20:52:34Z) - SCGG: A Deep Structure-Conditioned Graph Generative Model [9.046174529859524]
A conditional deep graph generation method called SCGG considers a particular type of structural conditions.
The architecture of SCGG consists of a graph representation learning network and an autoregressive generative model, which is trained end-to-end.
Experimental results on both synthetic and real-world datasets demonstrate the superiority of our method compared with state-of-the-art baselines.
arXiv Detail & Related papers (2022-09-20T12:33:50Z) - GraphMAE: Self-Supervised Masked Graph Autoencoders [52.06140191214428]
We present a masked graph autoencoder GraphMAE that mitigates issues for generative self-supervised graph learning.
We conduct extensive experiments on 21 public datasets for three different graph learning tasks.
The results show that GraphMAE, a simple graph autoencoder with our careful designs, consistently outperforms both contrastive and generative state-of-the-art baselines.
arXiv Detail & Related papers (2022-05-22T11:57:08Z) - Data Augmentation for Deep Graph Learning: A Survey [66.04015540536027]
We first propose a taxonomy for graph data augmentation and then provide a structured review by categorizing the related work based on the augmented information modalities.
Focusing on the two challenging problems in DGL (i.e., optimal graph learning and low-resource graph learning), we also discuss and review the existing learning paradigms which are based on graph data augmentation.
arXiv Detail & Related papers (2022-02-16T18:30:33Z) - Structural Information Preserving for Graph-to-Text Generation [59.00642847499138]
The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs.
We propose to tackle this problem by leveraging richer training signals that can guide our model for preserving input information.
Experiments on two benchmarks for graph-to-text generation show the effectiveness of our approach over a state-of-the-art baseline.
arXiv Detail & Related papers (2021-02-12T20:09:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.