GDGB: A Benchmark for Generative Dynamic Text-Attributed Graph Learning
- URL: http://arxiv.org/abs/2507.03267v1
- Date: Fri, 04 Jul 2025 02:55:32 GMT
- Title: GDGB: A Benchmark for Generative Dynamic Text-Attributed Graph Learning
- Authors: Jie Peng, Jiarui Ji, Runlin Lei, Zhewei Wei, Yongchao Liu, Chuntao Hong
- Abstract summary: Generative DyTAG Benchmark (GDGB) comprises eight meticulously curated DyTAG datasets with high-quality textual features. Building on GDGB, we define two novel DyTAG generation tasks: Transductive Dynamic Graph Generation (TDGG) and Inductive Dynamic Graph Generation (IDGG).
- Score: 25.487795050809503
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dynamic Text-Attributed Graphs (DyTAGs), which intricately integrate structural, temporal, and textual attributes, are crucial for modeling complex real-world systems. However, most of the existing DyTAG datasets exhibit poor textual quality, which severely limits their utility for DyTAG generation tasks requiring semantically rich inputs. Additionally, prior work mainly focuses on discriminative tasks on DyTAGs, resulting in a lack of standardized task formulations and evaluation protocols tailored for DyTAG generation. To address these critical issues, we propose Generative DyTAG Benchmark (GDGB), which comprises eight meticulously curated DyTAG datasets with high-quality textual features for both nodes and edges, overcoming limitations of prior datasets. Building on GDGB, we define two novel DyTAG generation tasks: Transductive Dynamic Graph Generation (TDGG) and Inductive Dynamic Graph Generation (IDGG). TDGG transductively generates a target DyTAG based on the given source and destination node sets, while the more challenging IDGG introduces new node generation to inductively model the dynamic expansion of real-world graph data. To enable holistic evaluation, we design multifaceted metrics that assess the structural, temporal, and textual quality of the generated DyTAGs. We further propose GAG-General, an LLM-based multi-agent generative framework tailored for reproducible and robust benchmarking of DyTAG generation. Experimental results demonstrate that GDGB enables rigorous evaluation of TDGG and IDGG, with key insights revealing the critical interplay of structural and textual features in DyTAG generation. These findings establish GDGB as a foundational resource for advancing generative DyTAG research and unlocking further practical applications in DyTAG generation. GDGB datasets, source codes, and leaderboards are available at https://gdgb-algo.github.io/.
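The abstract defines the data model and the two generation tasks only at a conceptual level; the sketch below makes that contract concrete. It is not the GDGB or GAG-General implementation: every class, field, and function name (DyTAG, tdgg_generate, idgg_generate, the placeholder texts) is an illustrative assumption, and the toy "generators" stand in for what the paper describes as an LLM-based multi-agent framework. The only point it encodes is the distinction stated above: TDGG receives fixed source and destination node sets and generates timestamped, text-attributed edges between them, while IDGG may also create new nodes to model graph expansion.

```python
# Minimal sketch, assuming a plain in-memory DyTAG representation.
# All names are illustrative assumptions, not the GDGB/GAG-General code.
from dataclasses import dataclass, field


@dataclass
class DyTAG:
    """A dynamic text-attributed graph: nodes and edges both carry text."""
    node_text: dict = field(default_factory=dict)  # node id -> node text
    edges: list = field(default_factory=list)      # (src, dst, timestamp, edge text)

    def add_node(self, node_id, text):
        self.node_text[node_id] = text

    def add_edge(self, src, dst, timestamp, text):
        # Both endpoints must already exist in the graph.
        assert src in self.node_text and dst in self.node_text
        self.edges.append((src, dst, timestamp, text))


def tdgg_generate(source_nodes, destination_nodes):
    """Transductive Dynamic Graph Generation (TDGG): the source and destination
    node sets are given; only timestamped, text-attributed edges are generated."""
    g = DyTAG()
    for nid, text in {**source_nodes, **destination_nodes}.items():
        g.add_node(nid, text)
    # A real generator (e.g., an LLM-based multi-agent pipeline) would decide
    # which edges to emit, when, and with what text; this connects one pair.
    src = next(iter(source_nodes))
    dst = next(iter(destination_nodes))
    g.add_edge(src, dst, timestamp=0.0, text="placeholder interaction text")
    return g


def idgg_generate(seed_graph: DyTAG):
    """Inductive Dynamic Graph Generation (IDGG): new nodes may also be created,
    modeling the dynamic expansion of real-world graph data."""
    existing = list(seed_graph.node_text)
    new_id = (max(existing) + 1) if existing else 0
    seed_graph.add_node(new_id, "text of a newly generated node")
    if existing:
        seed_graph.add_edge(new_id, existing[0], timestamp=1.0, text="expansion edge text")
    return seed_graph


if __name__ == "__main__":
    g = tdgg_generate({0: "source node text"}, {1: "destination node text"})
    g = idgg_generate(g)
    print(g.node_text, g.edges)
```

Under this reading, the benchmark's metrics would then score a generated DyTAG along the three axes named in the abstract: structural (which edges exist), temporal (when they occur), and textual (what the node and edge texts say).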
Related papers
- H$^2$GFM: Towards unifying Homogeneity and Heterogeneity on Text-Attributed Graphs [6.601515580215021]
We introduce H$^2$GFM, a novel framework designed to generalize across both HoTAGs and HeTAGs. Our model projects diverse meta-relations among graphs under a unified textual space. We employ a mixture of CGT experts to capture the heterogeneity in structural patterns among graph types.
arXiv Detail & Related papers (2025-06-10T00:03:56Z) - Toward General and Robust LLM-enhanced Text-attributed Graph Learning [29.55905028870534]
UltraTAG is a unified pipeline for LLM-enhanced TAG learning. UltraTAG-S is a robust instantiation designed to tackle the inherent sparsity issues in real-world TAGs. Our experiments demonstrate that UltraTAG-S significantly outperforms existing baselines.
arXiv Detail & Related papers (2025-04-03T07:24:18Z) - LLM as GNN: Graph Vocabulary Learning for Text-Attributed Graph Foundation Models [54.82915844507371]
Text-Attributed Graphs (TAGs) are ubiquitous in real-world scenarios. Despite large efforts to integrate Large Language Models (LLMs) and Graph Neural Networks (GNNs) for TAGs, existing approaches suffer from decoupled architectures. We propose PromptGFM, a versatile GFM for TAGs grounded in graph vocabulary learning.
arXiv Detail & Related papers (2025-03-05T09:45:22Z) - Retrieval-Augmented Generation with Graphs (GraphRAG) [84.29507404866257]
Retrieval-augmented generation (RAG) is a powerful technique that enhances downstream task execution by retrieving additional information. Graphs, by their intrinsic "nodes connected by edges" nature, encode massive heterogeneous and relational information. Unlike conventional RAG, graph-structured data, with its diverse formats and domain-specific relational knowledge, poses unique and significant challenges when designing GraphRAG for different domains.
arXiv Detail & Related papers (2024-12-31T06:59:35Z) - Multi-Scale Heterogeneous Text-Attributed Graph Datasets From Diverse Domains [25.61868709829681]
We introduce a collection of challenging and diverse benchmark datasets for realistic and reproducible evaluation of machine learning models on HTAGs. Our HTAG datasets are multi-scale, span years in duration, and cover a wide range of domains, including movie, community question answering, academic, literature, and patent networks. All source data, dataset construction codes, processed HTAGs, data loaders, benchmark codes, and evaluation setup are publicly available at GitHub and Hugging Face.
arXiv Detail & Related papers (2024-12-12T04:58:32Z) - Bridging Local Details and Global Context in Text-Attributed Graphs [62.522550655068336]
GraphBridge is a framework that bridges local and global perspectives by leveraging contextual textual information.
Our method achieves state-of-the-art performance, while our graph-aware token reduction module significantly enhances efficiency and solves scalability issues.
arXiv Detail & Related papers (2024-06-18T13:35:25Z) - DTGB: A Comprehensive Benchmark for Dynamic Text-Attributed Graphs [28.340416573162898]
Dynamic text-attributed graphs (DyTAGs) are prevalent in various real-world scenarios.
Despite their broad applicability, there is a notable scarcity of benchmark datasets tailored to DyTAGs.
We introduce Dynamic Text-attributed Graph Benchmark (DTGB), a collection of large-scale, time-evolving graphs.
arXiv Detail & Related papers (2024-06-17T20:16:12Z) - TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs [14.437863803271808]
Text-Attributed Graphs (TAGs) augment graph structures with natural language descriptions, facilitating detailed depictions of data and their interconnections.
Existing TAG datasets predominantly feature textual information only at the nodes, with edges typically represented by mere binary or categorical attributes.
To address this gap, we introduce Textual-Edge Graphs datasets featuring rich textual descriptions on nodes and edges.
arXiv Detail & Related papers (2024-06-14T06:22:47Z) - Learning Multiplex Representations on Text-Attributed Graphs with One Language Model Encoder [55.24276913049635]
We propose METAG, a new framework for learning Multiplex rEpresentations on Text-Attributed Graphs.
In contrast to existing methods, METAG uses one text encoder to model the shared knowledge across relations.
We conduct experiments on nine downstream tasks in five graphs from both academic and e-commerce domains.
arXiv Detail & Related papers (2023-10-10T14:59:22Z) - Deliberate then Generate: Enhanced Prompting Framework for Text Generation [70.10319005141888]
Deliberate then Generate (DTG) prompting framework consists of error detection instructions and candidates that may contain errors.
We conduct extensive experiments on 20+ datasets across 7 text generation tasks, including summarization, translation, dialogue, and more.
We show that DTG consistently outperforms existing prompting methods and achieves state-of-the-art performance on multiple text generation tasks.
arXiv Detail & Related papers (2023-05-31T13:23:04Z) - Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)