OAG-Bench: A Human-Curated Benchmark for Academic Graph Mining
- URL: http://arxiv.org/abs/2402.15810v2
- Date: Thu, 20 Jun 2024 04:15:12 GMT
- Title: OAG-Bench: A Human-Curated Benchmark for Academic Graph Mining
- Authors: Fanjin Zhang, Shijie Shi, Yifan Zhu, Bo Chen, Yukuo Cen, Jifan Yu, Yelin Chen, Lulu Wang, Qingfei Zhao, Yuqing Cheng, Tianyi Han, Yuwei An, Dan Zhang, Weng Lam Tam, Kun Cao, Yunhe Pang, Xinyu Guan, Huihui Yuan, Jian Song, Xiaoyan Li, Yuxiao Dong, Jie Tang
- Abstract summary: OAG-Bench is a comprehensive, multi-aspect, and fine-grained human-curated benchmark based on the Open Academic Graph (OAG).
OAG-Bench covers 10 tasks, 20 datasets, 70+ baselines, and 120+ experimental results to date.
- Score: 46.27513006781531
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid proliferation of scientific literature, versatile academic knowledge services increasingly rely on comprehensive academic graph mining. Despite the availability of public academic graphs, benchmarks, and datasets, these resources often fall short in multi-aspect and fine-grained annotations, are constrained to specific task types and domains, or lack underlying real academic graphs. In this paper, we present OAG-Bench, a comprehensive, multi-aspect, and fine-grained human-curated benchmark based on the Open Academic Graph (OAG). OAG-Bench covers 10 tasks, 20 datasets, 70+ baselines, and 120+ experimental results to date. We propose new data annotation strategies for certain tasks and offer a suite of data pre-processing codes, algorithm implementations, and standardized evaluation protocols to facilitate academic graph mining. Extensive experiments reveal that even advanced algorithms like large language models (LLMs) encounter difficulties in addressing key challenges in certain tasks, such as paper source tracing and scholar profiling. We also introduce the Open Academic Graph Challenge (OAG-Challenge) to encourage community input and sharing. We envisage that OAG-Bench can serve as a common ground for the community to evaluate and compare algorithms in academic graph mining, thereby accelerating algorithm development and advancement in this field. OAG-Bench is accessible at https://www.aminer.cn/data/.
Related papers
- Graph Domain Adaptation: Challenges, Progress and Prospects [61.9048172631524]
We propose graph domain adaptation as an effective knowledge-transfer paradigm across graphs.
GDA introduces a set of task-related graphs as source graphs and adapts the knowledge learnt from the source graphs to the target graphs.
We outline the research status and challenges, propose a taxonomy, introduce the details of representative works, and discuss the prospects.
arXiv Detail & Related papers (2024-02-01T02:44:32Z)
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of the fine-tuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
- Counterfactual Learning on Graphs: A Survey [34.47646823407408]
Graph neural networks (GNNs) have achieved great success in representation learning on graphs.
Counterfactual learning on graphs has shown promising results in alleviating these drawbacks.
Various approaches have been proposed for counterfactual fairness, explainability, link prediction and other applications on graphs.
arXiv Detail & Related papers (2023-04-03T21:42:42Z)
- PubGraph: A Large-Scale Scientific Knowledge Graph [11.240833731512609]
PubGraph is a new resource for studying scientific progress that takes the form of a large-scale knowledge graph.
PubGraph is comprehensive and unifies data from various sources, including Wikidata, OpenAlex, and Semantic Scholar.
We create several large-scale benchmarks extracted from PubGraph for the core task of knowledge graph completion.
arXiv Detail & Related papers (2023-02-04T20:03:55Z)
- Few-Shot Learning on Graphs: A Survey [92.47605211946149]
Graph representation learning has attracted tremendous attention due to its remarkable performance in many real-world applications.
Semi-supervised graph representation learning models for specific tasks often suffer from the label sparsity issue.
Few-shot learning on graphs (FSLG) has been proposed to tackle the performance degradation caused by limited annotated data.
arXiv Detail & Related papers (2022-03-17T13:21:11Z)
- GRAPE for Fast and Scalable Graph Processing and random walk-based Embedding [0.5035217505850539]
We present GRAPE, a software resource for graph processing and embedding.
It can scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random walk-based methods.
arXiv Detail & Related papers (2021-10-12T17:49:46Z)
- DIG: A Turnkey Library for Diving into Graph Deep Learning Research [39.58666190541479]
DIG: Dive into Graphs is a research-oriented library that integrates unified implementations of common graph deep learning algorithms for several advanced tasks.
For each direction, we provide unified implementations of data interfaces, common algorithms, and evaluation metrics.
arXiv Detail & Related papers (2021-03-23T15:05:10Z)
- CogDL: A Comprehensive Library for Graph Deep Learning [55.694091294633054]
We present CogDL, a library for graph deep learning that allows researchers and practitioners to conduct experiments, compare methods, and build applications with ease and efficiency.
In CogDL, we propose a unified design for the training and evaluation of GNN models for various graph tasks, making it unique among existing graph learning libraries.
We develop efficient sparse operators for CogDL, enabling it to become the most competitive graph library for efficiency.
arXiv Detail & Related papers (2021-03-01T12:35:16Z)
- Inverse Graph Identification: Can We Identify Node Labels Given Graph Labels? [89.13567439679709]
Graph Identification (GI) has long been researched in graph learning and is essential in certain applications.
This paper defines a novel problem dubbed Inverse Graph Identification (IGI).
We propose a simple yet effective method that performs node-level message passing using a Graph Attention Network (GAT) under the protocol of GI.
arXiv Detail & Related papers (2020-07-12T12:06:17Z)
- SIGN: Scalable Inception Graph Neural Networks [4.5158585619109495]
We propose a new, efficient and scalable graph deep learning architecture that sidesteps the need for graph sampling.
Our architecture allows using different local graph operators to best suit the task at hand.
We obtain state-of-the-art results on ogbn-papers100M, the largest public graph dataset, with over 110 million nodes and 1.5 billion edges.
arXiv Detail & Related papers (2020-04-23T14:46:10Z)