PubGraph: A Large-Scale Scientific Knowledge Graph
- URL: http://arxiv.org/abs/2302.02231v2
- Date: Fri, 19 May 2023 04:56:47 GMT
- Title: PubGraph: A Large-Scale Scientific Knowledge Graph
- Authors: Kian Ahrabian, Xinwei Du, Richard Delwin Myloth, Arun Baalaaji Sankar Ananthan, Jay Pujara
- Abstract summary: PubGraph is a new resource for studying scientific progress that takes the form of a large-scale knowledge graph.
PubGraph is comprehensive and unifies data from various sources, including Wikidata, OpenAlex, and Semantic Scholar.
We create several large-scale benchmarks extracted from PubGraph for the core task of knowledge graph completion.
- Score: 11.240833731512609
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Research publications are the primary vehicle for sharing scientific progress
in the form of new discoveries, methods, techniques, and insights.
Unfortunately, the lack of a large-scale, comprehensive, and easy-to-use
resource capturing the myriad relationships between publications, their
authors, and venues presents a barrier to applications for gaining a deeper
understanding of science. In this paper, we present PubGraph, a new resource
for studying scientific progress that takes the form of a large-scale knowledge
graph (KG) with more than 385M entities, 13B main edges, and 1.5B qualifier
edges. PubGraph is comprehensive and unifies data from various sources,
including Wikidata, OpenAlex, and Semantic Scholar, using the Wikidata
ontology. Beyond the metadata available from these sources, PubGraph includes
outputs from auxiliary community detection algorithms and large language
models. To further support studies on reasoning over scientific networks, we
create several large-scale benchmarks extracted from PubGraph for the core task
of knowledge graph completion (KGC). These benchmarks present many challenges
for knowledge graph embedding models, including an adversarial community-based
KGC evaluation setting, zero-shot inductive learning, and large-scale learning.
All of the aforementioned resources are accessible at https://pubgraph.isi.edu/
and released under the CC-BY-SA license. We plan to update PubGraph quarterly
to accommodate the release of new publications.
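As a rough illustration of the distinction the abstract draws between main edges and qualifier edges, here is a minimal sketch of a Wikidata-style statement model. It is not PubGraph's actual schema or loading code. The `Statement` class, the `count_edges` helper, and the `Q_paper_1`-style identifiers are hypothetical names introduced for this example. P50 (author), P1545 (series ordinal), and P2860 (cites work) are real Wikidata properties, but their use here is only an assumption about how such data might look.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One main edge (subject, prop, value) plus optional qualifier edges."""
    subject: str
    prop: str
    value: str
    # Each qualifier is a (qualifier_property, qualifier_value) pair
    # attached to the main edge rather than to an entity.
    qualifiers: tuple = ()


def count_edges(statements):
    """Return (main_edges, qualifier_edges) for a list of statements."""
    main = len(statements)
    qual = sum(len(s.qualifiers) for s in statements)
    return main, qual


# Placeholder identifiers: "Q_paper_1" etc. are made up for illustration.
statements = [
    # Authorship edges, each qualified with the author's position.
    Statement("Q_paper_1", "P50", "Q_author_1", (("P1545", "1"),)),
    Statement("Q_paper_1", "P50", "Q_author_2", (("P1545", "2"),)),
    # A citation edge with no qualifiers.
    Statement("Q_paper_1", "P2860", "Q_paper_2"),
]

main_edges, qualifier_edges = count_edges(statements)
print(main_edges, qualifier_edges)
```

In this toy graph the three statements contribute three main edges, and the two author-order qualifiers contribute two qualifier edges, which mirrors how the abstract reports the two counts separately.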
Related papers
- Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models [90.98855064914379]
We introduce ProGraph, a benchmark for large language models (LLMs) to process graphs.
Our findings reveal that the performance of current LLMs is unsatisfactory, with the best model achieving only 36% accuracy.
We propose LLM4Graph datasets, which include crawled documents and auto-generated code based on 6 widely used graph libraries.
arXiv Detail & Related papers (2024-09-29T11:38:45Z)
- OAG-Bench: A Human-Curated Benchmark for Academic Graph Mining [46.27513006781531]
OAG-Bench is a comprehensive, multi-aspect, and fine-grained human-curated benchmark based on the Open Academic Graph (OAG).
OAG-Bench covers 10 tasks, 20 datasets, 70+ baselines, and 120+ experimental results to date.
arXiv Detail & Related papers (2024-02-24T13:15:54Z)
- Graph Domain Adaptation: Challenges, Progress and Prospects [61.9048172631524]
We propose graph domain adaptation as an effective knowledge-transfer paradigm across graphs.
GDA introduces a set of task-related graphs as source graphs and adapts the knowledge learnt from the source graphs to the target graphs.
We outline the research status and challenges, propose a taxonomy, introduce the details of representative works, and discuss the prospects.
arXiv Detail & Related papers (2024-02-01T02:44:32Z)
- Universal Knowledge Graph Embeddings [4.322134229203427]
We propose to learn universal knowledge graph embeddings from large-scale knowledge sources.
We instantiate our idea by computing universal embeddings based on DBpedia and Wikidata for about 180 million entities, 15 thousand relations, and 1.2 billion triples.
arXiv Detail & Related papers (2023-10-23T13:07:46Z)
- Counterfactual Learning on Graphs: A Survey [34.47646823407408]
Graph neural networks (GNNs) have achieved great success in representation learning on graphs.
Counterfactual learning on graphs has shown promising results in alleviating these drawbacks.
Various approaches have been proposed for counterfactual fairness, explainability, link prediction and other applications on graphs.
arXiv Detail & Related papers (2023-04-03T21:42:42Z)
- The Semantic Scholar Open Data Platform [79.4493235243312]
Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.
We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction.
The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.
arXiv Detail & Related papers (2023-01-24T17:13:08Z)
- A Survey of Knowledge Graph Reasoning on Graph Types: Static, Dynamic, and Multimodal [57.8455911689554]
Knowledge graph reasoning (KGR) aims to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs).
It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering and recommendation systems.
arXiv Detail & Related papers (2022-12-12T08:40:04Z)
- Scientific Paper Extractive Summarization Enhanced by Citation Graphs [50.19266650000948]
We focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.
Preliminary results demonstrate that the citation graph is helpful even in a simple unsupervised framework.
Motivated by this, we propose a Graph-based Supervised Summarization model (GSS) to achieve more accurate results on the task when large-scale labeled data are available.
arXiv Detail & Related papers (2022-12-08T11:53:12Z)
- A Survey of Deep Graph Clustering: Taxonomy, Challenge, Application, and Open Resource [87.7460720701592]
This paper introduces the formulaic definition, evaluation, and development of this field.
The taxonomy of deep graph clustering methods is presented based on four different criteria, including graph type, network architecture, learning paradigm, and clustering method.
The applications of deep graph clustering methods in six domains, including computer vision, natural language processing, recommendation systems, social network analyses, bioinformatics, and medical science, are presented.
arXiv Detail & Related papers (2022-11-23T11:31:11Z)
- DIG: A Turnkey Library for Diving into Graph Deep Learning Research [39.58666190541479]
DIG: Dive into Graphs is a research-oriented library that integrates unified implementations of common graph deep learning algorithms for several advanced tasks.
For each direction, we provide unified implementations of data interfaces, common algorithms, and evaluation metrics.
arXiv Detail & Related papers (2021-03-23T15:05:10Z)
- GraphGen: A Scalable Approach to Domain-agnostic Labeled Graph Generation [5.560715621814096]
Graph generative models have been extensively studied in the data mining literature.
Recent techniques have shifted towards learning this distribution directly from the data.
In this work, we develop a domain-agnostic technique called GraphGen to overcome all of these limitations.
arXiv Detail & Related papers (2020-01-22T18:07:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.