AutoRDF2GML: Facilitating RDF Integration in Graph Machine Learning
- URL: http://arxiv.org/abs/2407.18735v1
- Date: Fri, 26 Jul 2024 13:44:06 GMT
- Title: AutoRDF2GML: Facilitating RDF Integration in Graph Machine Learning
- Authors: Michael Färber, David Lamprecht, Yuni Susanti
- Abstract summary: AutoRDF2GML is a framework designed to convert RDF data into data representations tailored for graph machine learning tasks.
We present four new benchmark datasets for graph machine learning, created from large RDF knowledge graphs.
- Score: 9.408189129889006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce AutoRDF2GML, a framework designed to convert RDF data into data representations tailored for graph machine learning tasks. AutoRDF2GML enables, for the first time, the creation of both content-based features -- i.e., features based on RDF datatype properties -- and topology-based features -- i.e., features based on RDF object properties. Characterized by automated feature extraction, AutoRDF2GML makes it possible even for users less familiar with RDF and SPARQL to generate data representations ready for graph machine learning tasks, such as link prediction, node classification, and graph classification. Furthermore, we present four new benchmark datasets for graph machine learning, created from large RDF knowledge graphs using our framework. These datasets serve as valuable resources for evaluating graph machine learning approaches, such as graph neural networks. Overall, our framework effectively bridges the gap between the Graph Machine Learning and Semantic Web communities, paving the way for RDF-based machine learning applications.
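The distinction the abstract draws between content-based features (from datatype properties, whose objects are literals) and topology-based features (from object properties, whose objects are other resources) can be illustrated with a minimal sketch. This is not AutoRDF2GML's API: the `Literal` class, the `http://example.org/` namespace, the toy triples, and the helper names `split_triples` and `index_nodes` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal value."""
    value: object


EX = "http://example.org/"  # assumed example namespace

# Toy triples (subject, predicate, object); invented for illustration only.
TRIPLES = [
    (EX + "paper1", EX + "title", Literal("Graph learning on RDF")),
    (EX + "paper1", EX + "year", Literal(2024)),
    (EX + "paper1", EX + "hasAuthor", EX + "author1"),
    (EX + "paper2", EX + "year", Literal(2023)),
    (EX + "paper2", EX + "cites", EX + "paper1"),
]


def split_triples(triples):
    """Separate literal-valued triples (content) from resource-valued ones (topology)."""
    content = {}  # node -> {predicate: literal value}
    edges = []    # (subject, predicate, object) with a resource as object
    for s, p, o in triples:
        if isinstance(o, Literal):
            content.setdefault(s, {})[p] = o.value
        else:
            edges.append((s, p, o))
    return content, edges


def index_nodes(content, edges):
    """Assign a stable integer id to every resource that appears as a node."""
    nodes = set(content)
    for s, _, o in edges:
        nodes.update((s, o))
    return {node: i for i, node in enumerate(sorted(nodes))}


content, edges = split_triples(TRIPLES)
node_index = index_nodes(content, edges)
# Integer edge list, the usual input shape for graph learning libraries.
edge_index = [(node_index[s], node_index[o]) for s, _, o in edges]
```

In this sketch, `content` would feed a node feature matrix after encoding (for example, text embeddings for titles and scaling for years), while `edge_index` carries the graph structure.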
Related papers
- RGL: A Graph-Centric, Modular Framework for Efficient Retrieval-Augmented Generation on Graphs [58.10503898336799]
We introduce the RAG-on-Graphs Library (RGL), a modular framework that seamlessly integrates the complete RAG pipeline.
RGL addresses key challenges by supporting a variety of graph formats and integrating optimized implementations for essential components.
Our evaluations demonstrate that RGL not only accelerates the prototyping process but also enhances the performance and applicability of graph-based RAG systems.
arXiv Detail & Related papers (2025-03-25T03:21:48Z) - RAGraph: A General Retrieval-Augmented Graph Learning Framework [35.25522856244149]
We introduce a novel framework called General Retrieval-Augmented Graph Learning (RAGraph).
RAGraph brings external graph data into the general graph foundation model to improve model generalization on unseen scenarios.
During inference, RAGraph retrieves similar toy graphs based on key similarities in downstream tasks.
arXiv Detail & Related papers (2024-10-31T12:05:21Z) - RDFGraphGen: An RDF Graph Generator based on SHACL Shapes [2.7213277957181328]
We propose RDFGraphGen, an open-source RDF graph generator that uses characteristics provided in the form of SHACL shapes to generate synthetic RDF graphs. RDFGraphGen is domain-agnostic, with graph structure, value constraints, and distributions. Our results show that RDFGraphGen is scalable and can generate small, medium, and large RDF graphs in any domain.
arXiv Detail & Related papers (2024-07-25T10:58:50Z) - A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z) - Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering [52.09178018466104]
We introduce Context-Aware Automated Feature Engineering (CAAFE) to generate semantically meaningful features for datasets.
Despite being methodologically simple, CAAFE improves performance on 11 out of 14 datasets.
We highlight the significance of context-aware solutions that can extend the scope of AutoML systems to semantic AutoML.
arXiv Detail & Related papers (2023-05-05T09:58:40Z) - Skip Vectors for RDF Data: Extraction Based on the Complexity of Feature Patterns [0.0]
The Resource Description Framework (RDF) is a framework for describing metadata, such as attributes and relationships of resources on the Web.
We propose a novel feature vector (called a Skip vector) that represents some features of each resource in an RDF graph by extracting various combinations of neighboring edges and nodes.
The classification tasks can be performed by applying the low-dimensional Skip vector of each resource to conventional machine learning algorithms, such as SVMs, the k-nearest neighbors method, neural networks, random forests, and AdaBoost.
arXiv Detail & Related papers (2022-01-06T10:07:49Z) - Automated Graph Machine Learning: Approaches, Libraries, Benchmarks and Directions [58.220137936626315]
This paper extensively discusses automated graph machine learning approaches.
We introduce AutoGL, our dedicated library and the world's first open-source library for automated graph machine learning.
Also, we describe a tailored benchmark that supports unified, reproducible, and efficient evaluations.
arXiv Detail & Related papers (2022-01-04T18:31:31Z) - A Scalable AutoML Approach Based on Graph Neural Networks [4.723269144709768]
KGpip is designed as a sub-component for AutoML systems.
We demonstrate this ability by integrating KGpip with two AutoML systems and show that it significantly enhances the performance of existing state-of-the-art systems.
arXiv Detail & Related papers (2021-10-29T20:55:13Z) - AutoGL: A Library for Automated Graph Learning [67.63587865669372]
We present Automated Graph Learning (AutoGL), the first dedicated library for automated machine learning on graphs.
AutoGL is open-source, easy to use, and easy to extend.
We also present AutoGL-light, a lightweight version of AutoGL to facilitate customizing pipelines and enriching applications.
arXiv Detail & Related papers (2021-04-11T10:49:23Z) - Automated Machine Learning on Graphs: A Survey [81.21692888288658]
This paper is the first systematic and comprehensive review of automated machine learning on graphs.
We focus on hyperparameter optimization (HPO) and neural architecture search (NAS) for graph machine learning.
In the end, we share our insights on future research directions for automated graph machine learning.
arXiv Detail & Related papers (2021-03-01T04:20:33Z) - Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation [56.73834525802723]
Lightweight Dynamic Graph Convolutional Networks (LDGCNs) are proposed.
LDGCNs capture richer non-local interactions by synthesizing higher order information from the input graphs.
We develop two novel parameter saving strategies based on the group graph convolutions and weight tied convolutions to reduce memory usage and model complexity.
arXiv Detail & Related papers (2020-10-09T06:03:46Z) - A Novel Approach for Generating SPARQL Queries from RDF Graphs [0.0]
This work is done as part of a research master's thesis project.
The goal is to generate SPARQL queries based on user-supplied keywords to query RDF graphs.
arXiv Detail & Related papers (2020-05-30T18:28:49Z) - RDFFrames: Knowledge Graph Access for Machine Learning Tools [6.50725902438059]
Machine learning tools for knowledge graphs do not use SPARQL, despite the obvious advantages of using a database system.
This is due to the mismatch between SPARQL and machine learning tools in terms of data model and programming style.
In this paper, we present RDFFrames, a framework that provides an interface to knowledge graphs from a machine learning software stack.
arXiv Detail & Related papers (2020-02-10T09:39:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.