RDFFrames: Knowledge Graph Access for Machine Learning Tools
- URL: http://arxiv.org/abs/2002.03614v4
- Date: Mon, 6 Sep 2021 04:39:26 GMT
- Title: RDFFrames: Knowledge Graph Access for Machine Learning Tools
- Authors: Aisha Mohamed, Ghadeer Abuoda, Abdurrahman Ghanem, Zoi Kaoudi, Ashraf Aboulnaga
- Abstract summary: Machine learning tools for knowledge graphs do not use SPARQL, despite the obvious advantages of using a database system.
This is due to the mismatch between SPARQL and machine learning tools in terms of data model and programming style.
In this paper, we present RDFFrames, a framework that provides an interface to knowledge graphs from a machine learning software stack.
- Score: 6.50725902438059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge graphs represented as RDF datasets are integral to many machine
learning applications. RDF is supported by a rich ecosystem of data management
systems and tools, most notably RDF database systems that provide a SPARQL
query interface. Surprisingly, machine learning tools for knowledge graphs do
not use SPARQL, despite the obvious advantages of using a database system. This
is due to the mismatch between SPARQL and machine learning tools in terms of
data model and programming style. Machine learning tools work on data in
tabular format and process it using an imperative programming style, while
SPARQL is declarative and has as its basic operation matching graph patterns to
RDF triples. We posit that a good interface to knowledge graphs from a machine
learning software stack should use an imperative, navigational programming
paradigm based on graph traversal rather than the SPARQL query paradigm based
on graph patterns. In this paper, we present RDFFrames, a framework that
provides such an interface. RDFFrames provides an imperative Python API that
gets internally translated to SPARQL, and it is integrated with the PyData
machine learning software stack. RDFFrames enables the user to make a sequence
of Python calls to define the data to be extracted from a knowledge graph
stored in an RDF database system, and it translates these calls into a compact
SPARQL query, executes it on the database system, and returns the results in a
standard tabular format. Thus, RDFFrames is a useful tool for data preparation
that combines the usability of PyData with the flexibility and performance of
RDF database systems.
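
The translation step described in the abstract (a chain of imperative Python calls compiled into one SPARQL query) can be illustrated with a toy sketch. The `ToyFrame` class below and its `entities`, `expand`, `filter`, and `to_sparql` methods are hypothetical names invented for this illustration; they are not the RDFFrames API, and the sketch omits prefix declarations, query execution, and conversion of results to a table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical illustration only; NOT the RDFFrames API.

    Each call returns a new frame that records one more step of the
    navigation. Nothing is sent to a database here.
    """
    columns: Tuple[str, ...] = ()
    patterns: Tuple[str, ...] = ()
    filters: Tuple[str, ...] = ()

    @classmethod
    def entities(cls, class_uri: str, col: str) -> "ToyFrame":
        # Seed the frame with all instances of a class.
        return cls(columns=(col,), patterns=(f"?{col} a {class_uri} .",))

    def expand(self, src: str, predicate: str, dst: str) -> "ToyFrame":
        # Navigate from an existing column along a predicate to a new column.
        pattern = f"?{src} {predicate} ?{dst} ."
        return ToyFrame(self.columns + (dst,), self.patterns + (pattern,), self.filters)

    def filter(self, condition: str) -> "ToyFrame":
        # Record a condition that becomes a SPARQL FILTER clause.
        return ToyFrame(self.columns, self.patterns, self.filters + (condition,))

    def to_sparql(self) -> str:
        # Compile all recorded steps into a single SELECT query string.
        select = " ".join(f"?{c}" for c in self.columns)
        lines = list(self.patterns) + [f"FILTER ({c})" for c in self.filters]
        body = "\n".join(f"  {line}" for line in lines)
        return f"SELECT {select}\nWHERE {{\n{body}\n}}"


# Assumed DBpedia-style prefixed names, used purely as example data.
frame = (
    ToyFrame.entities("dbo:Film", "movie")
    .expand("movie", "dbo:starring", "actor")
    .expand("actor", "dbo:birthPlace", "country")
    .filter("?country = dbr:United_States")
)
query = frame.to_sparql()
print(query)
```

In this sketch the three navigation calls and the filter collapse into one query with a single `WHERE` block, which is the kind of compaction the abstract attributes to RDFFrames. A real system would then run the query against a SPARQL endpoint and load the bindings into a tabular structure.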
Related papers
- Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
- ToolACE: Winning the Points of LLM Function Calling [139.07157814653638]
ToolACE is an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data.
We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard.
arXiv Detail & Related papers (2024-09-02T03:19:56Z)
- AutoRDF2GML: Facilitating RDF Integration in Graph Machine Learning [9.408189129889006]
AutoRDF2GML is a framework designed to convert RDF data into data representations tailored for graph machine learning tasks.
We present four new benchmark datasets for graph machine learning, created from large RDF knowledge graphs.
arXiv Detail & Related papers (2024-07-26T13:44:06Z)
- PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning [54.912520425218496]
We present PyTorch Frame, a PyTorch-based framework for deep learning over multi-modal tabular data.
We demonstrate the usefulness of PyTorch Frame by implementing diverse models in a modular way.
We integrate PyTorch Frame with PyTorch Geometric, a PyTorch library for Graph Neural Networks (GNNs), to perform end-to-end learning over relational databases.
arXiv Detail & Related papers (2024-03-31T19:15:09Z)
- Relational Deep Learning: Graph Representation Learning on Relational Databases [69.7008152388055]
We introduce an end-to-end representation approach to learn on data laid out across multiple tables.
Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all data input.
arXiv Detail & Related papers (2023-12-07T18:51:41Z)
- Linked Papers With Code: The Latest in Machine Learning as an RDF Knowledge Graph [1.450405446885067]
We introduce Linked Papers With Code, an RDF knowledge graph that provides comprehensive, current information about almost 400,000 machine learning publications.
Compared to its non-RDF-based counterpart Papers With Code, LPWC translates the latest advancements in machine learning into RDF format.
As a knowledge graph in the Linked Open Data cloud, we offer LPWC in multiple formats from RDF dump files to SPARQL endpoint for direct web queries.
arXiv Detail & Related papers (2023-10-31T14:09:15Z)
- Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering [52.09178018466104]
We introduce Context-Aware Automated Feature Engineering (CAAFE) to generate semantically meaningful features for datasets.
Despite being methodologically simple, CAAFE improves performance on 11 out of 14 datasets.
We highlight the significance of context-aware solutions that can extend the scope of AutoML systems to semantic AutoML.
arXiv Detail & Related papers (2023-05-05T09:58:40Z)
- Expressive Reasoning Graph Store: A Unified Framework for Managing RDF and Property Graph Databases [9.021529689292985]
We present Expressive Reasoning Graph Store (ERGS).
ERGS is a graph store built on top of JanusGraph that also allows storing and querying of RDF datasets.
We describe how RDF data can be translated into a Property Graph representation and then describe a query translation module that converts SPARQL queries into a series of Gremlin traversals.
arXiv Detail & Related papers (2022-09-13T09:07:50Z)
- Skip Vectors for RDF Data: Extraction Based on the Complexity of Feature Patterns [0.0]
The Resource Description Framework (RDF) is a framework for describing metadata, such as attributes and relationships of resources on the Web.
We propose a novel feature vector (called a Skip vector) that represents some features of each resource in an RDF graph by extracting various combinations of neighboring edges and nodes.
The classification tasks can be performed by applying the low-dimensional Skip vector of each resource to conventional machine learning algorithms, such as SVMs, the k-nearest neighbors method, neural networks, random forests, and AdaBoost.
arXiv Detail & Related papers (2022-01-06T10:07:49Z)
- A Novel Approach for Generating SPARQL Queries from RDF Graphs [0.0]
This work is done as part of a research master's thesis project.
The goal is to generate SPARQL queries based on user-supplied keywords to query RDF graphs.
arXiv Detail & Related papers (2020-05-30T18:28:49Z)
- Multi-layer Optimizations for End-to-End Data Analytics [71.05611866288196]
We introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative approach.
IFAQ treats the feature extraction query and the learning task as one program given in the IFAQ's domain-specific language.
We show that a Scala implementation of IFAQ can outperform mlpack, Scikit, and specialization by several orders of magnitude for linear regression and regression tree models over several relational datasets.
arXiv Detail & Related papers (2020-01-10T16:14:44Z)