Linked Papers With Code: The Latest in Machine Learning as an RDF
Knowledge Graph
- URL: http://arxiv.org/abs/2310.20475v1
- Date: Tue, 31 Oct 2023 14:09:15 GMT
- Title: Linked Papers With Code: The Latest in Machine Learning as an RDF
Knowledge Graph
- Authors: Michael Färber, David Lamprecht
- Abstract summary: We introduce Linked Papers With Code, an RDF knowledge graph that provides comprehensive, current information about almost 400,000 machine learning publications.
Compared to its non-RDF-based counterpart Papers With Code, LPWC translates the latest advancements in machine learning into RDF format.
As a knowledge graph in the Linked Open Data cloud, we offer LPWC in multiple formats, from RDF dump files to a SPARQL endpoint for direct web queries.
- Score: 1.450405446885067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce Linked Papers With Code (LPWC), an RDF knowledge
graph that provides comprehensive, current information about almost 400,000
machine learning publications. This includes the tasks addressed, the datasets
utilized, the methods implemented, and the evaluations conducted, along with
their results. Compared to its non-RDF-based counterpart Papers With Code, LPWC
not only translates the latest advancements in machine learning into RDF
format, but also enables novel ways for scientific impact quantification and
scholarly key content recommendation. LPWC is openly accessible at
https://linkedpaperswithcode.com and is licensed under CC-BY-SA 4.0. As a
knowledge graph in the Linked Open Data cloud, we offer LPWC in multiple
formats, from RDF dump files to a SPARQL endpoint for direct web queries, as
well as a data source with resolvable URIs and links to the data sources
SemOpenAlex, Wikidata, and DBLP. Additionally, we supply knowledge graph
embeddings, enabling LPWC to be readily applied in machine learning
applications.
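A minimal sketch of how such a SPARQL endpoint could be queried from Python, using only the standard library and the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter and JSON results). The endpoint path `/sparql` is an assumption, since the abstract only gives the site address, and the query is deliberately schema-agnostic because the LPWC vocabulary is not described here. The helper names are illustrative, not part of LPWC.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint URL: the abstract only names the site, so the
# "/sparql" path is an assumption to be checked against the site itself.
ENDPOINT = "https://linkedpaperswithcode.com/sparql"

# Schema-agnostic query: list a few distinct predicates, because the
# LPWC vocabulary is not described in the abstract.
QUERY = "SELECT DISTINCT ?p WHERE { ?s ?p ?o } LIMIT 10"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON result document into plain dict rows."""
    doc = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def run_query(endpoint: str, query: str) -> list[dict[str, str]]:
    """Send the query over the network and return the parsed rows."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Calling `run_query(ENDPOINT, QUERY)` would perform the actual request. Nothing is sent at import time, so the parsing logic can be exercised offline against a saved result document.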
Related papers
- MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs [54.5729817345543]
MOLE is a framework that automatically extracts metadata attributes from scientific papers covering datasets of languages other than Arabic.
Our methodology processes entire documents across multiple input formats and incorporates robust validation mechanisms for consistent output.
arXiv Detail & Related papers (2025-05-26T10:31:26Z)
- GeAR: Generation Augmented Retrieval [82.20696567697016]
This paper introduces a novel method, Generation Augmented Retrieval (GeAR).
It not only improves the global document-query similarity through contrastive learning, but also integrates well-designed fusion and decoding modules.
When used as a retriever, GeAR does not incur any additional computational cost over bi-encoders.
arXiv Detail & Related papers (2025-01-06T05:29:00Z)
- CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era [4.369550829556578]
We introduce CypherBench, the first benchmark with 11 large-scale, multi-domain property graphs with 7.8 million entities and over 10,000 questions.
We propose property graph views on top of the underlying RDF graph that can be efficiently queried by LLMs using Cypher.
arXiv Detail & Related papers (2024-12-24T23:22:04Z)
- DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models [66.91204604417912]
This study aims to enhance generalizability of small VDU models by distilling knowledge from LLMs.
We present a new framework (called DocKD) that enriches the data generation process by integrating external document knowledge.
Experiments show that DocKD produces high-quality document annotations and surpasses the direct knowledge distillation approach.
arXiv Detail & Related papers (2024-10-04T00:53:32Z)
- RDFGraphGen: A Synthetic RDF Graph Generator based on SHACL Constraints [0.0]
This paper introduces RDFGraphGen, a domain-independent generator of synthetic RDF graphs based on SHACL constraints.
The purpose of RDFGraphGen is the generation of small, medium or large RDF knowledge graphs for the purpose of benchmarking, testing, quality control, training and other similar purposes.
arXiv Detail & Related papers (2024-07-25T10:58:50Z)
- KnowledgeHub: An end-to-end Tool for Assisted Scientific Discovery [1.6080795642111267]
This paper describes the KnowledgeHub tool, a scientific literature Information Extraction (IE) and Question Answering (QA) pipeline.
This is achieved by supporting the ingestion of PDF documents that are converted to text and structured representations.
A browser-based annotation tool enables annotating the contents of the PDF documents according to the ontology.
A knowledge graph is constructed from these entity and relation triples which can be queried to obtain insights from the data.
arXiv Detail & Related papers (2024-05-16T13:17:14Z)
- RAFT: Adapting Language Model to Domain Specific RAG [75.63623523051491]
We present Retrieval Augmented FineTuning (RAFT), a training recipe that improves the model's ability to answer questions in "open-book" in-domain settings.
RAFT accomplishes this by citing the verbatim right sequence from the relevant document that would help answer the question.
RAFT consistently improves the model's performance across PubMed, HotpotQA, and Gorilla datasets.
arXiv Detail & Related papers (2024-03-15T09:26:02Z)
- Relational Deep Learning: Graph Representation Learning on Relational
Databases [69.7008152388055]
We introduce an end-to-end representation approach to learn on data laid out across multiple tables.
Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all data input.
arXiv Detail & Related papers (2023-12-07T18:51:41Z)
- SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples [0.0]
SemOpenAlex is an extensive RDF knowledge graph that contains over 26 billion triples about scientific publications and their associated entities.
We offer the data through multiple channels, including RDF dump files, a SPARQL endpoint, and as a data source in the Linked Open Data cloud.
arXiv Detail & Related papers (2023-08-07T15:46:39Z)
- Deep learning for table detection and structure recognition: A survey [49.09628624903334]
The goal of this survey is to provide a profound comprehension of the major developments in the field of Table Detection.
We provide an analysis of both classic and new applications in the field.
The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
arXiv Detail & Related papers (2022-11-15T19:42:27Z)
- Generate rather than Retrieve: Large Language Models are Strong Context
Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
- Skip Vectors for RDF Data: Extraction Based on the Complexity of Feature
Patterns [0.0]
The Resource Description Framework (RDF) is a framework for describing metadata, such as attributes and relationships of resources on the Web.
We propose a novel feature vector (called a Skip vector) that represents some features of each resource in an RDF graph by extracting various combinations of neighboring edges and nodes.
The classification tasks can be performed by applying the low-dimensional Skip vector of each resource to conventional machine learning algorithms, such as SVMs, the k-nearest neighbors method, neural networks, random forests, and AdaBoost.
arXiv Detail & Related papers (2022-01-06T10:07:49Z)
- Open Domain Question Answering over Virtual Documents: A Unified
Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means for encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
- RDFFrames: Knowledge Graph Access for Machine Learning Tools [6.50725902438059]
Machine learning tools for knowledge graphs do not use SPARQL, despite the obvious advantages of using a database system.
This is due to the mismatch between SPARQL and machine learning tools in terms of data model and programming style.
In this paper, we present RDFFrames, a framework that provides an interface to knowledge graphs from a machine learning software stack.
arXiv Detail & Related papers (2020-02-10T09:39:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.