ORKG-Leaderboards: A Systematic Workflow for Mining Leaderboards as a
Knowledge Graph
- URL: http://arxiv.org/abs/2305.11068v1
- Date: Wed, 10 May 2023 13:19:18 GMT
- Title: ORKG-Leaderboards: A Systematic Workflow for Mining Leaderboards as a
Knowledge Graph
- Authors: Salomon Kabongo, Jennifer D'Souza and Sören Auer
- Abstract summary: Orkg-Leaderboard is designed to extract leaderboards from large collections of empirical research papers in Artificial Intelligence (AI).
The system is integrated with the Open Research Knowledge Graph (ORKG) platform, which fosters the machine-actionable publishing of findings.
Our best model performs above 90% F1 on the leaderboard extraction task, thus proving Orkg-Leaderboards a practically viable tool for real-world usage.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The purpose of this work is to describe the Orkg-Leaderboard software
designed to extract leaderboards defined as Task-Dataset-Metric tuples
automatically from large collections of empirical research papers in Artificial
Intelligence (AI). The software can support both the main workflows of
scholarly publishing, viz. as LaTeX files or as PDF files. Furthermore, the
system is integrated with the Open Research Knowledge Graph (ORKG) platform,
which fosters the machine-actionable publishing of scholarly findings. Thus the
system output, when integrated within the ORKG's supported Semantic Web
infrastructure of representing machine-actionable 'resources' on the Web,
enables: 1) broadly, the integration of empirical results of researchers across
the world, thus enabling transparency in empirical research with the potential
to also be complete, contingent on the underlying data source(s) of
publications; and 2) specifically, enables researchers to track the progress in
AI with an overview of the state-of-the-art (SOTA) across the most common AI
tasks and their corresponding datasets via dynamic ORKG frontend views
leveraging tables and visualization charts over the machine-actionable data.
Our best model achieves performances above 90% F1 on the leaderboard
extraction task, thus proving Orkg-Leaderboards a practically viable tool for
real-world usage. Going forward, in a sense, Orkg-Leaderboards transforms the
leaderboard extraction task to an automated digitalization task, which has
been, for a long time in the community, a crowdsourced endeavor.
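As a rough illustration of the Task-Dataset-Metric tuples and the F1 measure mentioned in the abstract, here is a minimal sketch. The class name `TDM`, the helper `precision_recall_f1`, and the example labels are hypothetical and are not taken from the Orkg-Leaderboards code; the paper's actual evaluation protocol (for example, micro versus macro averaging) may differ.

```python
from typing import NamedTuple, Set, Tuple


class TDM(NamedTuple):
    """A leaderboard entry: (Task, Dataset, Metric). Hypothetical type for illustration."""
    task: str
    dataset: str
    metric: str


def precision_recall_f1(predicted: Set[TDM], gold: Set[TDM]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 over exact tuple matches."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative labels only; not data from the paper.
gold = {
    TDM("Question Answering", "SQuAD", "F1"),
    TDM("Question Answering", "SQuAD", "EM"),
}
predicted = {
    TDM("Question Answering", "SQuAD", "F1"),
    TDM("Question Answering", "SQuAD", "Accuracy"),
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Treating each paper's leaderboards as a set of tuples makes exact-match scoring a simple set intersection; a real system would likely also need to normalize label spellings before comparing.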
Related papers
- Capturing and Anticipating User Intents in Data Analytics via Knowledge Graphs [0.061446808540639365]
This work explores the usage of Knowledge Graphs (KG) as a basic framework for capturing complex analytics in a human-centered manner.
The data stored in the generated KG can then be exploited to provide assistance (e.g., recommendations) to the users interacting with these systems.
arXiv Detail & Related papers (2024-11-01T20:45:23Z)
- EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data [15.801018643716437]
This paper aims to enhance the GUI understanding and interacting capabilities of large vision-language models (LVLMs) through a data-driven approach.
We propose EDGE, a general data synthesis framework that automatically generates large-scale, multi-granularity training data from webpages across the Web.
Our approach significantly reduces the dependence on manual annotations, empowering researchers to harness the vast public resources available on the Web to advance their work.
arXiv Detail & Related papers (2024-10-25T10:46:17Z)
- Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph [1.7418328181959968]
The proposed research aims to develop an innovative semantic query processing system.
It enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University.
arXiv Detail & Related papers (2024-05-24T09:19:45Z)
- Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models [53.09723678623779]
We propose TAGREAL to automatically generate quality query prompts and retrieve support information from large text corpora.
The results show that TAGREAL achieves state-of-the-art performance on two benchmark datasets.
We find that TAGREAL has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods.
arXiv Detail & Related papers (2023-05-24T22:09:35Z)
- Scientific Paper Extractive Summarization Enhanced by Citation Graphs [50.19266650000948]
We focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.
Preliminary results demonstrate that citation graph is helpful even in a simple unsupervised framework.
Motivated by this, we propose a Graph-based Supervised Summarization model (GSS) to achieve more accurate results on the task when large-scale labeled data are available.
arXiv Detail & Related papers (2022-12-08T11:53:12Z)
- Deep learning for table detection and structure recognition: A survey [49.09628624903334]
The goal of this survey is to provide a profound comprehension of the major developments in the field of Table Detection.
We provide an analysis of both classic and new applications in the field.
The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
arXiv Detail & Related papers (2022-11-15T19:42:27Z)
- MONAI Label: A framework for AI-assisted Interactive Labeling of 3D Medical Images [49.664220687980006]
The lack of annotated datasets is a major bottleneck for training new task-specific supervised machine learning models.
We present MONAI Label, a free and open-source framework that facilitates the development of applications based on artificial intelligence (AI) models.
arXiv Detail & Related papers (2022-03-23T12:33:11Z)
- Automated Graph Machine Learning: Approaches, Libraries, Benchmarks and Directions [58.220137936626315]
This paper extensively discusses automated graph machine learning approaches.
We introduce AutoGL, our dedicated and the world's first open-source library for automated graph machine learning.
Also, we describe a tailored benchmark that supports unified, reproducible, and efficient evaluations.
arXiv Detail & Related papers (2022-01-04T18:31:31Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Automated Mining of Leaderboards for Empirical AI Research [0.0]
This study presents a comprehensive approach for generating Leaderboards for knowledge-graph-based scholarly information organization.
Specifically, we investigate the problem of automated Leaderboard construction using state-of-the-art transformer models, viz. BERT, SciBERT, and XLNet.
As a result, a vast share of empirical AI research can be organized in the next-generation digital libraries as knowledge graphs.
arXiv Detail & Related papers (2021-08-31T10:00:52Z)
- Cardea: An Open Automated Machine Learning Framework for Electronic Health Records [11.170152156043336]
Cardea is an open-source automated machine learning framework.
It allows users to build predictive models with their own data.
We demonstrate our framework via 5 prediction tasks on MIMIC-III and Kaggle datasets.
arXiv Detail & Related papers (2020-10-01T15:58:13Z)