Automated Mining of Leaderboards for Empirical AI Research
- URL: http://arxiv.org/abs/2109.13089v1
- Date: Tue, 31 Aug 2021 10:00:52 GMT
- Title: Automated Mining of Leaderboards for Empirical AI Research
- Authors: Salomon Kabongo, Jennifer D'Souza, and Sören Auer
- Abstract summary: This study presents a comprehensive approach for generating Leaderboards for knowledge-graph-based scholarly information organization.
Specifically, we investigate the problem of automated Leaderboard construction using state-of-the-art transformer models, viz. Bert, SciBert, and XLNet.
As a result, a vast share of empirical AI research can be organized in the next-generation digital libraries as knowledge graphs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rapid growth of research publications, empowering scientists to keep
oversight over the scientific progress is of paramount importance. In this
regard, the Leaderboards facet of information organization provides an overview
on the state-of-the-art by aggregating empirical results from various studies
addressing the same research challenge. Crowdsourcing efforts like
PapersWithCode among others are devoted to the construction of Leaderboards
predominantly for various subdomains in Artificial Intelligence. Leaderboards
provide machine-readable scholarly knowledge that has proven to be directly
useful for scientists to keep track of research progress. The construction of
Leaderboards could be greatly expedited with automated text mining.
This study presents a comprehensive approach for generating Leaderboards for
knowledge-graph-based scholarly information organization. Specifically, we
investigate the problem of automated Leaderboard construction using
state-of-the-art transformer models, viz. Bert, SciBert, and XLNet. Our
analysis reveals an optimal approach that significantly outperforms existing
baselines for the task with evaluation scores above 90% in F1. This, in turn,
offers new state-of-the-art results for Leaderboard extraction. As a result, a
vast share of empirical AI research can be organized in the next-generation
digital libraries as knowledge graphs.
Related papers
- O1 Replication Journey: A Strategic Progress Report -- Part 1 [52.062216849476776]
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey.
Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects.
We propose the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process.
arXiv Detail & Related papers (2024-10-08T15:13:01Z)
- Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored.
We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches.
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
- Efficient Performance Tracking: Leveraging Large Language Models for Automated Construction of Scientific Leaderboards [67.65408769829524]
Scientific leaderboards are standardized ranking systems that facilitate evaluating and comparing competitive methods.
The exponential increase in publications has made it infeasible to construct and maintain these leaderboards manually.
Automatic leaderboard construction has emerged as a solution to reduce manual labor.
arXiv Detail & Related papers (2024-09-19T11:12:27Z)
- Generative AI in Evidence-Based Software Engineering: A White Paper [10.489725182789885]
In less than a year, practitioners and researchers witnessed a rapid and wide implementation of Generative Artificial Intelligence (GAI).
Textual GAI capabilities enable researchers worldwide to explore new generative scenarios, simplifying and hastening all time-consuming text generation and analysis tasks.
Based on our current investigation, we will follow up the vision with the creation and empirical validation of a comprehensive suite of models to effectively support EBSE researchers.
arXiv Detail & Related papers (2024-07-24T17:16:17Z)
- MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows [58.56005277371235]
We introduce MASSW, a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows.
MASSW includes more than 152,000 peer-reviewed publications from 17 leading computer science conferences spanning the past 50 years.
We demonstrate the utility of MASSW through multiple novel machine-learning tasks that can be benchmarked using this new dataset.
arXiv Detail & Related papers (2024-06-10T15:19:09Z)
- Autonomous LLM-driven research from data to human-verifiable research papers [0.0]
We build an automation platform that guides interacting LLM agents through a complete stepwise process.
In a mode where it is provided with annotated data alone, datapaper raised hypotheses, designed plans, wrote and interpreted analysis code, and generated and interpreted results.
We demonstrate potential for AI-driven acceleration of scientific discovery while enhancing traceability, transparency and verifiability.
arXiv Detail & Related papers (2024-04-24T23:15:49Z)
- A Bibliographic Study on Artificial Intelligence Research: Global Panorama and Indian Appearance [2.9895330439073406]
The study reveals that neural networks and deep learning are the major topics included in top AI research publications.
The study also investigates the relative position of Indian researchers in terms of AI research.
arXiv Detail & Related papers (2023-07-04T05:08:36Z)
- ORKG-Leaderboards: A Systematic Workflow for Mining Leaderboards as a Knowledge Graph [0.0]
Orkg-Leaderboard is designed to extract leaderboards from large collections of empirical research papers in Artificial Intelligence (AI).
The system is integrated with the Open Research Knowledge Graph (ORKG) platform, which fosters the machine-actionable publishing of findings.
Our best model performs above 90% F1 on the leaderboard extraction task, thus proving Orkg-Leaderboards a practically viable tool for real-world usage.
arXiv Detail & Related papers (2023-05-10T13:19:18Z)
- Citation Trajectory Prediction via Publication Influence Representation Using Temporal Knowledge Graph [52.07771598974385]
Existing approaches mainly rely on mining temporal and graph data from academic articles.
Our framework is composed of three modules: difference-preserved graph embedding, fine-grained influence representation, and learning-based trajectory calculation.
Experiments are conducted on both the APS academic dataset and our contributed AIPatent dataset.
arXiv Detail & Related papers (2022-10-02T07:43:26Z)
- Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain [1.9004296236396943]
We present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications.
Within this research work, we tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools.
We generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain.
arXiv Detail & Related papers (2020-10-28T08:31:40Z)
- Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned [88.42878484408469]
We present the Neural Covidex, a search engine that exploits the latest neural ranking architectures.
This paper describes our initial efforts and offers a few thoughts about lessons we have learned along the way.
arXiv Detail & Related papers (2020-04-10T17:12:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.