Related papers: Automated Mining of Leaderboards for Empirical AI Research

URL: http://arxiv.org/abs/2109.13089v1
Date: Tue, 31 Aug 2021 10:00:52 GMT
Title: Automated Mining of Leaderboards for Empirical AI Research
Authors: Salomon Kabongo, Jennifer D'Souza, and Sören Auer
Abstract summary: This study presents a comprehensive approach for generating Leaderboards for knowledge-graph-based scholarly information organization. Specifically, we investigate the problem of automated Leaderboard construction using state-of-the-art transformer models, viz. BERT, SciBERT, and XLNet. As a result, a vast share of empirical AI research can be organized in next-generation digital libraries as knowledge graphs.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the rapid growth of research publications, empowering scientists to maintain an overview of scientific progress is of paramount importance. In this regard, the Leaderboards facet of information organization provides an overview of the state of the art by aggregating empirical results from various studies addressing the same research challenge. Crowdsourcing efforts such as PapersWithCode are devoted to the construction of Leaderboards, predominantly for various subdomains in Artificial Intelligence. Leaderboards provide machine-readable scholarly knowledge that has proven to be directly useful for scientists to keep track of research progress. The construction of Leaderboards could be greatly expedited with automated text mining. This study presents a comprehensive approach for generating Leaderboards for knowledge-graph-based scholarly information organization. Specifically, we investigate the problem of automated Leaderboard construction using state-of-the-art transformer models, viz. BERT, SciBERT, and XLNet. Our analysis reveals an optimal approach that significantly outperforms existing baselines for the task, with evaluation scores above 90% in F1. This, in turn, offers new state-of-the-art results for Leaderboard extraction. As a result, a vast share of empirical AI research can be organized in next-generation digital libraries as knowledge graphs.

Related papers

AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents [49.67355440164857]
We introduce AIRS-Bench, a suite of 20 tasks sourced from state-of-the-art machine learning papers. AIRS-Bench tasks assess agentic capabilities over the full research lifecycle. We open-source the AIRS-Bench task definitions and evaluation code to catalyze further development in autonomous scientific research.
arXiv Detail & Related papers (2026-02-06T16:45:02Z)
Real Deep Research for AI, Robotics and Beyond [85.87181330763548]
We present Real Deep Research (RDR), a comprehensive framework applied to the domains of AI and robotics. The main paper details the construction of the RDR pipeline, while the appendix provides extensive results across each analyzed topic.
arXiv Detail & Related papers (2025-10-23T17:59:05Z)
The Budget AI Researcher and the Power of RAG Chains [4.797627592793464]
Current approaches to supporting research idea generation often rely on generic large language models (LLMs). Our framework, The Budget AI Researcher, uses retrieval-augmented generation chains, vector databases, and topic-guided pairing to recombine concepts from hundreds of machine learning papers. The system ingests papers from nine major AI conferences, which collectively span the vast subfields of machine learning, and organizes them into a hierarchical topic tree.
arXiv Detail & Related papers (2025-06-14T02:40:35Z)
A Position Paper on the Automatic Generation of Machine Learning Leaderboards [12.736094044510224]
An important task in machine learning (ML) research is comparing prior work, which is often performed via ML leaderboards. To ease this burden, researchers have developed methods to extract leaderboard entries from research papers. Yet, prior work varies in problem framing, complicating comparisons and limiting real-world applicability. We propose an ALG unified conceptual framework to standardise how the ALG task is defined.
arXiv Detail & Related papers (2025-05-23T04:46:10Z)
From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery [67.07598263346591]
Large Language Models (LLMs) are catalyzing a paradigm shift in scientific discovery. This survey systematically charts this burgeoning field, placing a central focus on the changing roles and escalating capabilities of LLMs in science.
arXiv Detail & Related papers (2025-05-19T15:41:32Z)
Graph of AI Ideas: Leveraging Knowledge Graphs and LLMs for AI Research Idea Generation [25.04071920426971]
We propose a framework called the Graph of AI Ideas (GoAI) for the AI research field, which is dominated by open-access papers. This framework organizes relevant literature into entities within a knowledge graph and summarizes the semantic information contained in citations into relations within the graph.
arXiv Detail & Related papers (2025-03-11T15:36:38Z)
CS-PaperSum: A Large-Scale Dataset of AI-Generated Summaries for Scientific Papers [3.929864777332447]
CS-PaperSum is a large-scale dataset of 91,919 papers from 31 top-tier computer science conferences. Our dataset enables automated literature analysis, research trend forecasting, and AI-driven scientific discovery.
arXiv Detail & Related papers (2025-02-27T22:48:35Z)
LAG: LLM agents for Leaderboard Auto Generation on Demanding [38.53050861010012]
Leaderboard Auto Generation (LAG) is a framework for automatic generation of leaderboards on a given research topic. Faced with a large number of AI papers updated daily, it becomes difficult for researchers to track every paper's proposed methods, experimental results, and settings. Our contributions include a comprehensive solution to the leaderboard construction problem, a reliable evaluation method, and experimental results showing the high quality of leaderboards.
arXiv Detail & Related papers (2025-02-25T13:54:03Z)
Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation [58.064940977804596]
A plethora of new AI models and tools has been proposed, promising to empower researchers and academics worldwide to conduct their research more effectively and efficiently. Ethical concerns regarding shortcomings of these tools and potential for misuse take a particularly prominent place in our discussion.
arXiv Detail & Related papers (2025-02-07T18:26:45Z)
O1 Replication Journey: A Strategic Progress Report -- Part 1 [52.062216849476776]
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey. Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects. We propose the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process.
arXiv Detail & Related papers (2024-10-08T15:13:01Z)
Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored. We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches. We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
Efficient Performance Tracking: Leveraging Large Language Models for Automated Construction of Scientific Leaderboards [67.65408769829524]
Scientific leaderboards are standardized ranking systems that facilitate evaluating and comparing competitive methods. The exponential increase in publications has made it infeasible to construct and maintain these leaderboards manually. Automatic leaderboard construction has emerged as a solution to reduce manual labor.
arXiv Detail & Related papers (2024-09-19T11:12:27Z)
Generative AI in Evidence-Based Software Engineering: A White Paper [10.489725182789885]
In less than a year, practitioners and researchers witnessed a rapid and wide implementation of Generative Artificial Intelligence (GAI). Textual GAI capabilities enable researchers worldwide to explore new generative scenarios, simplifying and hastening all time-consuming text generation and analysis tasks. Based on our current investigation, we will follow up on this vision with the creation and empirical validation of a comprehensive suite of models to effectively support EBSE researchers.
arXiv Detail & Related papers (2024-07-24T17:16:17Z)
MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows [58.56005277371235]
We introduce MASSW, a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows. MASSW includes more than 152,000 peer-reviewed publications from 17 leading computer science conferences spanning the past 50 years. We demonstrate the utility of MASSW through multiple novel machine-learning tasks that can be benchmarked using this new dataset.
arXiv Detail & Related papers (2024-06-10T15:19:09Z)
Autonomous LLM-driven research from data to human-verifiable research papers [0.0]
We build an automation platform that guides interacting LLM agents through a complete stepwise research process. Provided with annotated data alone, the platform raised hypotheses, designed plans, wrote and interpreted analysis codes, and generated and interpreted results. We demonstrate the potential for AI-driven acceleration of scientific discovery while enhancing traceability, transparency, and verifiability.
arXiv Detail & Related papers (2024-04-24T23:15:49Z)
A Bibliographic Study on Artificial Intelligence Research: Global Panorama and Indian Appearance [2.9895330439073406]
The study reveals that neural networks and deep learning are the major topics included in top AI research publications. The study also investigates the relative position of Indian researchers in terms of AI research.
arXiv Detail & Related papers (2023-07-04T05:08:36Z)
ORKG-Leaderboards: A Systematic Workflow for Mining Leaderboards as a Knowledge Graph [0.0]
ORKG-Leaderboards is designed to extract leaderboards from large collections of empirical research papers in Artificial Intelligence (AI). The system is integrated with the Open Research Knowledge Graph (ORKG) platform, which fosters the machine-actionable publishing of findings. Our best model performs above 90% F1 on the leaderboard extraction task, thus proving ORKG-Leaderboards a practically viable tool for real-world usage.
arXiv Detail & Related papers (2023-05-10T13:19:18Z)
Citation Trajectory Prediction via Publication Influence Representation Using Temporal Knowledge Graph [52.07771598974385]
Existing approaches mainly rely on mining temporal and graph data from academic articles. Our framework is composed of three modules: difference-preserved graph embedding, fine-grained influence representation, and learning-based trajectory calculation. Experiments are conducted on both the APS academic dataset and our contributed AIPatent dataset.
arXiv Detail & Related papers (2022-10-02T07:43:26Z)
Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain [1.9004296236396943]
We present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications. Within this research work, we i) tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools. We generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain.
arXiv Detail & Related papers (2020-10-28T08:31:40Z)
Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned [88.42878484408469]
We present the Neural Covidex, a search engine that exploits the latest neural ranking architectures. This paper describes our initial efforts and offers a few thoughts about lessons we have learned along the way.
arXiv Detail & Related papers (2020-04-10T17:12:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.