Related papers: FRONTIER-RevRec: A Large-scale Dataset for Reviewer Recommendation

URL: http://arxiv.org/abs/2510.16597v1
Date: Sat, 18 Oct 2025 17:52:38 GMT
Title: FRONTIER-RevRec: A Large-scale Dataset for Reviewer Recommendation
Authors: Qiyao Peng, Chen Wang, Yinghui Wang, Hongtao Liu, Xuan Guo, Wenjun Wang,
Abstract summary: FRONTIER-RevRec is a large-scale dataset constructed from authentic peer review records (2007-2025) from the Frontiers open-access publishing platform. The dataset contains 177941 distinct reviewers and 478379 papers across 209 journals spanning multiple disciplines.
Score: 15.014715782460192
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reviewer recommendation is a critical task for enhancing the efficiency of academic publishing workflows. However, research in this area has been persistently hindered by the lack of high-quality benchmark datasets, which are often limited in scale, disciplinary scope, and comparative analyses of different methodologies. To address this gap, we introduce FRONTIER-RevRec, a large-scale dataset constructed from authentic peer review records (2007-2025) from the Frontiers open-access publishing platform https://www.frontiersin.org/. The dataset contains 177941 distinct reviewers and 478379 papers across 209 journals spanning multiple disciplines including clinical medicine, biology, psychology, engineering, and social sciences. Our comprehensive evaluation on this dataset reveals that content-based methods significantly outperform collaborative filtering. This finding is explained by our structural analysis, which uncovers fundamental differences between academic recommendation and commercial domains. Notably, approaches leveraging language models are particularly effective at capturing the semantic alignment between a paper's content and a reviewer's expertise. Furthermore, our experiments identify optimal aggregation strategies to enhance the recommendation pipeline. FRONTIER-RevRec is intended to serve as a comprehensive benchmark to advance research in reviewer recommendation and facilitate the development of more effective academic peer review systems. The FRONTIER-RevRec dataset is available at: https://anonymous.4open.science/r/FRONTIER-RevRec-5D05.

Related papers

OmniReview: A Large-scale Benchmark and LLM-enhanced Framework for Realistic Reviewer Recommendation [22.223973340236594]
Profiling Scholars with Multi-gate Mixture-of-Experts (Pro-MMoE) is a novel framework that synergizes Large Language Models (LLMs) with Multi-task Learning. Pro-MMoE achieves state-of-the-art performance across six of seven metrics, establishing a new benchmark for realistic reviewer recommendation.
arXiv Detail & Related papers (2026-02-09T16:57:35Z)
AutoRev: Multi-Modal Graph Retrieval for Automated Peer-Review Generation [5.72767946092813]
AutoRev is an automatic peer-review system designed to provide actionable, high-quality feedback to both reviewers and authors. By modelling documents as graphs, AutoRev effectively retrieves the most pertinent information. We envision AutoRev as a powerful tool to streamline the peer-review workflow, alleviating challenges and enabling scalable, high-quality scholarly publishing.
arXiv Detail & Related papers (2025-05-20T13:59:58Z)
exHarmony: Authorship and Citations for Benchmarking the Reviewer Assignment Problem [11.763640675057076]
We develop a benchmark dataset for evaluating the reviewer assignment problem without needing explicit labels. We benchmark various methods, including traditional lexical matching, static neural embeddings, and contextualized neural embeddings. Our results indicate that while traditional methods perform reasonably well, contextualized embeddings trained on scholarly literature show the best performance.
arXiv Detail & Related papers (2025-02-11T16:35:04Z)
The State of Data Curation at NeurIPS: An Assessment of Dataset Development Practices in the Datasets and Benchmarks Track [1.5993707490601146]
This work provides an analysis of dataset development practices at NeurIPS through the lens of data curation. We present an evaluation framework for dataset documentation, consisting of a rubric and toolkit. Results indicate greater need for documentation about environmental footprint, ethical considerations, and data management.
arXiv Detail & Related papers (2024-10-29T19:07:50Z)
Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored. We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches. We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
STRICTA: Structured Reasoning in Critical Text Assessment for Peer Review and Beyond [68.47402386668846]
We introduce Structured Reasoning In Critical Text Assessment (STRICTA) to model text assessment as an explicit, step-wise reasoning process. STRICTA breaks down the assessment into a graph of interconnected reasoning steps drawing on causality theory. We apply STRICTA to a dataset of over 4000 reasoning steps from roughly 40 biomedical experts on more than 20 papers.
arXiv Detail & Related papers (2024-09-09T06:55:37Z)
Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions [62.0123588983514]
Large Language Models (LLMs) have demonstrated wide-ranging applications across various fields. We reformulate the peer-review process as a multi-turn, long-context dialogue, incorporating distinct roles for authors, reviewers, and decision makers. We construct a comprehensive dataset containing over 26,841 papers with 92,017 reviews collected from multiple sources.
arXiv Detail & Related papers (2024-06-09T08:24:17Z)
Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs). We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date. We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
Tag-Aware Document Representation for Research Paper Recommendation [68.8204255655161]
We propose a hybrid approach that leverages deep semantic representation of research papers based on social tags assigned by users. The proposed model is effective in recommending research papers even when the rating data is very sparse.
arXiv Detail & Related papers (2022-09-08T09:13:07Z)
Reenvisioning Collaborative Filtering vs Matrix Factorization [65.74881520196762]
Collaborative filtering models based on matrix factorization and learned similarities using Artificial Neural Networks (ANNs) have gained significant attention in recent years. The introduction of ANNs within the recommendation ecosystem has been recently questioned, raising several comparisons in terms of efficiency and effectiveness. We show the potential these techniques may have on beyond-accuracy evaluation while analyzing their effect on complementary evaluation dimensions.
arXiv Detail & Related papers (2021-07-28T16:29:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.