Paper Copilot: Tracking the Evolution of Peer Review in AI Conferences
- URL: http://arxiv.org/abs/2510.13201v1
- Date: Wed, 15 Oct 2025 06:41:06 GMT
- Title: Paper Copilot: Tracking the Evolution of Peer Review in AI Conferences
- Authors: Jing Yang, Qiyao Wei, Jiaxin Pei
- Abstract summary: We present Paper Copilot, a system that creates durable digital archives of peer reviews across a wide range of computer-science venues. By releasing both the infrastructure and the dataset, Paper Copilot supports reproducible research on the evolution of peer review.
- Score: 10.900405397994687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid growth of AI conferences is straining an already fragile peer-review system, leading to heavy reviewer workloads, expertise mismatches, inconsistent evaluation standards, superficial or templated reviews, and limited accountability under compressed timelines. In response, conference organizers have introduced new policies and interventions to preserve review standards. Yet these ad-hoc changes often create further concerns and confusion about the review process, leaving how papers are ultimately accepted - and how practices evolve across years - largely opaque. We present Paper Copilot, a system that creates durable digital archives of peer reviews across a wide range of computer-science venues, an open dataset that enables researchers to study peer review at scale, and a large-scale empirical analysis of ICLR reviews spanning multiple years. By releasing both the infrastructure and the dataset, Paper Copilot supports reproducible research on the evolution of peer review. We hope these resources help the community track changes, diagnose failure modes, and inform evidence-based improvements toward a more robust, transparent, and reliable peer-review system.
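To make the archiving idea concrete, here is a minimal sketch of how a durable snapshot of ICLR submissions and their reviews could be collected. It is not the authors' pipeline: the OpenReview endpoint (https://api.openreview.net/notes), the `details=directReplies` parameter, and the `ICLR.cc/2022/Conference/-/Blind_Submission` invitation name are assumptions that may differ across venue years and API versions.

```python
"""Minimal sketch of archiving ICLR reviews, in the spirit of Paper Copilot.
NOT the authors' pipeline; endpoint, parameters, and invitation name are
assumptions and may differ across years and OpenReview API versions."""
import json
import time

import requests

API = "https://api.openreview.net/notes"  # assumed public v1 endpoint
INVITATION = "ICLR.cc/2022/Conference/-/Blind_Submission"  # assumed invitation name


def fetch_submissions_with_reviews(invitation: str, page_size: int = 1000):
    """Page through submissions; 'directReplies' is assumed to include reviews."""
    offset, notes = 0, []
    while True:
        resp = requests.get(
            API,
            params={
                "invitation": invitation,
                "details": "directReplies",
                "offset": offset,
                "limit": page_size,
            },
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json().get("notes", [])
        if not batch:
            break
        notes.extend(batch)
        offset += page_size
        time.sleep(1)  # be polite to the API

    return notes


if __name__ == "__main__":
    notes = fetch_submissions_with_reviews(INVITATION)
    # Persist a reproducible snapshot of submissions plus their direct replies.
    with open("iclr2022_reviews_snapshot.json", "w", encoding="utf-8") as f:
        json.dump(notes, f, ensure_ascii=False, indent=2)
    print(f"Archived {len(notes)} submissions with their direct replies.")
```

Writing the raw API responses to a dated JSON file, rather than keeping only derived statistics, is what makes such an archive "durable": later analyses of score distributions or policy changes can be rerun against the same snapshot.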
Related papers
- FMMD: A multimodal open peer review dataset based on F1000Research [2.375184015411392]
FMMD is a multimodal and multidisciplinary open peer review dataset curated from F1000Research. It bridges the current gap by integrating manuscript-level visual and structural data with version-specific reviewer reports and editorial decisions. It provides a comprehensive empirical resource for the development of peer review research.
arXiv Detail & Related papers (2026-02-15T19:36:05Z) - EchoReview: Learning Peer Review from the Echoes of Scientific Citations [48.852960317704486]
EchoReview is a citation-context-driven data synthesis framework. It transforms the scientific community's long-term judgments into structured review-style data. It achieves significant and stable improvements on core review dimensions such as evidence support and review comprehensiveness.
arXiv Detail & Related papers (2026-01-31T13:55:38Z) - Is Peer Review Really in Decline? Analyzing Review Quality across Venues and Time [55.756345497678204]
We introduce a new framework for evidence-based comparative study of review quality. We apply it to major AI and machine learning conferences: ICLR, NeurIPS, and *ACL. We study the relationships between measurements of review quality and their evolution over time.
arXiv Detail & Related papers (2026-01-21T16:48:29Z) - ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review [23.630458187587223]
ReviewerToo is a framework for studying and deploying AI-assisted peer review. It supports systematic experiments with specialized reviewer personas and structured evaluation criteria. We show how AI can enhance consistency, coverage, and fairness while leaving complex evaluative judgments to domain experts.
arXiv Detail & Related papers (2025-10-09T23:53:19Z) - What Drives Paper Acceptance? A Process-Centric Analysis of Modern Peer Review [2.9282248958475345]
We present a large-scale empirical study of ICLR 2017-2025, encompassing over 28,000 submissions. Our results show that factors beyond scientific novelty significantly shape acceptance outcomes. We propose data-driven guidelines for authors, reviewers, and meta-reviewers to enhance transparency and fairness in peer review.
arXiv Detail & Related papers (2025-09-30T03:00:10Z) - The AI Imperative: Scaling High-Quality Peer Review in Machine Learning [49.87236114682497]
We argue that AI-assisted peer review must become an urgent research and infrastructure priority. We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting area chairs (ACs) in decision-making.
arXiv Detail & Related papers (2025-06-09T18:37:14Z) - Blended PC Peer Review Model: Process and Reflection [12.91610113966584]
The International Conference on Mining Software Repositories (MSR 2025) experimented with a Blended Program Committee (PC) peer review model for its Technical Track. This paper presents the rationale, implementation, and reflections on the model, including empirical insights from a post-review author survey. Our findings highlight the potential of a Blended PC to alleviate reviewer shortages, foster inclusivity, and sustain a high-quality peer review process.
arXiv Detail & Related papers (2025-04-27T04:45:23Z) - Identifying Aspects in Peer Reviews [61.374437855024844]
We develop a data-driven schema for deriving aspects from a corpus of peer reviews. We introduce a dataset of peer reviews augmented with aspects and show how it can be used for community-level review analysis.
arXiv Detail & Related papers (2025-04-09T14:14:42Z) - AgentReview: Exploring Peer Review Dynamics with LLM Agents [13.826819101545926]
We introduce AgentReview, the first large language model (LLM) based peer review simulation framework.
Our study reveals significant insights, including a notable 37.1% variation in paper decisions due to reviewers' biases.
arXiv Detail & Related papers (2024-06-18T15:22:12Z) - Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study of fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z) - Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast the final decision-making process as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
arXiv Detail & Related papers (2021-09-02T19:41:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.