APRES: An Agentic Paper Revision and Evaluation System
- URL: http://arxiv.org/abs/2603.03142v1
- Date: Tue, 03 Mar 2026 16:29:13 GMT
- Title: APRES: An Agentic Paper Revision and Evaluation System
- Authors: Bingchen Zhao, Jenny Zhang, Chenxi Whitehouse, Minqi Jiang, Michael Shvartsman, Abhishek Charnalia, Despoina Magka, Tatiana Shavrina, Derek Dunfield, Oisin Mac Aodha, Yoram Bachrach,
- Abstract summary: The primary way scientists communicate their work and receive feedback from the community is through peer review.<n>In this paper, we introduce a novel method APRES powered by Large Language Models (LLMs) to update a scientific papers text based on an evaluation rubric.<n>Our automated method discovers a rubric that is highly predictive of future citation counts, and integrate it with APRES in an automated system that revises papers to enhance their quality and impact.
- Score: 44.44345338738518
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scientific discoveries must be communicated clearly to realize their full potential. Without effective communication, even the most groundbreaking findings risk being overlooked or misunderstood. The primary way scientists communicate their work and receive feedback from the community is through peer review. However, the current system often provides inconsistent feedback between reviewers, ultimately hindering the improvement of a manuscript and limiting its potential impact. In this paper, we introduce a novel method APRES powered by Large Language Models (LLMs) to update a scientific papers text based on an evaluation rubric. Our automated method discovers a rubric that is highly predictive of future citation counts, and integrate it with APRES in an automated system that revises papers to enhance their quality and impact. Crucially, this objective should be met without altering the core scientific content. We demonstrate the success of APRES, which improves future citation prediction by 19.6% in mean averaged error over the next best baseline, and show that our paper revision process yields papers that are preferred over the originals by human expert evaluators 79% of the time. Our findings provide strong empirical support for using LLMs as a tool to help authors stress-test their manuscripts before submission. Ultimately, our work seeks to augment, not replace, the essential role of human expert reviewers, for it should be humans who discern which discoveries truly matter, guiding science toward advancing knowledge and enriching lives.
Related papers
- Pre-review to Peer review: Pitfalls of Automating Reviews using Large Language Models [1.8349858105838042]
Large Language Models are versatile general-task solvers, and their capabilities can truly assist people with scholarly peer review as textitpre-review agents.<n>While incredibly beneficial, automating academic peer-review, as a concept, raises concerns surrounding safety, research integrity, and the validity of the academic peer-review process.
arXiv Detail & Related papers (2025-12-14T09:56:07Z) - ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review [23.630458187587223]
ReviewerToo is a framework for studying and deploying AI-assisted peer review.<n>It supports systematic experiments with specialized reviewer personas and structured evaluation criteria.<n>We show how AI can enhance consistency, coverage, and fairness while leaving complex evaluative judgments to domain experts.
arXiv Detail & Related papers (2025-10-09T23:53:19Z) - AI and the Future of Academic Peer Review [0.1622854284766506]
Large language models (LLMs) are being piloted across the peer-review pipeline by journals, funders, and individual reviewers.<n>Early studies suggest that AI assistance can produce reviews comparable in quality to humans.<n>We show that supervised LLM assistance can improve error detection, timeliness, and reviewer workload without displacing human judgment.
arXiv Detail & Related papers (2025-09-17T17:27:12Z) - Automatic Reviewers Fail to Detect Faulty Reasoning in Research Papers: A New Counterfactual Evaluation Framework [55.078301794183496]
We focus on a core reviewing skill that underpins high-quality peer review: detecting faulty research logic.<n>This involves evaluating the internal consistency between a paper's results, interpretations, and claims.<n>We present a fully automated counterfactual evaluation framework that isolates and tests this skill under controlled conditions.
arXiv Detail & Related papers (2025-08-29T08:48:00Z) - The AI Imperative: Scaling High-Quality Peer Review in Machine Learning [49.87236114682497]
We argue that AI-assisted peer review must become an urgent research and infrastructure priority.<n>We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting ACs in decision-making.
arXiv Detail & Related papers (2025-06-09T18:37:14Z) - XtraGPT: Context-Aware and Controllable Academic Paper Revision [43.263488839387584]
We propose a human-AI collaboration framework for academic paper revision centered on criteria-guided intent alignment and context-aware modeling.<n>We instantiate the framework in XtraGPT, the first suite of open-source LLMs for context-aware, instruction-guided writing assistance.
arXiv Detail & Related papers (2025-05-16T15:02:19Z) - Are We There Yet? Revealing the Risks of Utilizing Large Language Models in Scholarly Peer Review [66.73247554182376]
Large language models (LLMs) have led to their integration into peer review.<n>The unchecked adoption of LLMs poses significant risks to the integrity of the peer review system.<n>We show that manipulating 5% of the reviews could potentially cause 12% of the papers to lose their position in the top 30% rankings.
arXiv Detail & Related papers (2024-12-02T16:55:03Z) - ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is an AI-based system for ideation and operationalization of novel work.<n>ResearchAgent automatically defines novel problems, proposes methods and designs experiments, while iteratively refining them.<n>We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z) - Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast it as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
arXiv Detail & Related papers (2021-09-02T19:41:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.