Author-in-the-Loop Response Generation and Evaluation: Integrating Author Expertise and Intent in Responses to Peer Review
- URL: http://arxiv.org/abs/2602.11173v1
- Date: Mon, 19 Jan 2026 14:07:10 GMT
- Title: Author-in-the-Loop Response Generation and Evaluation: Integrating Author Expertise and Intent in Responses to Peer Review
- Authors: Qian Ruan, Iryna Gurevych
- Abstract summary: Recent work frames this task as automatic text generation, underusing author expertise and intent. We introduce REspGen, a generation framework that integrates explicit author input, multi-attribute control, and evaluation-guided refinement. To support this formulation, we construct Re$^3$Align, the first large-scale dataset of aligned review--response--revision triplets.
- Score: 53.99984738447279
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Author response (rebuttal) writing is a critical stage of scientific peer review that demands substantial author effort. Recent work frames this task as automatic text generation, underusing author expertise and intent. In practice, authors possess domain expertise, author-only information, and revision and response strategies (concrete forms of author expertise and intent) to address reviewer concerns, and they seek NLP assistance that integrates these signals to support effective response writing in peer review. We reformulate author response generation as an author-in-the-loop task and introduce REspGen, a generation framework that integrates explicit author input, multi-attribute control, and evaluation-guided refinement, together with REspEval, a comprehensive evaluation suite with 20+ metrics covering input utilization, controllability, response quality, and discourse. To support this formulation, we construct Re$^3$Align, the first large-scale dataset of aligned review--response--revision triplets, where revisions provide signals of author expertise and intent. Experiments with state-of-the-art LLMs show the benefits of author input and evaluation-guided refinement, the impact of input design on response quality, and trade-offs between controllability and quality. We make our dataset and our generation and evaluation tools publicly available.
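To make the pipeline the abstract describes more concrete, here is a minimal sketch of what an author-in-the-loop, evaluation-guided refinement loop could look like. Everything below is an illustrative assumption, not REspGen's or REspEval's actual API: the names `AuthorInput`, `call_llm`, `score_response`, and the four metric keys are hypothetical stand-ins, and the stubs fake what real LLM calls and a 20+-metric suite would compute.

```python
from dataclasses import dataclass, field

@dataclass
class AuthorInput:
    """Explicit author signals the generator conditions on (hypothetical schema)."""
    domain_notes: str        # author expertise / author-only information
    planned_revisions: str   # intended paper changes addressing the concern
    strategy: str            # response strategy, e.g. "concede" or "clarify"
    attributes: dict = field(default_factory=dict)  # multi-attribute controls,
                                                    # e.g. {"tone": "polite"}

def call_llm(prompt: str) -> str:
    # Stub standing in for any LLM API call.
    return f"[draft response conditioned on {len(prompt)} prompt chars]"

def score_response(review: str, author: AuthorInput, response: str) -> dict:
    # Stub: a real REspEval-style suite computes 20+ metrics; here we fake
    # four aggregate dimensions named after those in the abstract.
    return {"input_utilization": 0.9, "controllability": 0.85,
            "response_quality": 0.9, "discourse": 0.9}

def generate_response(review: str, author: AuthorInput, feedback: str = "") -> str:
    # One generation round, conditioned on the review, all author signals,
    # the control attributes, and (on later rounds) evaluator feedback.
    prompt = (
        f"Review: {review}\n"
        f"Author notes: {author.domain_notes}\n"
        f"Planned revisions: {author.planned_revisions}\n"
        f"Strategy: {author.strategy}\n"
        f"Controls: {author.attributes}\n"
        f"Evaluator feedback: {feedback}\n"
        "Write the author response:"
    )
    return call_llm(prompt)

def respond_with_refinement(review: str, author: AuthorInput,
                            threshold: float = 0.8, max_rounds: int = 3) -> str:
    """Generate, score, and regenerate until every dimension clears the bar."""
    feedback, response = "", ""
    for _ in range(max_rounds):
        response = generate_response(review, author, feedback)
        scores = score_response(review, author, response)
        failing = [name for name, s in scores.items() if s < threshold]
        if not failing:
            break
        # Fold the weakest dimensions back into the next generation round.
        feedback = "Improve: " + ", ".join(sorted(failing))
    return response
```

In this sketch, failing metric dimensions are fed back into the next generation prompt, which is one plausible reading of "evaluation-guided refinement"; the paper's actual mechanism may differ.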
Related papers
- Exposía: Academic Writing Assessment of Exposés and Peer Feedback [56.428320613219306]
We present Exposía, the first public dataset that connects writing and feedback assessment in higher education. We use Exposía to benchmark state-of-the-art open-source large language models (LLMs) on two tasks: automated scoring of (1) the proposals and (2) the student reviews.
arXiv Detail & Related papers (2026-01-10T11:33:26Z) - ARISE: Agentic Rubric-Guided Iterative Survey Engine for Automated Scholarly Paper Generation [7.437989615069771]
ARISE is an agentic, rubric-guided iterative survey engine for the automated generation and continuous refinement of academic survey papers. ARISE employs a modular architecture composed of specialized large language model agents, each mirroring a distinct scholarly role such as topic expansion, citation curation, literature summarization, manuscript drafting, or peer-review-based evaluation. ARISE consistently surpasses baseline methods across metrics of comprehensiveness, accuracy, formatting, and overall scholarly rigor.
arXiv Detail & Related papers (2025-11-21T14:14:35Z) - LiRA: A Multi-Agent Framework for Reliable and Readable Literature Review Generation [66.09346158850308]
We present LiRA (Literature Review Agents), a multi-agent collaborative workflow which emulates the human literature review process. LiRA utilizes specialized agents for content outlining, subsection writing, editing, and reviewing, producing cohesive and comprehensive review articles. We evaluate LiRA in real-world scenarios using document retrieval and assess its robustness to reviewer model variation.
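The sequential role structure this summary names (outline, write, edit, review) can be sketched in a few lines. This is an illustration of the general pattern only, assuming a simple one-pass pipeline; the function names and control flow are assumptions, not LiRA's actual code.

```python
from typing import Callable, List

Agent = Callable[[str], str]

def make_agent(role: str) -> Agent:
    # Stub agent: a real system would wrap an LLM behind a role-specific prompt.
    def run(task: str) -> str:
        return f"[{role}] {task[:60]}"
    return run

def write_review(topic: str, documents: List[str]) -> str:
    """Sequential outline -> write -> edit -> review pass over retrieved docs."""
    outliner = make_agent("outliner")
    writer = make_agent("subsection writer")
    editor = make_agent("editor")
    reviewer = make_agent("reviewer")

    outline = outliner(f"Outline a review of '{topic}' from {len(documents)} documents")
    sections = [writer(line) for line in outline.splitlines() if line.strip()]
    draft = editor("\n".join(sections))
    critique = reviewer(draft)  # reviewer feedback drives one more editing pass
    return editor(draft + "\nAddress: " + critique)
```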
arXiv Detail & Related papers (2025-10-01T12:14:28Z) - Beyond "Not Novel Enough": Enriching Scholarly Critique with LLM-Assisted Feedback [81.0031690510116]
We present a structured approach for automated novelty evaluation that models expert reviewer behavior through three stages. Our method is informed by a large-scale analysis of human-written novelty reviews. Evaluated on 182 ICLR 2025 submissions, the approach achieves 86.5% alignment with human reasoning and 75.3% agreement on novelty conclusions.
arXiv Detail & Related papers (2025-08-14T16:18:37Z) - AutoRev: Multi-Modal Graph Retrieval for Automated Peer-Review Generation [5.72767946092813]
AutoRev is an automatic peer-review system designed to provide actionable, high-quality feedback to both reviewers and authors. By modelling documents as graphs, AutoRev effectively retrieves the most pertinent information. We envision AutoRev as a powerful tool to streamline the peer-review workflow, alleviating challenges and enabling scalable, high-quality scholarly publishing.
arXiv Detail & Related papers (2025-05-20T13:59:58Z) - Understanding and Supporting Peer Review Using AI-reframed Positive Summary [18.686807993563168]
This study explored the impact of appending an automatically generated positive summary to the peer reviews of a writing task. We found that adding an AI-reframed positive summary to otherwise harsh feedback increased authors' critique acceptance. We discuss the implications of using AI in peer feedback, focusing on how it can influence critique acceptance and support research communities.
arXiv Detail & Related papers (2025-03-13T11:22:12Z) - ReviewEval: An Evaluation Framework for AI-Generated Reviews [9.35023998408983]
The escalating volume of academic research, coupled with a shortage of qualified reviewers, necessitates innovative approaches to peer review. We propose ReviewEval, a comprehensive evaluation framework for AI-generated reviews that measures alignment with human assessments, verifies factual accuracy, assesses analytical depth, and identifies the degree of constructiveness and adherence to reviewer guidelines. This paper establishes essential metrics for AI-based peer review and substantially enhances the reliability and impact of AI-generated reviews in academic research.
arXiv Detail & Related papers (2025-02-17T12:22:11Z) - Re3: A Holistic Framework and Dataset for Modeling Collaborative Document Revision [62.12545440385489]
We introduce Re3, a framework for joint analysis of collaborative document revision.
We present Re3-Sci, a large corpus of aligned scientific paper revisions manually labeled according to their action and intent.
We use the new data to provide the first empirical insights into collaborative document revision in the academic domain.
arXiv Detail & Related papers (2024-05-31T21:19:09Z) - A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence [51.26815896167173]
We present a comprehensive tertiary analysis of PAMI reviews along three complementary dimensions. Our analyses reveal distinctive organizational patterns as well as persistent gaps in current review practices. Finally, our evaluation of state-of-the-art AI-generated reviews indicates encouraging advances in coherence and organization.
arXiv Detail & Related papers (2024-02-20T11:28:50Z)