EchoReview: Learning Peer Review from the Echoes of Scientific Citations
- URL: http://arxiv.org/abs/2602.00733v1
- Date: Sat, 31 Jan 2026 13:55:38 GMT
- Title: EchoReview: Learning Peer Review from the Echoes of Scientific Citations
- Authors: Yinuo Zhang, Dingcheng Huang, Haifeng Suo, Yizhuo Li, Ziya Zhao, Junhao Xu, Zhiying Tu, Dianhui Chu, Deming Zhai, Xianming Liu, Xiaoyan Yu, Dianbo Sui
- Abstract summary: EchoReview is a citation-context-driven data synthesis framework. It transforms the scientific community's long-term judgments into structured review-style data. It achieves significant and stable improvements on core review dimensions such as evidence support and review comprehensiveness.
- Score: 48.852960317704486
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the volume of scientific submissions continues to grow rapidly, traditional peer review systems are facing unprecedented scalability pressures, highlighting the urgent need for automated reviewing methods that are both scalable and reliable. Existing supervised fine-tuning approaches based on real review data are fundamentally constrained by their reliance on a single data source, as well as by the inherent subjectivity and inconsistency of human reviews, limiting their ability to support high-quality automated reviewers. To address these issues, we propose EchoReview, a citation-context-driven data synthesis framework that systematically mines implicit collective evaluative signals from academic citations and transforms the scientific community's long-term judgments into structured review-style data. Based on this pipeline, we construct EchoReview-16K, the first large-scale, cross-conference, and cross-year citation-driven review dataset, and train an automated reviewer, EchoReviewer-7B. Experimental results demonstrate that EchoReviewer-7B achieves significant and stable improvements on core review dimensions such as evidence support and review comprehensiveness, validating citation context as a robust and effective data paradigm for reliable automated peer review.
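To make the core idea concrete, here is a minimal sketch of what one citation-context-driven synthesis step could look like. Everything below is an assumption for illustration: the abstract does not specify the pipeline's implementation, and the keyword heuristic is only a stand-in for whatever evaluative-signal classifier (likely an LLM) the authors actually use.

```python
# Hypothetical sketch of one citation-context-driven synthesis step in the
# spirit of EchoReview. All names are illustrative, and the keyword heuristic
# is a placeholder for the (unspecified) evaluative-signal classifier.
from dataclasses import dataclass

@dataclass
class CitationContext:
    citing_paper: str  # ID of the paper that cites the target
    sentence: str      # sentence surrounding the citation marker

@dataclass
class ReviewStyleItem:
    aspect: str    # "strength" or "weakness"
    evidence: str  # citation sentence used as grounding evidence

# Placeholder lexicons; a real pipeline would presumably use an LLM or a
# trained classifier to detect evaluative polarity.
POSITIVE = ("outperforms", "state-of-the-art", "effective", "robust")
NEGATIVE = ("fails", "limited", "does not scale", "inconsistent")

def classify(ctx: CitationContext) -> str | None:
    """Map a citation sentence to an evaluative aspect, or None if the
    citation is purely descriptive and carries no judgment."""
    text = ctx.sentence.lower()
    if any(k in text for k in POSITIVE):
        return "strength"
    if any(k in text for k in NEGATIVE):
        return "weakness"
    return None

def synthesize_review(contexts: list[CitationContext]) -> list[ReviewStyleItem]:
    """Turn the community's citation sentences about one target paper into
    structured review-style items (aspect plus grounding evidence)."""
    items = []
    for ctx in contexts:
        aspect = classify(ctx)
        if aspect is not None:
            items.append(ReviewStyleItem(aspect, ctx.sentence))
    return items

if __name__ == "__main__":
    contexts = [
        CitationContext("P1", "Their method outperforms prior QA baselines."),
        CitationContext("P2", "However, the approach fails on long documents."),
        CitationContext("P3", "We follow the experimental setup of [target]."),
    ]
    for item in synthesize_review(contexts):
        print(f"{item.aspect}: {item.evidence}")
```

A full pipeline in this spirit would also need citation-context extraction from citing papers' full texts and aggregation across many citers; the heuristic above is only a placeholder for that machinery.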
Related papers
- ReviewGuard: Enhancing Deficient Peer Review Detection via LLM-Driven Data Augmentation [3.9199635838637072]
ReviewGuard is an automated system for detecting and categorizing deficient reviews. It produces a final corpus of 6,634 papers, 24,657 real reviews, and 46,438 synthetic reviews. Deficient reviews demonstrate lower rating scores, higher self-reported confidence, reduced structural complexity, and a higher proportion of negative sentiment.
arXiv Detail & Related papers (2025-10-18T15:45:26Z)
- What Drives Paper Acceptance? A Process-Centric Analysis of Modern Peer Review [2.9282248958475345]
We present a large-scale empirical study of ICLR 2017-2025, encompassing over 28,000 submissions. Our results show that factors beyond scientific novelty significantly shape acceptance outcomes. We propose data-driven guidelines for authors, reviewers, and meta-reviewers to enhance transparency and fairness in peer review.
arXiv Detail & Related papers (2025-09-30T03:00:10Z)
- Automatic Reviewers Fail to Detect Faulty Reasoning in Research Papers: A New Counterfactual Evaluation Framework [55.078301794183496]
We focus on a core reviewing skill that underpins high-quality peer review: detecting faulty research logic. This involves evaluating the internal consistency between a paper's results, interpretations, and claims. We present a fully automated counterfactual evaluation framework that isolates and tests this skill under controlled conditions.
arXiv Detail & Related papers (2025-08-29T08:48:00Z)
- Re$^2$: A Consistency-ensured Dataset for Full-stage Peer Review and Multi-turn Rebuttal Discussions [2.5226834810382113]
We introduce the largest consistency-ensured peer review and rebuttal dataset, named Re$^2$. This dataset comprises 19,926 initial submissions, 70,668 review comments, and 53,818 rebuttals from 24 conferences and 21 workshops on OpenReview.
arXiv Detail & Related papers (2025-05-12T16:02:52Z)
- LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews [74.87393214734114]
This work introduces LazyReview, a dataset of peer-review sentences annotated with fine-grained lazy thinking categories. Large Language Models (LLMs) struggle to detect these instances in a zero-shot setting. Instruction-based fine-tuning on our dataset significantly boosts performance by 10-20 points.
arXiv Detail & Related papers (2025-04-15T10:07:33Z)
- Paper Quality Assessment based on Individual Wisdom Metrics from Open Peer Review [4.35783648216893]
Traditional closed peer review systems are slow, costly, non-transparent, and possibly subject to biases. We propose and examine the efficacy and accuracy of an alternative form of scientific peer review: an open, bottom-up process.
arXiv Detail & Related papers (2025-01-22T17:00:27Z)
- GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews [25.291384842659397]
We introduce GLIMPSE, a summarization method designed to offer a concise yet comprehensive overview of scholarly reviews.
Unlike traditional consensus-based methods, GLIMPSE extracts both common and unique opinions from the reviews.
arXiv Detail & Related papers (2024-06-11T15:27:01Z)
- CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation [87.44350003888646]
Eval-Instruct can acquire pointwise grading critiques with pseudo references and revise these critiques via multi-path prompting.
CritiqueLLM is empirically shown to outperform ChatGPT and all the open-source baselines.
arXiv Detail & Related papers (2023-11-30T16:52:42Z)
- Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast the final acceptance decision as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews (a minimal pairwise-ranking sketch follows this list).
arXiv Detail & Related papers (2021-09-02T19:41:47Z)
- Hierarchical Bi-Directional Self-Attention Networks for Paper Review Rating Recommendation [81.55533657694016]
We propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation.
Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: a sentence encoder (level one), an intra-review encoder (level two), and an inter-review encoder (level three).
We are able to identify useful predictors for the final acceptance decision, as well as to uncover inconsistencies between numerical review ratings and the text sentiment conveyed by reviewers. (A minimal sketch of the three-level hierarchy appears after this list.)
arXiv Detail & Related papers (2020-11-02T08:07:50Z)
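For the preference-learning entry above ("Ranking Scientific Papers Using Preference Learning"), a natural baseline reading is a Bradley-Terry model fit on pairwise preferences derived from reviewer scores. The sketch below is an assumption, not the paper's actual method; the function name and toy data are hypothetical.

```python
# Bradley-Terry ranking from pairwise preferences, as a stand-in for
# preference learning over papers. Hypothetical illustration only.
import math
import random

def fit_bradley_terry(pairs, papers, epochs=200, lr=0.1):
    """pairs: list of (winner, loser) preferences, e.g. derived from
    reviewer scores. Returns a latent quality score per paper; a higher
    score means the paper ranks above its peers."""
    theta = {p: 0.0 for p in papers}
    for _ in range(epochs):
        random.shuffle(pairs)
        for winner, loser in pairs:
            # P(winner beats loser) under the Bradley-Terry model
            p = 1.0 / (1.0 + math.exp(theta[loser] - theta[winner]))
            # Log-likelihood gradient pushes the two scores apart
            theta[winner] += lr * (1.0 - p)
            theta[loser] -= lr * (1.0 - p)
    return theta

papers = ["A", "B", "C"]
# Hypothetical preferences: A beat B twice, B beat C once, A beat C once
pairs = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "C")]
scores = fit_bradley_terry(pairs, papers)
print(sorted(papers, key=scores.get, reverse=True))  # expected: A, B, C
```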
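For the HabNet entry, the summary names three encoder levels (sentence, intra-review, inter-review). The PyTorch sketch below mirrors only that hierarchy, under stated assumptions: plain Transformer encoder layers with mean pooling stand in for the paper's bi-directional self-attention, padding masks are omitted, and all dimensions are illustrative.

```python
# Minimal sketch of a three-level review encoder hierarchy in the spirit of
# HabNet (sentence -> intra-review -> inter-review). Assumptions throughout:
# this does not reproduce the paper's bi-directional self-attention details.
import torch
import torch.nn as nn

class HierarchicalReviewEncoder(nn.Module):
    def __init__(self, vocab_size=10000, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        make_layer = lambda: nn.TransformerEncoderLayer(
            d_model, nhead=4, batch_first=True)
        self.sentence_enc = nn.TransformerEncoder(make_layer(), num_layers=1)  # level 1
        self.review_enc = nn.TransformerEncoder(make_layer(), num_layers=1)    # level 2
        self.paper_enc = nn.TransformerEncoder(make_layer(), num_layers=1)     # level 3
        self.rating_head = nn.Linear(d_model, 1)  # predicts the paper rating

    def forward(self, tokens):
        # tokens: (reviews, sentences, words) token IDs for one paper
        R, S, W = tokens.shape
        x = self.embed(tokens.view(R * S, W))
        sent = self.sentence_enc(x).mean(dim=1)        # level 1: words -> sentence vectors
        sent = sent.view(R, S, -1)
        rev = self.review_enc(sent).mean(dim=1)        # level 2: sentences -> review vectors
        paper = self.paper_enc(rev.unsqueeze(0)).mean(dim=1)  # level 3: reviews -> paper vector
        return self.rating_head(paper).squeeze(-1)     # scalar rating prediction

model = HierarchicalReviewEncoder()
toy = torch.randint(0, 10000, (3, 4, 12))  # 3 reviews x 4 sentences x 12 words
print(model(toy))  # one predicted rating for the paper
```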