Paper Quality Assessment based on Individual Wisdom Metrics from Open Peer Review
- URL: http://arxiv.org/abs/2501.13014v1
- Date: Wed, 22 Jan 2025 17:00:27 GMT
- Title: Paper Quality Assessment based on Individual Wisdom Metrics from Open Peer Review
- Authors: Andrii Zahorodnii, Jasper J. F. van den Bosch, Ian Charest, Christopher Summerfield, Ila R. Fiete,
- Abstract summary: This study proposes a data-driven framework for enhancing the accuracy and efficiency of scientific peer review through an open, bottom-up process that estimates reviewer quality. We analyze open peer review data from two major scientific conferences, and demonstrate that reviewer-specific quality scores significantly improve the reliability of paper quality estimation.
- Score: 3.802113616844045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study proposes a data-driven framework for enhancing the accuracy and efficiency of scientific peer review through an open, bottom-up process that estimates reviewer quality. Traditional closed peer review systems, while essential for quality control, are often slow, costly, and subject to biases that can impede scientific progress. Here, we introduce a method that evaluates individual reviewer reliability by quantifying agreement with community consensus scores and applying Bayesian weighting to refine paper quality assessments. We analyze open peer review data from two major scientific conferences, and demonstrate that reviewer-specific quality scores significantly improve the reliability of paper quality estimation. Perhaps surprisingly, we find that reviewer quality scores are unrelated to authorship quality. Our model incorporates incentive structures to recognize high-quality reviewers and encourage broader coverage of submitted papers, thereby mitigating the common "rich-get-richer" pitfall of social media. These findings suggest that open peer review, with mechanisms for estimating and incentivizing reviewer quality, offers a scalable and equitable alternative for scientific publishing, with potential to enhance the speed, fairness, and transparency of the peer review process.
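The abstract describes scoring each reviewer by agreement with the community consensus and then weighting reviews accordingly. The following is a minimal sketch of that general idea, not the authors' model: it assumes Gaussian reviewer noise, uses a simple shrinkage estimate of each reviewer's variance, and requires every paper to have at least one review. The function name `weighted_paper_scores` and the parameter `prior_var` are hypothetical.

```python
import numpy as np

def weighted_paper_scores(scores, n_iter=20, prior_var=1.0):
    """Illustrative precision-weighted consensus (an assumption, not the paper's method).

    scores: 2D array-like, shape (reviewers, papers); NaN marks a missing review.
    Returns (consensus score per paper, weight per reviewer).
    """
    scores = np.asarray(scores, dtype=float)
    observed = ~np.isnan(scores)
    filled = np.where(observed, scores, 0.0)

    # Start from the unweighted consensus: the plain mean of each paper's scores.
    consensus = np.nanmean(scores, axis=0)

    for _ in range(n_iter):
        # Disagreement of each reviewer with the current consensus.
        resid = np.where(observed, scores - consensus, 0.0)
        n_reviews = observed.sum(axis=1)

        # Per-reviewer variance estimate, shrunk toward prior_var so that
        # reviewers with few reviews do not receive extreme weights.
        var = (prior_var + (resid ** 2).sum(axis=1)) / (1.0 + n_reviews)
        weights = 1.0 / var

        # Re-estimate each paper's score as a precision-weighted mean.
        w = np.where(observed, weights[:, None], 0.0)
        consensus = (w * filled).sum(axis=0) / w.sum(axis=0)

    return consensus, weights

# Hypothetical data: two reviewers who agree and one who diverges from both.
example = [[7, 4, 8, 3],
           [7, 4, 8, 3],
           [2, 9, 3, 8]]
consensus, weights = weighted_paper_scores(example)
```

In this toy example the diverging reviewer ends up with the smallest weight, so the consensus moves toward the two reviewers who agree. A fuller treatment would also model reviewer bias and score-scale differences, which this sketch ignores.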
Related papers
- Identifying Aspects in Peer Reviews [61.374437855024844]
We develop a data-driven schema for deriving fine-grained aspects from a corpus of peer reviews.
We introduce a dataset of peer reviews augmented with aspects and show how it can be used for community-level review analysis.
arXiv Detail & Related papers (2025-04-09T14:14:42Z)
- exHarmony: Authorship and Citations for Benchmarking the Reviewer Assignment Problem [11.763640675057076]
We develop a benchmark dataset for evaluating the reviewer assignment problem without needing explicit labels.
We benchmark various methods, including traditional lexical matching, static neural embeddings, and contextualized neural embeddings.
Our results indicate that while traditional methods perform reasonably well, contextualized embeddings trained on scholarly literature show the best performance.
arXiv Detail & Related papers (2025-02-11T16:35:04Z)
- Generative Adversarial Reviews: When LLMs Become the Critic [1.2430809884830318]
We introduce Generative Agent Reviewers (GAR), leveraging LLM-empowered agents to simulate faithful peer reviewers. Central to this approach is a graph-based representation of manuscripts, condensing content and logically organizing information. Our experiments demonstrate that GAR performs comparably to human reviewers in providing detailed feedback and predicting paper outcomes.
arXiv Detail & Related papers (2024-12-09T06:58:17Z)
- Multi-Facet Counterfactual Learning for Content Quality Evaluation [48.73583736357489]
We propose a framework for efficiently constructing evaluators that perceive multiple facets of content quality evaluation.
We leverage a joint training strategy based on contrastive learning and supervised learning to enable the evaluator to distinguish between different quality facets.
arXiv Detail & Related papers (2024-10-10T08:04:10Z)
- Analysis of the ICML 2023 Ranking Data: Can Authors' Opinions of Their Own Papers Assist Peer Review in Machine Learning? [52.00419656272129]
We conducted an experiment during the 2023 International Conference on Machine Learning (ICML).
We received 1,342 rankings, each from a distinct author, pertaining to 2,592 submissions.
We focus on the Isotonic Mechanism, which calibrates raw review scores using author-provided rankings.
arXiv Detail & Related papers (2024-08-24T01:51:23Z)
- A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence [55.33653554387953]
Pattern Analysis and Machine Intelligence (PAMI) has led to numerous literature reviews aimed at collecting fragmented information. This paper presents a thorough analysis of these literature reviews within the PAMI field. We try to address three core research questions: (1) What are the prevalent structural and statistical characteristics of PAMI literature reviews; (2) What strategies can researchers employ to efficiently navigate the growing corpus of reviews; and (3) What are the advantages and limitations of AI-generated reviews compared to human-authored ones.
arXiv Detail & Related papers (2024-02-20T11:28:50Z)
- Eliciting Honest Information From Authors Using Sequential Review [13.424398627546788]
We propose a sequential review mechanism that can truthfully elicit the ranking information from authors.
The key idea is to review the papers of an author in a sequence based on the provided ranking and conditioning the review of the next paper on the review scores of the previous papers.
arXiv Detail & Related papers (2023-11-24T17:27:39Z)
- Consultation Checklists: Standardising the Human Evaluation of Medical Note Generation [58.54483567073125]
We propose a protocol that aims to increase objectivity by grounding evaluations in Consultation Checklists.
We observed good levels of inter-annotator agreement in a first evaluation study using the protocol.
arXiv Detail & Related papers (2022-11-17T10:54:28Z)
- Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries [59.27273928454995]
Current pre-trained models applied to summarization are prone to factual inconsistencies which misrepresent the source text or introduce extraneous information.
We create a crowdsourcing evaluation framework for factual consistency using the rating-based Likert scale and ranking-based Best-Worst Scaling protocols.
We find that ranking-based protocols offer a more reliable measure of summary quality across datasets, while the reliability of Likert ratings depends on the target dataset and the evaluation design.
arXiv Detail & Related papers (2021-09-19T19:05:00Z)
- Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast final decision making as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
arXiv Detail & Related papers (2021-09-02T19:41:47Z)
- Making Paper Reviewing Robust to Bid Manipulation Attacks [44.34601846490532]
Anecdotal evidence suggests that some reviewers bid on papers by "friends" or colluding authors.
We develop a novel approach for paper bidding and assignment that is much more robust against such attacks.
In addition to being more robust, the quality of our paper review assignments is comparable to that of current, non-robust assignment approaches.
arXiv Detail & Related papers (2021-02-09T21:24:16Z)
- Debiasing Evaluations That are Biased by Evaluations [32.135315382120154]
We consider the problem of mitigating outcome-induced biases in ratings when some information about the outcome is available.
We propose a debiasing method by solving a regularized optimization problem under this ordering constraint.
We also provide a carefully designed cross-validation method that adaptively chooses the appropriate amount of regularization.
arXiv Detail & Related papers (2020-12-01T18:20:43Z)