You Are the Best Reviewer of Your Own Papers: The Isotonic Mechanism
- URL: http://arxiv.org/abs/2206.08149v2
- Date: Wed, 05 Mar 2025 19:46:11 GMT
- Title: You Are the Best Reviewer of Your Own Papers: The Isotonic Mechanism
- Authors: Weijie Su
- Abstract summary: We introduce the Isotonic Mechanism to enhance the accuracy of noisy review scores. Authors with multiple submissions are required to rank their papers in descending order of perceived quality. The adjusted scores are shown to be more accurate than the raw scores.
- Score: 1.7741566627076264
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Machine learning (ML) and artificial intelligence (AI) conferences including NeurIPS and ICML have experienced a significant decline in peer review quality in recent years. To address this growing challenge, we introduce the Isotonic Mechanism, a computationally efficient approach to enhancing the accuracy of noisy review scores by incorporating authors' private assessments of their submissions. Under this mechanism, authors with multiple submissions are required to rank their papers in descending order of perceived quality. Subsequently, the raw review scores are calibrated based on this ranking to produce adjusted scores. We prove that authors are incentivized to truthfully report their rankings because doing so maximizes their expected utility, modeled as an additive convex function over the adjusted scores. Moreover, the adjusted scores are shown to be more accurate than the raw scores, with improvements being particularly significant when the noise level is high and the author has many submissions -- a scenario increasingly prevalent at large-scale ML/AI conferences. We further investigate whether submission quality information beyond a simple ranking can be truthfully elicited from authors. We establish that a necessary condition for truthful elicitation is that the mechanism be based on pairwise comparisons of the author's submissions. This result underscores the optimality of the Isotonic Mechanism, as it elicits the most fine-grained truthful information among all mechanisms we consider. We then present several extensions, including a demonstration that the mechanism maintains truthfulness even when authors have only partial rather than complete information about their submission quality. Finally, we discuss future research directions, focusing on the practical implementation of the mechanism and the further development of a theoretical framework inspired by our mechanism.
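The calibration step described in the abstract amounts to an isotonic regression: the adjusted scores are the Euclidean projection of the raw review scores onto the set of score vectors consistent with the author's ranking, which the pool-adjacent-violators algorithm (PAVA) computes in linear time. A minimal sketch of that projection (illustrative only; `isotonic_adjust` is not the authors' implementation):

```python
def isotonic_adjust(raw_scores):
    """Project raw scores onto non-increasing sequences via PAVA.

    raw_scores are assumed to be ordered by the author's ranking, best
    paper first, so the adjusted scores must be non-increasing.
    """
    # Each block stores (sum, count) of scores pooled into one level.
    blocks = []
    for s in raw_scores:
        blocks.append((s, 1))
        # Pool adjacent blocks while the ordering constraint is violated,
        # i.e. while an earlier block's mean is below a later block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]
        ):
            s2, n2 = blocks.pop()
            s1, n1 = blocks.pop()
            blocks.append((s1 + s2, n1 + n2))
    # Expand each pooled block back into per-paper adjusted scores.
    adjusted = []
    for s, n in blocks:
        adjusted.extend([s / n] * n)
    return adjusted
```

For example, if the author ranks three papers best-first and the raw scores come back as 5, 7, 3, the first two papers are pooled to their mean, yielding adjusted scores 6, 6, 3 that respect the declared ranking.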
Related papers
- R-PRM: Reasoning-Driven Process Reward Modeling [53.06844294668382]
Process Reward Models (PRMs) have emerged as a promising solution by evaluating each reasoning step.
Existing PRMs typically output evaluation scores directly, limiting both learning efficiency and evaluation accuracy.
We propose Reasoning-Driven Process Reward Modeling (R-PRM).
R-PRM generates seed data from limited annotations, effectively bootstrapping our model's reasoning capabilities.
arXiv Detail & Related papers (2025-03-27T09:23:08Z) - Analysis of the ICML 2023 Ranking Data: Can Authors' Opinions of Their Own Papers Assist Peer Review in Machine Learning? [52.00419656272129]
We conducted an experiment during the 2023 International Conference on Machine Learning (ICML)
We received 1,342 rankings, each from a distinct author, pertaining to 2,592 submissions.
We focus on the Isotonic Mechanism, which calibrates raw review scores using author-provided rankings.
arXiv Detail & Related papers (2024-08-24T01:51:23Z) - Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We study how well large language models (LLMs) explain their generations through rationales.
We show that prompting-based methods are less "faithful" than attribution-based explanations.
arXiv Detail & Related papers (2024-06-28T20:06:30Z) - Eliciting Informative Text Evaluations with Large Language Models [14.176332393753906]
We introduce two mechanisms: the Generative Peer Prediction Mechanism (GPPM) and the Generative Synopsis Peer Prediction Mechanism (GSPPM).
We show that our mechanisms can incentivize high effort and truth-telling as an (approximate) Bayesian Nash equilibrium.
We highlight that on the ICLR dataset, our mechanisms can differentiate three quality levels in terms of expected scores: human-written reviews, GPT-4-generated reviews, and GPT-3.5-generated reviews.
arXiv Detail & Related papers (2024-05-23T21:56:12Z) - Additive-Effect Assisted Learning [17.408937094829007]
We develop a two-stage assisted learning architecture for an agent, Alice, to seek assistance from another agent, Bob.
In the first stage, we propose a privacy-aware hypothesis testing-based screening method for Alice to decide on the usefulness of the data from Bob.
We show that Alice can achieve the oracle performance as if the training were from centralized data, both theoretically and numerically.
arXiv Detail & Related papers (2024-05-13T23:24:25Z) - What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning [52.51430732904994]
In reinforcement learning problems, agents must consider long-term fairness while maximizing returns.
Recent works have proposed many different types of fairness notions, but how unfairness arises in RL problems remains unclear.
We introduce a novel notion called dynamics fairness, which explicitly captures the inequality stemming from environmental dynamics.
arXiv Detail & Related papers (2024-04-16T22:47:59Z) - Alignment for Honesty [105.72465407518325]
Recent research has made significant strides in aligning large language models (LLMs) with helpfulness and harmlessness.
In this paper, we argue for the importance of alignment for honesty, ensuring that LLMs proactively refuse to answer questions when they lack knowledge.
We address these challenges by first establishing a precise problem definition and defining "honesty" inspired by the Analects of Confucius.
arXiv Detail & Related papers (2023-12-12T06:10:42Z) - Eliciting Honest Information From Authors Using Sequential Review [13.424398627546788]
We propose a sequential review mechanism that can truthfully elicit the ranking information from authors.
The key idea is to review an author's papers in a sequence based on the provided ranking, conditioning the review of each paper on the review scores of the previous papers.
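One toy way to picture this conditioning (purely illustrative; this is not the paper's actual construction) is to process the papers in the author's declared order and cap each adjusted score by the score of the previously reviewed paper, so that a misreported ranking drags down the scores of genuinely strong papers placed too late in the sequence:

```python
def sequential_review(raw_scores):
    """Toy sketch of sequential conditioning (hypothetical, not the
    mechanism from the paper).

    raw_scores are ordered by the author's declared ranking, best first.
    Each adjusted score is capped by the previous paper's adjusted score,
    so an untruthful ranking can only hurt the author's later papers.
    """
    adjusted = []
    cap = float("inf")  # No cap applies to the first reviewed paper.
    for s in raw_scores:
        a = min(s, cap)
        adjusted.append(a)
        cap = a  # Later papers cannot score above earlier ones.
    return adjusted
```

Under this sketch, declaring a weak paper as the best (e.g. raw scores 3, 8, 7 in declared order) caps every later score at 3, illustrating why truthful ranking maximizes the author's total.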
arXiv Detail & Related papers (2023-11-24T17:27:39Z) - Faithful Knowledge Distillation [75.59907631395849]
We focus on two crucial questions with regard to a teacher-student pair: (i) do the teacher and student disagree at points close to correctly classified dataset examples, and (ii) is the distilled student as confident as the teacher around dataset examples?
These are critical questions when considering the deployment of a smaller student network trained from a robust teacher within a safety-critical setting.
arXiv Detail & Related papers (2023-06-07T13:41:55Z) - Isotonic Mechanism for Exponential Family Estimation in Machine Learning Peer Review [28.06558596439521]
In 2023, the International Conference on Machine Learning (ICML) required authors with multiple submissions to rank their submissions based on perceived quality.
We employ these author-specified rankings to enhance peer review in machine learning and artificial intelligence conferences.
We generate adjusted scores that closely align with the original scores while adhering to author-specified rankings.
arXiv Detail & Related papers (2023-04-21T17:59:08Z) - Adam: Dense Retrieval Distillation with Adaptive Dark Examples [104.01735794498767]
We propose ADAM, a knowledge distillation framework that can better transfer the dark knowledge held in the teacher with Adaptive Dark exAMples.
We conduct experiments on two widely-used benchmarks and verify the effectiveness of our method.
arXiv Detail & Related papers (2022-12-20T12:03:19Z) - ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning [63.77667876176978]
Large language models show improved downstream task interpretability when prompted to generate step-by-step reasoning to justify their final answers.
These reasoning steps greatly improve model interpretability and verification, but objectively studying their correctness is difficult.
We present ROSCOE, a suite of interpretable, unsupervised automatic scores that improve and extend previous text generation evaluation metrics.
arXiv Detail & Related papers (2022-12-15T15:52:39Z) - Identifying the value of a random variable unambiguously: Quantum versus classical approaches [44.99833362998488]
Quantum resources may provide advantage over their classical counterparts.
We construct such a task based on a game, mediated by Referee and played between Alice and Bob.
We show that if Alice sends a limited amount of classical information then the game cannot be won, while the quantum analogue of the 'limited amount of classical information' is sufficient for winning the game.
arXiv Detail & Related papers (2022-11-16T20:28:49Z) - Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LLMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protected attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z) - Simultaneous Measurement and Entanglement [0.0]
We show how LOCC allows Alice and Bob to distinguish between two product states optimally.
We find that LOCC is almost always more helpful than a Bell pair for distinguishing product states.
arXiv Detail & Related papers (2022-01-25T23:11:29Z) - You Are the Best Reviewer of Your Own Papers: An Owner-Assisted Scoring Mechanism [17.006003864727408]
The Isotonic Mechanism improves on imprecise raw scores by leveraging certain information that the owner is incentivized to provide.
It reports adjusted scores for the items by solving a convex optimization problem.
I prove that the adjusted scores provided by this owner-assisted mechanism are indeed significantly more accurate than the raw scores provided by the reviewers.
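For reference, the convex optimization problem behind the adjustment is the standard isotonic-regression program (notation illustrative; $y$ denotes the raw scores ordered according to the owner's declared ranking, best item first):

```latex
\hat{x} \;=\; \operatorname*{arg\,min}_{x \in \mathbb{R}^n} \;\sum_{i=1}^{n} (y_i - x_i)^2
\quad \text{subject to} \quad x_1 \ge x_2 \ge \cdots \ge x_n .
```

The adjusted scores $\hat{x}$ are thus the closest point to the raw scores that respects the owner's ranking.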
arXiv Detail & Related papers (2021-10-27T22:11:29Z) - Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast it as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
arXiv Detail & Related papers (2021-09-02T19:41:47Z) - Linear Gaussian Quantum State Smoothing: Understanding the optimal unravelings for Alice to estimate Bob's state [0.0]
Quantum state smoothing is a technique to construct an estimate of the quantum state at a particular time.
The effect of Bob's measurement choice on the effectiveness of Alice's smoothing has been studied in a number of recent papers.
We develop a simple hypothesis that allows one to approximate the optimal measurement choice given Alice's measurement choice.
arXiv Detail & Related papers (2020-08-31T04:04:50Z) - Superadditivity of channel capacity through quantum fields [0.0]
We study the scenario where a sender, Alice, causes information-carrying disturbances in a quantum field.
We find that the channel capacity between Alice and a receiver, Bob, is enhanced by Bob placing detectors not only inside but also outside the causal future of Alice's encoding operation.
arXiv Detail & Related papers (2020-02-11T00:53:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.