The Isotonic Mechanism for Exponential Family Estimation
- URL: http://arxiv.org/abs/2304.11160v3
- Date: Mon, 2 Oct 2023 14:33:05 GMT
- Title: The Isotonic Mechanism for Exponential Family Estimation
- Authors: Yuling Yan, Weijie J. Su, Jianqing Fan
- Abstract summary: In 2023, the International Conference on Machine Learning (ICML) required authors with multiple submissions to rank their submissions based on perceived quality.
In this paper, we aim to employ these author-specified rankings to enhance peer review in machine learning and artificial intelligence conferences by extending the Isotonic Mechanism to exponential family distributions.
This mechanism generates adjusted scores that closely align with the original scores while adhering to author-specified rankings.
- Score: 31.542906034919977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In 2023, the International Conference on Machine Learning (ICML) required
authors with multiple submissions to rank their submissions based on perceived
quality. In this paper, we aim to employ these author-specified rankings to
enhance peer review in machine learning and artificial intelligence conferences
by extending the Isotonic Mechanism to exponential family distributions. This
mechanism generates adjusted scores that closely align with the original scores
while adhering to author-specified rankings. Despite its applicability to a
broad spectrum of exponential family distributions, implementing this mechanism
does not require knowledge of the specific distribution form. We demonstrate
that an author is incentivized to provide accurate rankings when her utility
takes the form of a convex additive function of the adjusted review scores. For
a certain subclass of exponential family distributions, we prove that the
author reports truthfully only if the question involves only pairwise
comparisons between her submissions, thus indicating the optimality of ranking
in truthful information elicitation. Moreover, we show that the adjusted scores
dramatically improve the estimation accuracy compared to the original scores
and achieve nearly minimax optimality when the ground-truth scores have bounded
total variation. We conclude the paper by presenting experiments conducted on
the ICML 2023 ranking data, which show significant estimation gain using the
Isotonic Mechanism.
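In its original squared-error formulation, the Isotonic Mechanism can be viewed as a Euclidean projection of the raw scores onto the order constraints implied by the author's ranking: with raw scores $y$ and reported ranking $\pi$ (claimed best first), one solves $\min_z \|z - y\|_2^2$ subject to $z_{\pi(1)} \ge z_{\pi(2)} \ge \cdots \ge z_{\pi(n)}$. The sketch below is a minimal illustration of that squared-error case using the standard pool-adjacent-violators algorithm; the function names and toy scores are illustrative only, and the paper's exponential family extension, incentive analysis, and minimax results are not reproduced here.

```python
import numpy as np

def pava_non_increasing(y):
    """L2 projection of y onto the set of non-increasing sequences (pool-adjacent-violators)."""
    blocks = []  # each block is [mean, weight]; block means stay non-increasing after pooling
    for v in np.asarray(y, dtype=float):
        blocks.append([v, 1.0])
        # Pool whenever a later block's mean rises above the previous block's mean.
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    return np.concatenate([np.full(int(w), m) for m, w in blocks])

def isotonic_mechanism(raw_scores, ranking):
    """Adjust raw scores so they are non-increasing along the author-reported ranking.

    raw_scores: one (averaged) review score per submission, in submission order.
    ranking:    submission indices from claimed best to claimed worst.
    """
    raw = np.asarray(raw_scores, dtype=float)
    order = np.asarray(ranking, dtype=int)
    adjusted = np.empty_like(raw)
    adjusted[order] = pava_non_increasing(raw[order])
    return adjusted

# Toy example with made-up scores: the author claims submission 2 > 0 > 1 in quality,
# but the raw scores disagree with that order, so PAVA pools all three.
print(isotonic_mechanism([6.0, 5.0, 4.0], ranking=[2, 0, 1]))  # -> [5. 5. 5.]
```

As the abstract notes, applying this ranking-constrained adjustment does not require knowing the specific exponential family distribution of the review scores.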
Related papers
- Confidence Diagram of Nonparametric Ranking for Uncertainty Assessment in Large Language Models Evaluation [20.022623972491733]
Ranking large language models (LLMs) has proven to be an effective tool to improve alignment based on the best-of-$N$ policy.
We propose a new inferential framework for hypothesis testing on the ranking of language models.
arXiv Detail & Related papers (2024-12-07T02:34:30Z)
- Analysis of the ICML 2023 Ranking Data: Can Authors' Opinions of Their Own Papers Assist Peer Review in Machine Learning? [52.00419656272129]
We conducted an experiment during the 2023 International Conference on Machine Learning (ICML)
We received 1,342 rankings, each from a distinct author, pertaining to 2,592 submissions.
We focus on the Isotonic Mechanism, which calibrates raw review scores using author-provided rankings.
arXiv Detail & Related papers (2024-08-24T01:51:23Z)
- Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We study how well large language models (LLMs) explain their generations through rationales.
We show that prompting-based methods are less "faithful" than attribution-based explanations.
arXiv Detail & Related papers (2024-06-28T20:06:30Z)
- Being Aware of Localization Accuracy By Generating Predicted-IoU-Guided Quality Scores [24.086202809990795]
We develop an elegant LQE branch to acquire a localization quality score guided by the predicted IoU.
A novel one-stage detector termed CLQ is proposed.
Experiments show that CLQ achieves state-of-the-art performance at an accuracy of 47.8 AP and a speed of 11.5 fps.
arXiv Detail & Related papers (2023-09-23T05:27:59Z)
- Predicting article quality scores with machine learning: The UK Research Excellence Framework [6.582887504429817]
Accuracy is highest in the medical and physical sciences Units of Assessment (UoAs) and economics.
Prediction accuracies above the baseline for the social science, mathematics, engineering, and arts and humanities UoAs were much lower or close to zero.
We increased accuracy with an active learning strategy and by selecting articles with higher prediction probabilities, as estimated by the algorithms, but this substantially reduced the number of scores predicted.
arXiv Detail & Related papers (2022-12-11T05:45:12Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics [64.81682222169113]
System-level correlations quantify how reliably an automatic summarization evaluation metric replicates human judgments of summary quality.
We identify two ways in which the definition of the system-level correlation is inconsistent with how metrics are used to evaluate systems in practice.
arXiv Detail & Related papers (2022-04-21T15:52:14Z)
- You Are the Best Reviewer of Your Own Papers: An Owner-Assisted Scoring Mechanism [17.006003864727408]
The Isotonic Mechanism improves on imprecise raw scores by leveraging certain information that the owner is incentivized to provide.
It reports adjusted scores for the items by solving a convex optimization problem.
I prove that the adjusted scores provided by this owner-assisted mechanism are indeed significantly more accurate than the raw scores provided by the reviewers.
arXiv Detail & Related papers (2021-10-27T22:11:29Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
- Selective Classification Can Magnify Disparities Across Groups [89.14499988774985]
We find that while selective classification can improve average accuracies, it can simultaneously magnify existing accuracy disparities.
Increasing abstentions can even decrease accuracies on some groups.
We train distributionally-robust models that achieve similar full-coverage accuracies across groups and show that selective classification uniformly improves each group.
arXiv Detail & Related papers (2020-10-27T08:51:30Z)