Least Square Calibration for Peer Review
- URL: http://arxiv.org/abs/2110.12607v1
- Date: Mon, 25 Oct 2021 02:40:33 GMT
- Title: Least Square Calibration for Peer Review
- Authors: Sijun Tan, Jibang Wu, Xiaohui Bei, Haifeng Xu
- Abstract summary: We propose a flexible framework, namely least square calibration (LSC), for selecting top candidates from peer ratings.
Our framework provably performs perfect calibration from noiseless linear scoring functions under mild assumptions.
Our algorithm consistently outperforms the baseline that selects top papers based on the highest average ratings.
- Score: 18.063450032460047
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Peer review systems such as conference paper review often suffer from the
issue of miscalibration. Previous works on peer review calibration usually only
use the ordinal information or assume simplistic reviewer scoring functions
such as linear functions. In practice, applications like academic conferences
often rely on manual methods, such as open discussions, to mitigate
miscalibration. It remains an important question to develop algorithms that can
handle different types of miscalibrations based on available prior knowledge.
In this paper, we propose a flexible framework, namely least square calibration
(LSC), for selecting top candidates from peer ratings. Our framework provably
performs perfect calibration from noiseless linear scoring functions under mild
assumptions, yet also provides competitive calibration results when the scoring
function is from broader classes beyond linear functions and with arbitrary
noise. On our synthetic dataset, we empirically demonstrate that our algorithm
consistently outperforms the baseline that selects top papers based on the
highest average ratings.
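The abstract does not spell out the optimization behind LSC, so the following is a minimal, hypothetical sketch of one plausible least-square calibration setup: it assumes each reviewer j scores paper i as a_j * q_i + b_j plus noise, where q_i is the paper's latent quality, and recovers q by alternating least squares over the observed (paper, reviewer) pairs before ranking papers by the calibrated qualities. The function name lsc_calibrate, the masking convention, and the alternating solver are illustrative assumptions, not the authors' actual algorithm.

```python
# Hypothetical sketch of least-square calibration for peer ratings.
# Assumed model (not from the paper): r[i, j] ~= a[j] * q[i] + b[j] + noise.
import numpy as np

def lsc_calibrate(ratings, mask, n_iters=100):
    """ratings: (papers x reviewers) score matrix; mask: 1 where a rating exists."""
    n_papers, n_reviewers = ratings.shape
    q = np.nanmean(np.where(mask, ratings, np.nan), axis=1)  # init: per-paper mean rating
    a = np.ones(n_reviewers)
    b = np.zeros(n_reviewers)
    for _ in range(n_iters):
        # Fit each reviewer's linear scoring function given the current qualities.
        for j in range(n_reviewers):
            idx = mask[:, j].astype(bool)
            X = np.column_stack([q[idx], np.ones(idx.sum())])
            a[j], b[j] = np.linalg.lstsq(X, ratings[idx, j], rcond=None)[0]
        # Re-estimate paper qualities given the fitted reviewer functions.
        for i in range(n_papers):
            idx = mask[i, :].astype(bool)
            q[i] = np.sum(a[idx] * (ratings[i, idx] - b[idx])) / (np.sum(a[idx] ** 2) + 1e-12)
        # Fix scale/shift, since (q, a, b) is only identifiable up to an affine map.
        q = (q - q.mean()) / (q.std() + 1e-8)
    return q, a, b

# Toy usage: 5 papers, 3 miscalibrated reviewers, each rating a subset of papers.
rng = np.random.default_rng(0)
true_q = rng.normal(size=5)
a_true, b_true = np.array([1.0, 2.0, 0.5]), np.array([0.0, 1.0, -0.5])
ratings = true_q[:, None] * a_true[None, :] + b_true[None, :]
mask = (rng.random((5, 3)) < 0.8).astype(float)
mask[:, 0] = 1.0  # ensure every paper has at least one rating
q_hat, _, _ = lsc_calibrate(ratings * mask, mask)
top_papers = np.argsort(-q_hat)  # select top candidates by calibrated quality
```

In this toy setting the recovered ranking should coincide with the ranking by true quality, whereas ranking by raw average rating can be distorted by which (lenient or harsh) reviewers happened to rate each paper.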
Related papers
- Calibration-Disentangled Learning and Relevance-Prioritized Reranking for Calibrated Sequential Recommendation [18.913912876509187]
Calibrated recommendation aims to maintain personalized proportions of categories within recommendations.
Previous methods typically leverage reranking algorithms to calibrate recommendations after training a model.
We propose LeapRec, a novel approach to calibrated sequential recommendation.
arXiv Detail & Related papers (2024-08-04T22:23:09Z) - Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z) - Estimating Classification Confidence Using Kernel Densities [0.0]
This paper investigates the post-hoc calibration of confidence for "exploratory" machine learning classification problems.
We introduce and test four new algorithms designed to handle the idiosyncrasies of category-specific confidence estimation.
arXiv Detail & Related papers (2022-07-13T21:57:44Z) - Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems [66.61691401921296]
This paper investigates several methods of score calibration for deep speaker embedding extractors.
An additional focus of this research is to estimate the impact of score normalization on the calibration performance of the system.
arXiv Detail & Related papers (2022-03-28T21:22:22Z) - Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z) - Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning [8.780958735684958]
Post-hoc multi-class calibration is a common approach for providing confidence estimates of deep neural network predictions.
Recent work has shown that widely used scaling methods underestimate their calibration error.
We propose a shared class-wise (sCW) calibration strategy, sharing one calibrator among similar classes.
arXiv Detail & Related papers (2020-06-23T15:31:59Z) - Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z) - Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning [21.08664370117846]
We show how Mix-n-Match calibration strategies can help achieve remarkably better data-efficiency and expressive power.
We also reveal potential issues in standard evaluation practices.
Our approaches outperform state-of-the-art solutions on both the calibration as well as the evaluation tasks.
arXiv Detail & Related papers (2020-03-16T17:00:35Z) - Better Classifier Calibration for Small Data Sets [0.0]
We show how generating more data for calibration can improve calibration algorithm performance.
The proposed approach adds computational cost, but since the main use case involves small data sets, this extra cost remains insignificant.
arXiv Detail & Related papers (2020-02-24T12:27:21Z)