Debiasing Evaluations That are Biased by Evaluations
- URL: http://arxiv.org/abs/2012.00714v1
- Date: Tue, 1 Dec 2020 18:20:43 GMT
- Title: Debiasing Evaluations That are Biased by Evaluations
- Authors: Jingyan Wang, Ivan Stelmakh, Yuting Wei, Nihar B. Shah
- Abstract summary: We consider the problem of mitigating outcome-induced biases in ratings when some information about the outcome is available.
We propose a debiasing method by solving a regularized optimization problem under this ordering constraint.
We also provide a carefully designed cross-validation method that adaptively chooses the appropriate amount of regularization.
- Score: 32.135315382120154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is common to evaluate a set of items by soliciting people to rate them.
For example, universities ask students to rate the teaching quality of their
instructors, and conference organizers ask authors of submissions to evaluate
the quality of the reviews. However, in these applications, students often give
a higher rating to a course if they receive higher grades in that course, and
authors often give a higher rating to the reviews if their papers are accepted
to the conference. In this work, we call these external factors the "outcome"
experienced by people, and consider the problem of mitigating these
outcome-induced biases in the given ratings when some information about the
outcome is available. We formulate the information about the outcome as a known
partial ordering on the bias. We propose a debiasing method by solving a
regularized optimization problem under this ordering constraint, and also
provide a carefully designed cross-validation method that adaptively chooses
the appropriate amount of regularization. We provide theoretical guarantees on
the performance of our algorithm, as well as experimental evaluations.
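The debiasing approach described in the abstract can be sketched in code. The following is a minimal illustration under simplifying assumptions, not the authors' actual estimator: it assumes a single item of true quality x, a total (rather than partial) ordering of biases by outcome, a fixed regularization weight `lam` instead of the paper's adaptive cross-validation, and hypothetical function names.

```python
import numpy as np

def pava_nondecreasing(r):
    """Pool Adjacent Violators: L2 projection of r onto nondecreasing sequences."""
    vals, counts = [], []
    for v in r:
        vals.append(float(v))
        counts.append(1)
        # Merge blocks whenever the monotonicity constraint is violated.
        while len(vals) > 1 and vals[-2] > vals[-1]:
            total = vals[-1] * counts[-1] + vals[-2] * counts[-2]
            counts[-2] += counts[-1]
            vals[-2] = total / counts[-2]
            vals.pop()
            counts.pop()
    out = []
    for v, c in zip(vals, counts):
        out.extend([v] * c)
    return np.array(out)

def debias(ratings, outcomes, lam=0.5, iters=50):
    """Alternating minimization of ||y - x - b||^2 + lam * ||b||^2
    subject to the bias b being nondecreasing in the outcome order."""
    order = np.argsort(outcomes)
    y = np.asarray(ratings, dtype=float)[order]
    x = y.mean()
    for _ in range(iters):
        # With x fixed, the ridge-penalized isotonic step shrinks the
        # residuals by 1/(1 + lam) and projects onto the monotone cone.
        b = pava_nondecreasing((y - x) / (1.0 + lam))
        # With b fixed, the least-squares quality estimate is the mean residual.
        x = (y - b).mean()
    return x, b

ratings = [2.0, 2.5, 3.0, 3.5, 4.0]   # observed ratings of one item
grades  = [0, 1, 2, 3, 4]             # outcomes: higher grade -> bias at least as large
x_hat, b_hat = debias(ratings, grades, lam=0.5)
```

Here `x_hat` is the debiased quality estimate and `b_hat` the estimated per-rater biases, ordered by outcome; in this toy setup the biases average out and `x_hat` recovers the rating mean while `b_hat` absorbs the outcome-aligned spread.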
Related papers
- Analytical and Empirical Study of Herding Effects in Recommendation Systems [72.6693986712978]
We study how to manage product ratings via rating aggregation rules and shortlisted representative reviews.
We show that recency-aware rating aggregation rules can improve the speed of convergence on Amazon and TripAdvisor.
arXiv Detail & Related papers (2024-08-20T14:29:23Z)
- Fighting Sampling Bias: A Framework for Training and Evaluating Credit Scoring Models [2.918530881730374]
This paper addresses the adverse effect of sampling bias on model training and evaluation.
We propose bias-aware self-learning and a reject inference framework for scorecard evaluation.
Our results suggest a profit improvement of about eight percent when using Bayesian evaluation to decide on acceptance rates.
arXiv Detail & Related papers (2024-07-17T20:59:54Z)
- Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction [56.17020601803071]
Recent research shows that pre-trained language models (PLMs) suffer from "prompt bias" in factual knowledge extraction.
This paper aims to improve the reliability of existing benchmarks by thoroughly investigating and mitigating prompt bias.
arXiv Detail & Related papers (2024-03-15T02:04:35Z)
- A Dataset on Malicious Paper Bidding in Peer Review [84.68308372858755]
Malicious reviewers strategically bid in order to unethically manipulate the paper assignment.
A critical impediment towards creating and evaluating methods to mitigate this issue is the lack of publicly-available data on malicious paper bidding.
We release a novel dataset, collected from a mock conference activity where participants were instructed to bid either honestly or maliciously.
arXiv Detail & Related papers (2022-06-24T20:23:33Z)
- Integrating Rankings into Quantized Scores in Peer Review [61.27794774537103]
In peer review, reviewers are usually asked to provide scores for the papers, but such scores are often miscalibrated across reviewers.
To mitigate this issue, conferences have started to ask reviewers to additionally provide a ranking of the papers they have reviewed.
There is no standard procedure for using this ranking information, and Area Chairs may use it in different ways.
We take a principled approach to integrate the ranking information into the scores.
arXiv Detail & Related papers (2022-04-05T19:39:13Z)
- Correcting the User Feedback-Loop Bias for Recommendation Systems [34.44834423714441]
We propose a systematic and dynamic way to correct user feedback-loop bias in recommendation systems.
Our method includes a deep-learning component to learn each user's dynamic rating history embedding.
We empirically validated the existence of such user feedback-loop bias in real-world recommendation systems.
arXiv Detail & Related papers (2021-09-13T15:02:55Z)
- Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast the decision process as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
arXiv Detail & Related papers (2021-09-02T19:41:47Z)
- Prior and Prejudice: The Novice Reviewers' Bias against Resubmissions in Conference Peer Review [35.24369486197371]
Modern machine learning and computer science conferences are experiencing a surge in the number of submissions that challenges the quality of peer review.
Several conferences have started encouraging or even requiring authors to declare the previous submission history of their papers.
We investigate whether reviewers exhibit a bias caused by the knowledge that the submission under review was previously rejected at a similar venue.
arXiv Detail & Related papers (2020-11-30T09:35:37Z)
- Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments [96.114824979298]
Among the important challenges in conference peer review are reviewers maliciously attempting to get assigned to certain papers and "torpedo reviewing".
We present a framework that brings all these challenges under a common umbrella and present a (randomized) algorithm for reviewer assignment.
Our algorithms can limit the chance that any malicious reviewer gets assigned to their desired paper to 50% while producing assignments with over 90% of the total optimal similarity.
arXiv Detail & Related papers (2020-06-29T23:55:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.