The Price of Strategyproofing Peer Assessment
- URL: http://arxiv.org/abs/2201.10631v1
- Date: Tue, 25 Jan 2022 21:16:33 GMT
- Title: The Price of Strategyproofing Peer Assessment
- Authors: Komal Dhull, Steven Jecmen, Pravesh Kothari, Nihar B. Shah
- Abstract summary: Strategic behavior is a fundamental problem in a variety of real-world applications that require some form of peer assessment.
Since an individual's own work is in competition with the submissions they are evaluating, they may provide dishonest evaluations to increase the relative standing of their own submission.
This issue is typically addressed by partitioning the individuals and assigning them to evaluate the work of only those from different subsets.
- Score: 30.51994705981846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Strategic behavior is a fundamental problem in a variety of real-world
applications that require some form of peer assessment, such as peer grading of
assignments, grant proposal review, conference peer review, and peer assessment
of employees. Since an individual's own work is in competition with the
submissions they are evaluating, they may provide dishonest evaluations to
increase the relative standing of their own submission. This issue is typically
addressed by partitioning the individuals and assigning them to evaluate the
work of only those from different subsets. Although this method ensures
strategyproofness, each submission may require a different type of expertise
for effective evaluation. In this paper, we focus on finding an assignment of
evaluators to submissions that maximizes assigned expertise subject to the
constraint of strategyproofness. We analyze the price of strategyproofness:
that is, the compromise in assignment quality required to achieve
strategyproofness. We develop several polynomial-time algorithms for
strategyproof assignment, along with assignment-quality guarantees. Finally, we
evaluate the methods on a dataset from conference peer review.
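The partition-based construction and the quality trade-off described above lend themselves to a small illustration. The sketch below is a toy example, not the paper's algorithm: the synthetic similarity matrix, the one-review-per-paper load, the fixed two-way split of participants, and the use of scipy.optimize.linear_sum_assignment are all assumptions made for illustration. The ratio it prints is one simple reading of the price of strategyproofness.

```python
# Toy sketch (not the paper's method): compare the best unconstrained
# reviewer-paper assignment with the best assignment that respects a fixed
# two-way partition of the participants, the construction the abstract
# describes as the standard route to strategyproofness.
# Assumptions: one paper per person, one review per paper, synthetic similarities.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n = 6                                   # people; person i authored paper i
S = rng.random((n, n))                  # S[r, p]: expertise of reviewer r for paper p

def best_assignment(allowed):
    """Total similarity of the best one-to-one assignment using only allowed pairs."""
    score = np.where(allowed, S, -1e6)  # large penalty effectively forbids a pair
    rows, cols = linear_sum_assignment(score, maximize=True)
    return S[rows, cols].sum()

# Unconstrained optimum: anyone may review any paper except their own.
no_self = ~np.eye(n, dtype=bool)
opt_free = best_assignment(no_self)

# Partition-constrained optimum: split people into two groups; each group
# only reviews the other group's papers.
group_a = np.arange(n) < n // 2
cross_group = group_a[:, None] != group_a[None, :]
opt_partitioned = best_assignment(cross_group)

# One toy reading of the "price of strategyproofness": the quality ratio.
print(f"unconstrained similarity : {opt_free:.3f}")
print(f"partitioned similarity   : {opt_partitioned:.3f}")
print(f"quality retained         : {opt_partitioned / opt_free:.3f}")
```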
Related papers
- Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition [70.60872754129832]
The first NeurIPS competition on unlearning sought to stimulate the development of novel algorithms.
Nearly 1,200 teams from across the world participated.
We analyze top solutions and delve into discussions on benchmarking unlearning.
arXiv Detail & Related papers (2024-06-13T12:58:00Z)
- ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate [57.71597869337909]
We build a multi-agent referee team called ChatEval to autonomously discuss and evaluate the quality of generated responses from different models.
Our analysis shows that ChatEval transcends mere textual scoring, offering a human-mimicking evaluation process for reliable assessments.
arXiv Detail & Related papers (2023-08-14T15:13:04Z)
- Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems [82.92678837778358]
Preference-based methods have demonstrated substantial success in empirical applications such as InstructGPT.
We show how human bias and uncertainty in feedback modeling can affect the theoretical guarantees of these approaches.
arXiv Detail & Related papers (2023-07-24T17:50:24Z)
- A Dataset on Malicious Paper Bidding in Peer Review [84.68308372858755]
Malicious reviewers strategically bid in order to unethically manipulate the paper assignment.
A critical impediment towards creating and evaluating methods to mitigate this issue is the lack of publicly-available data on malicious paper bidding.
We release a novel dataset, collected from a mock conference activity where participants were instructed to bid either honestly or maliciously.
arXiv Detail & Related papers (2022-06-24T20:23:33Z)
- Evaluating Feature Attribution: An Information-Theoretic Perspective [21.101718565039015]
We present an information-theoretic analysis of evaluation strategies based on pixel perturbations.
Our findings reveal that the results output by different evaluation strategies are strongly affected by information leakage through the shape of the removed pixels.
We propose a novel evaluation framework termed Remove and Debias (ROAD) which offers two contributions.
arXiv Detail & Related papers (2022-02-01T15:00:26Z)
- Improving Peer Assessment with Graph Convolutional Networks [2.105564340986074]
Peer assessments might not be as accurate as expert evaluations, which can render peer-assessment systems unreliable.
We first model peer assessment as multi-relational weighted networks that can express a variety of peer assessment setups.
We introduce a graph convolutional network which can learn assessment patterns and user behaviors to more accurately predict expert evaluations.
arXiv Detail & Related papers (2021-11-04T03:43:09Z)
- Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment [61.24399136715106]
We consider the issue of strategic behaviour in various peer-assessment tasks, including peer grading of exams or homework and peer review in hiring or promotions.
Our focus is on designing methods for detection of such manipulations.
Specifically, we consider a setting in which agents evaluate a subset of their peers and output rankings that are later aggregated to form a final ordering.
arXiv Detail & Related papers (2020-10-08T15:08:40Z)
- Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments [96.114824979298]
Important challenges in conference peer review include reviewers maliciously attempting to get assigned to certain papers and "torpedo reviewing".
We present a framework that brings these challenges under a common umbrella, together with a (randomized) algorithm for reviewer assignment.
Our algorithms can limit the chance that any malicious reviewer gets assigned to their desired paper to 50% while producing assignments with over 90% of the total optimal similarity; a toy sketch of such a probability-capped assignment appears after this list.
arXiv Detail & Related papers (2020-06-29T23:55:53Z)
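The probability-capped assignment idea in the last entry can be sketched on toy data. The linear program below is an assumption-laden illustration, not the cited paper's implementation: it caps every reviewer-paper assignment probability at 0.5, maximizes expected similarity with scipy.optimize.linprog on synthetic similarities, and omits the step that samples an actual deterministic assignment from the fractional solution.

```python
# Minimal sketch (assumptions, not the cited paper's implementation): cap the
# marginal probability of every reviewer-paper pair at 0.5 and maximize expected
# assigned similarity with a linear program over synthetic data.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n_rev, n_pap = 4, 4
S = rng.random((n_rev, n_pap))        # S[r, p]: similarity of reviewer r and paper p
cap = 0.5                             # max probability any single pair is assigned

# Variables x[r, p] = Pr(reviewer r reviews paper p), flattened row-major.
c = -S.ravel()                        # linprog minimizes, so negate similarity

# Each paper gets exactly one expected review; each reviewer gives exactly one.
A_eq, b_eq = [], []
for p in range(n_pap):                # column sums equal 1
    row = np.zeros(n_rev * n_pap)
    row[p::n_pap] = 1.0
    A_eq.append(row); b_eq.append(1.0)
for r in range(n_rev):                # row sums equal 1
    row = np.zeros(n_rev * n_pap)
    row[r * n_pap:(r + 1) * n_pap] = 1.0
    A_eq.append(row); b_eq.append(1.0)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=[(0.0, cap)] * (n_rev * n_pap))
res_free = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                   bounds=[(0.0, 1.0)] * (n_rev * n_pap))

x = res.x.reshape(n_rev, n_pap)
print(f"largest pair probability     : {x.max():.2f}")
print(f"expected similarity with cap : {-res.fun:.3f}")
print(f"optimal similarity, no cap   : {-res_free.fun:.3f}")
print(f"fraction retained            : {res.fun / res_free.fun:.3f}")
```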