Modeling and Correcting Bias in Sequential Evaluation
- URL: http://arxiv.org/abs/2205.01607v3
- Date: Thu, 16 Nov 2023 19:44:01 GMT
- Title: Modeling and Correcting Bias in Sequential Evaluation
- Authors: Jingyan Wang and Ashwin Pananjady
- Abstract summary: We consider the problem of sequential evaluation, in which an evaluator observes candidates in a sequence and assigns scores to these candidates in an online, irrevocable fashion.
Motivated by the psychology literature that has studied sequential bias in such settings, we propose a natural model for the evaluator's rating process.
We conduct crowdsourcing experiments to demonstrate various facets of our model.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of sequential evaluation, in which an evaluator
observes candidates in a sequence and assigns scores to these candidates in an
online, irrevocable fashion. Motivated by the psychology literature that has
studied sequential bias in such settings -- namely, dependencies between the
evaluation outcome and the order in which the candidates appear -- we propose a
natural model for the evaluator's rating process that captures the lack of
calibration inherent to such a task. We conduct crowdsourcing experiments to
demonstrate various facets of our model. We then proceed to study how to
correct sequential bias under our model by posing this as a statistical
inference problem. We propose a near-linear time, online algorithm for this
task and prove guarantees in terms of two canonical ranking metrics. We also
prove that our algorithm is information theoretically optimal, by establishing
matching lower bounds in both metrics. Finally, we perform a host of numerical
experiments to show that our algorithm often outperforms the de facto method of
using the rankings induced by the reported scores, both in simulation and on
the crowdsourcing data that we collected.
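The de facto baseline mentioned above, ranking candidates by their reported scores, can be sketched in a few lines, together with one canonical ranking metric, Kendall tau distance (the candidate scores and ground-truth order below are purely illustrative; this is a sketch of the baseline being compared against, not the paper's correction algorithm):

```python
def rank_by_scores(scores):
    """De facto baseline: rank candidates by reported scores, best first."""
    # Sort indices by descending score; ties broken by arrival order.
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))

def kendall_tau_distance(perm_a, perm_b):
    """Number of discordant pairs between two rankings of the same items."""
    pos_b = {item: r for r, item in enumerate(perm_b)}
    discordant = 0
    n = len(perm_a)
    for i in range(n):
        for j in range(i + 1, n):
            # The pair is discordant iff perm_b reverses its order in perm_a.
            if pos_b[perm_a[i]] > pos_b[perm_a[j]]:
                discordant += 1
    return discordant

reported = [3.1, 4.5, 2.2, 4.4]   # hypothetical scores assigned online
true_order = [1, 3, 0, 2]         # hypothetical ground-truth ranking
estimate = rank_by_scores(reported)
print(estimate)                                     # [1, 3, 0, 2]
print(kendall_tau_distance(estimate, true_order))   # 0
```

Any correction method is then judged by how much closer its output ranking is to the ground truth than this baseline under such a metric.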
Related papers
- Bipartite Ranking Fairness through a Model Agnostic Ordering Adjustment [54.179859639868646]
We propose a model agnostic post-processing framework xOrder for achieving fairness in bipartite ranking.
xOrder is compatible with various classification models and ranking fairness metrics, including supervised and unsupervised fairness metrics.
We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories.
arXiv Detail & Related papers (2023-07-27T07:42:44Z)
- In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria.
We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z)
- Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distribution.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z)
- Online Statistical Inference for Matrix Contextual Bandit [3.465827582464433]
Contextual bandit has been widely used for sequential decision-making based on contextual information and historical feedback data.
We introduce a new online doubly-debiasing inference procedure to simultaneously handle both sources of bias.
Our inference results are built upon a newly developed low-rank gradient descent estimator and its non-asymptotic convergence result.
arXiv Detail & Related papers (2022-12-21T22:03:06Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- Enhancing Counterfactual Classification via Self-Training [9.484178349784264]
We propose a self-training algorithm which imputes outcomes with categorical values for finite unseen actions in observational data to simulate a randomized trial through pseudolabeling.
We demonstrate the effectiveness of the proposed algorithms on both synthetic and real datasets.
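The pseudolabeling idea described above can be illustrated with a toy self-training loop. Everything here is a hypothetical stand-in (the nearest-mean "model", the confidence threshold, the 1-D data), not the paper's estimator; the point is only the mechanism of labeling confident unlabeled points and retraining:

```python
def fit(data):
    """Toy model: class-conditional means on a 1-D feature."""
    m0 = [x for x, y in data if y == 0]
    m1 = [x for x, y in data if y == 1]
    return (sum(m0) / len(m0), sum(m1) / len(m1))

def predict_proba(model, x):
    """Probability of class 1: closer to the class-1 mean means higher."""
    m0, m1 = model
    d0, d1 = abs(x - m0), abs(x - m1)
    return d0 / (d0 + d1) if d0 + d1 else 0.5

def self_train(labeled, unlabeled, threshold=0.9, rounds=3):
    """Impute labels for confident unlabeled points, then retrain."""
    data = list(labeled)
    for _ in range(rounds):
        model = fit(data)
        still_unlabeled = []
        for x in unlabeled:
            p = predict_proba(model, x)
            if p >= threshold:
                data.append((x, 1))       # confident pseudolabel
            elif p <= 1 - threshold:
                data.append((x, 0))
            else:
                still_unlabeled.append(x)  # defer uncertain points
        unlabeled = still_unlabeled
    return fit(data)

labeled = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)]
unlabeled = [0.5, 9.5, 5.0]
model = self_train(labeled, unlabeled)
```

In the paper's setting the imputed quantities are outcomes of unseen actions in observational data rather than class labels, but the loop structure is the same.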
arXiv Detail & Related papers (2021-12-08T18:42:58Z)
- Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics [0.0]
We propose a data-driven scoring function defined on the feature space which reflects the degree of abnormality of the observations.
This scoring function is learnt through a well-designed binary classification problem.
We illustrate our methodology with preliminary encouraging numerical experiments.
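One standard way to obtain a scoring function from a binary classification problem, sketched here under the assumption of a contrast-with-background construction (discriminate the data from uniform background samples and use the predicted class probability as the anomaly score; this is the general idea, not necessarily the paper's exact design):

```python
import math
import random

def fit_logistic_1d(xs, ys, lr=0.1, steps=2000):
    """Tiny 1-D logistic regression trained by batch gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += (p - y)
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

random.seed(0)
# "Normal" observations cluster near 0; contrast samples are uniform on a
# wide interval, standing in for the background reference measure.
data = [random.gauss(0.0, 1.0) for _ in range(200)]
contrast = [random.uniform(-10.0, 10.0) for _ in range(200)]
xs = [abs(x) for x in data + contrast]      # feature: distance from origin
ys = [0] * len(data) + [1] * len(contrast)  # 1 = contrast ("abnormal-like")
w, b = fit_logistic_1d(xs, ys)

def anomaly_score(x):
    """Higher score = more abnormal under the learned classifier."""
    return 1.0 / (1.0 + math.exp(-(w * abs(x) + b)))
```

Points far from the bulk of the data then receive higher scores than typical points, which is exactly the ranking behavior the scoring function is meant to induce.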
arXiv Detail & Related papers (2021-09-20T14:45:56Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
- Evaluating Text Coherence at Sentence and Paragraph Levels [17.99797111176988]
We investigate the adaptation of existing sentence ordering methods to a paragraph ordering task.
We also compare the learnability and robustness of existing models by artificially creating mini datasets and noisy datasets.
We conclude that the recurrent graph neural network-based model is an optimal choice for coherence modeling.
arXiv Detail & Related papers (2020-06-05T03:31:49Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.