Leveraging Clickstream Trajectories to Reveal Low-Quality Workers in
Crowdsourced Forecasting Platforms
- URL: http://arxiv.org/abs/2009.01966v1
- Date: Fri, 4 Sep 2020 00:26:38 GMT
- Title: Leveraging Clickstream Trajectories to Reveal Low-Quality Workers in
Crowdsourced Forecasting Platforms
- Authors: Akira Matsui, Emilio Ferrara, Fred Morstatter, Andres Abeliuk, Aram
Galstyan
- Abstract summary: We propose the use of a computational framework to identify clusters of underperforming workers using clickstream trajectories.
The framework can reveal different types of underperformers, such as workers with forecasts whose accuracy is far from the consensus of the crowd.
Our study suggests that clickstream clustering and analysis are fundamental tools to diagnose the performance of crowdworkers in platforms leveraging the wisdom of crowds.
- Score: 22.995941896769843
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowdwork often entails tackling cognitively-demanding and time-consuming
tasks. Crowdsourcing can be used for complex annotation tasks, from medical
imaging to geospatial data, and such data powers sensitive applications, such
as health diagnostics or autonomous driving. However, the existence and
prevalence of underperforming crowdworkers is well-recognized, and can pose a
threat to the validity of crowdsourcing. In this study, we propose the use of a
computational framework to identify clusters of underperforming workers using
clickstream trajectories. We focus on crowdsourced geopolitical forecasting.
The framework can reveal different types of underperformers, such as workers
with forecasts whose accuracy is far from the consensus of the crowd, those who
provide low-quality explanations for their forecasts, and those who simply
copy-paste their forecasts from other users. Our study suggests that
clickstream clustering and analysis are fundamental tools to diagnose the
performance of crowdworkers in platforms leveraging the wisdom of crowds.
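For illustration only (this code is not from the paper): a minimal sketch of clickstream-trajectory clustering under assumed inputs. It treats each worker's trajectory as a sequence of UI event labels, summarizes it by event-bigram counts, and clusters workers with k-means; the event names, feature choice, and cluster count are all hypothetical placeholders, not the authors' framework.

```python
# Minimal sketch (illustrative, not the paper's implementation):
# cluster workers by their clickstream trajectories.
from collections import Counter
from sklearn.feature_extraction import DictVectorizer
from sklearn.preprocessing import normalize
from sklearn.cluster import KMeans

# Hypothetical clickstreams: one sequence of UI events per worker.
clickstreams = {
    "worker_1": ["view_question", "read_evidence", "edit_forecast", "write_rationale", "submit"],
    "worker_2": ["view_question", "edit_forecast", "submit"],
    "worker_3": ["view_question", "copy_text", "paste_text", "submit"],
    "worker_4": ["view_question", "read_evidence", "edit_forecast", "submit"],
}

def trajectory_features(events):
    # Summarize a trajectory by its event-bigram counts (order-sensitive).
    return Counter(f"{a}->{b}" for a, b in zip(events, events[1:]))

workers = sorted(clickstreams)
X = DictVectorizer(sparse=False).fit_transform(
    [trajectory_features(clickstreams[w]) for w in workers]
)
X = normalize(X)  # compare behavioral profiles rather than raw activity volume

# Cluster the behavioral profiles; resulting clusters can then be inspected
# against forecast accuracy and rationale quality to flag underperformer types
# (e.g., copy-paste-heavy profiles).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(workers, labels)))
```

Clusters produced this way would still need to be validated against forecast accuracy and explanation quality before labeling any group as underperforming.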
Related papers
- Data Quality in Crowdsourcing and Spamming Behavior Detection [2.6481162211614118]
We introduce a systematic method for evaluating data quality and detecting spamming threats via variance decomposition.
A spammer index is proposed to assess overall data consistency, and two metrics are developed to measure crowd workers' credibility.
arXiv Detail & Related papers (2024-04-04T02:21:38Z)
- Adaptive Crowdsourcing Via Self-Supervised Learning [20.393114559367202]
Common crowdsourcing systems average estimates of a latent quantity of interest provided by many crowdworkers to produce a group estimate.
We develop a new approach -- predict-each-worker -- that leverages self-supervised learning and a novel aggregation scheme.
arXiv Detail & Related papers (2024-01-24T05:57:36Z)
- Decentralized Adversarial Training over Graphs [55.28669771020857]
The vulnerability of machine learning models to adversarial attacks has been attracting considerable attention in recent years.
This work studies adversarial training over graphs, where individual agents are subjected to perturbations of varying strength.
arXiv Detail & Related papers (2023-03-23T15:05:16Z)
- Mitigating Observation Biases in Crowdsourced Label Aggregation [19.460509608096217]
One of the technical challenges in obtaining high-quality results from crowdsourcing is dealing with the variability and bias caused by the fact that humans execute the work.
In this study, we focus on the observation bias in crowdsourcing.
Variations in the frequency of worker responses and the complexity of tasks occur, which may affect the aggregation results.
arXiv Detail & Related papers (2023-02-25T15:19:13Z)
- Learning from Heterogeneous Data Based on Social Interactions over Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the proposed strategy enables the agents to learn consistently under this highly heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z)
- Crowdsourcing with Meta-Workers: A New Way to Save the Budget [50.04836252733443]
We introduce the concept of a meta-worker, a machine annotator trained by meta learning for types of tasks that are well-suited to AI.
Unlike regular crowd workers, meta-workers can be reliable, stable, and more importantly, tireless and free.
arXiv Detail & Related papers (2021-11-07T12:40:29Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and the agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
- Detecting adversaries in Crowdsourcing [71.20185379303479]
This work investigates the effects of adversaries on crowdsourced classification, under the popular Dawid and Skene model.
The adversaries are allowed to deviate arbitrarily from the considered crowdsourcing model, and may potentially cooperate.
We develop an approach that leverages the structure of second-order moments of annotator responses, to identify large numbers of adversaries, and mitigate their impact on the crowdsourcing task.
arXiv Detail & Related papers (2021-10-07T15:07:07Z)
- Bayesian Semi-supervised Crowdsourcing [71.20185379303479]
Crowdsourcing has emerged as a powerful paradigm for efficiently labeling large datasets and performing various learning tasks.
This work deals with semi-supervised crowdsourced classification, under two regimes of semi-supervision.
arXiv Detail & Related papers (2020-12-20T23:18:51Z)
- Variational Bayesian Inference for Crowdsourcing Predictions [6.878219199575748]
We develop a variational Bayesian technique for two different worker noise models.
Our evaluations on synthetic and real-world datasets demonstrate that these approaches perform significantly better than existing non-Bayesian approaches.
arXiv Detail & Related papers (2020-06-01T08:11:50Z)