Toward Annotator Group Bias in Crowdsourcing
- URL: http://arxiv.org/abs/2110.08038v1
- Date: Fri, 8 Oct 2021 05:54:36 GMT
- Title: Toward Annotator Group Bias in Crowdsourcing
- Authors: Haochen Liu, Joseph Thekinen, Sinem Mollaoglu, Da Tang, Ji Yang,
Youlong Cheng, Hui Liu, Jiliang Tang
- Abstract summary: We show that annotators within the same demographic group tend to show consistent group bias in annotation tasks.
We develop a novel probabilistic graphical framework GroupAnno to capture annotator group bias with a new extended Expectation Maximization (EM) training algorithm.
- Score: 26.754873038110595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowdsourcing has emerged as a popular approach for collecting annotated data
to train supervised machine learning models. However, annotator bias can lead
to defective annotations. Although a few works investigate individual annotator
bias, group effects among annotators are largely overlooked. In this
work, we reveal that annotators within the same demographic group tend to show
consistent group bias in annotation tasks and thus we conduct an initial study
on annotator group bias. We first empirically verify the existence of annotator
group bias in various real-world crowdsourcing datasets. Then, we develop a
novel probabilistic graphical framework GroupAnno to capture annotator group
bias with a new extended Expectation Maximization (EM) training algorithm. We
conduct experiments on both synthetic and real-world datasets. Experimental
results demonstrate the effectiveness of our model in modeling annotator group
bias in label aggregation and model learning over competitive baselines.
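To make the aggregation idea concrete, here is a minimal sketch of EM-based label aggregation with a single accuracy parameter shared by each annotator group. This is a simplified one-coin, Dawid-Skene-style illustration, not the GroupAnno model itself; all function and variable names are hypothetical.

```python
import numpy as np

def em_group_aggregate(labels, groups, n_iter=50):
    """Aggregate binary crowd labels with one accuracy parameter per annotator group.

    labels: (n_items, n_annotators) array with entries in {0, 1}, or -1 for missing.
    groups: (n_annotators,) array of group ids in 0..n_groups-1.
    Returns posterior P(true label = 1) per item and per-group accuracies.
    """
    n_items, n_annot = labels.shape
    n_groups = groups.max() + 1
    acc = np.full(n_groups, 0.7)        # initial per-group accuracy
    prior = 0.5                         # initial P(z = 1)
    q = np.full(n_items, 0.5)           # posterior P(z_i = 1)
    mask = labels >= 0                  # observed entries

    for _ in range(n_iter):
        # E-step: posterior over the true label of each item
        for i in range(n_items):
            js = np.where(mask[i])[0]
            a = acc[groups[js]]
            lik1 = prior * np.prod(np.where(labels[i, js] == 1, a, 1 - a))
            lik0 = (1 - prior) * np.prod(np.where(labels[i, js] == 0, a, 1 - a))
            q[i] = lik1 / (lik1 + lik0 + 1e-12)
        # M-step: re-estimate the prior and the group accuracies
        prior = q.mean()
        for g in range(n_groups):
            cols = np.where(groups == g)[0]
            m = mask[:, cols]
            agree1 = (labels[:, cols] == 1) * q[:, None]
            agree0 = (labels[:, cols] == 0) * (1 - q[:, None])
            acc[g] = ((agree1 + agree0) * m).sum() / max(m.sum(), 1)
    return q, acc
```

The abstract indicates that GroupAnno couples this kind of aggregation with model learning and richer group-bias parameters; the sketch above only covers the aggregation half, alternating posterior inference over true labels with re-estimation of group accuracies.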
Related papers
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
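The bias-experts summary above is brief, so the following sketch only illustrates the broader pattern such frameworks build on: an auxiliary bias classifier whose predictions are combined with the main model in a product-of-experts fashion, so that examples the shortcut already solves contribute less gradient. The combination rule and names are assumptions for illustration, not necessarily this paper's exact formulation.

```python
import torch.nn.functional as F

def poe_debias_loss(main_logits, bias_logits, targets):
    """Product-of-experts debiasing loss.

    bias_logits come from a (frozen) auxiliary bias model; detaching them means
    only the main model is trained, and it is discouraged from relying on
    whatever shortcut the bias model captures.
    """
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(bias_logits.detach(), dim=-1)
    return F.cross_entropy(combined, targets)
```

In use, `main_logits` and `bias_logits` are computed per batch from the main and auxiliary models, and only the main model's parameters are updated with this loss.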
- IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models [52.03761198830643]
We propose IBADR, an Iterative Bias-Aware dataset Refinement framework.
We first train a shallow model to quantify the bias degree of samples in the pool.
Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator.
In this way, the generator can effectively learn the correspondence between bias indicators and samples.
arXiv Detail & Related papers (2023-11-01T04:50:38Z)
- Bias Propagation in Federated Learning [22.954608704251118]
We show that the bias of a few parties against under-represented groups can propagate through the network to all the parties in the network.
We analyze and explain bias propagation in federated learning on naturally partitioned real-world datasets.
arXiv Detail & Related papers (2023-09-05T11:55:03Z)
- Echoes: Unsupervised Debiasing via Pseudo-bias Labeling in an Echo Chamber [17.034228910493056]
This paper presents experimental analyses revealing that the existing biased models overfit to bias-conflicting samples in the training data.
We propose a straightforward and effective method called Echoes, which trains a biased model and a target model with a different strategy.
Our approach achieves superior debiasing results compared to the existing baselines on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-05-06T13:13:18Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
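The projection step described above is plain linear algebra and easy to sketch. The snippet assumes a matrix of bias direction vectors is already available (e.g., differences between embeddings of spurious prompt pairs); the calibration of the projection matrix mentioned in the summary is omitted, and the names are illustrative.

```python
import numpy as np

def project_out(embeddings, bias_directions):
    """Remove the span of bias_directions from each embedding.

    embeddings: (n, d) array of text embeddings.
    bias_directions: (k, d) array whose rows span the biased subspace.
    """
    q, _ = np.linalg.qr(bias_directions.T)            # orthonormal basis of the biased subspace, shape (d, k)
    proj = np.eye(embeddings.shape[1]) - q @ q.T       # projector onto the orthogonal complement
    return embeddings @ proj
```

Classifiers built on the projected embeddings can no longer use the removed directions, which is the intuition behind debiasing only the text embedding.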
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups into a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning Debiased Models with Dynamic Gradient Alignment and Bias-conflicting Sample Mining [39.00256193731365]
Deep neural networks notoriously suffer from dataset biases which are detrimental to model robustness, generalization and fairness.
We propose a two-stage debiasing scheme to combat intractable, unknown biases.
arXiv Detail & Related papers (2021-11-25T14:50:10Z)
- Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
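For the predicate-argument augmentation above, the core operation is input rewriting: serialize each sentence's predicate-argument structures and append them to the sentence before training. The sketch below assumes the structures come from a semantic role labeler as (predicate, arguments) tuples; the separator token and serialization format are assumptions, not the paper's exact scheme.

```python
def augment_with_pas(sentence, pas_list, sep=" [PAS] "):
    """Append predicate-argument structures to a training sentence.

    sentence: the original input string.
    pas_list: list of (predicate, [argument, ...]) tuples, e.g. from an SRL model.
    """
    if not pas_list:
        return sentence
    parts = [f"{pred}: {', '.join(args)}" for pred, args in pas_list]
    return sentence + sep + " | ".join(parts)

# Hypothetical example:
# augment_with_pas("The chef cooked the meal.", [("cooked", ["The chef", "the meal"])])
# -> "The chef cooked the meal. [PAS] cooked: The chef, the meal"
```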
- LOGAN: Local Group Bias Detection by Clustering [86.38331353310114]
We argue that evaluating bias at the corpus level is not enough for understanding how biases are embedded in a model.
We propose LOGAN, a new bias detection technique based on clustering.
Experiments on toxicity classification and object classification tasks show that LOGAN identifies bias in a local region.
arXiv Detail & Related papers (2020-10-06T16:42:51Z)
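The clustering idea behind LOGAN can be illustrated generically: cluster examples in a representation space, then compare a per-group metric within each cluster rather than over the whole corpus, so locally concentrated bias becomes visible. The snippet below uses scikit-learn's KMeans and a simple accuracy gap between two groups; it is a generic sketch of local bias detection, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def local_accuracy_gaps(features, preds, labels, groups, n_clusters=10, seed=0):
    """Per-cluster accuracy gap between demographic groups 0 and 1."""
    clusters = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(features)
    correct = (preds == labels).astype(float)
    gaps = {}
    for c in range(n_clusters):
        in_c = clusters == c
        g0, g1 = in_c & (groups == 0), in_c & (groups == 1)
        if g0.any() and g1.any():
            gaps[c] = correct[g0].mean() - correct[g1].mean()
    return gaps  # a large |gap| in some cluster signals a local bias region
```

Examining the gap cluster by cluster, rather than averaging over the whole dataset, is what exposes bias confined to a local region.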