Crowdsourcing Learning as Domain Adaptation: A Case Study on Named
Entity Recognition
- URL: http://arxiv.org/abs/2105.14980v1
- Date: Mon, 31 May 2021 14:11:08 GMT
- Title: Crowdsourcing Learning as Domain Adaptation: A Case Study on Named
Entity Recognition
- Authors: Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Pengjun Xie
- Abstract summary: We take a different view in this work, regarding all crowdsourced annotations as gold-standard with respect to the individual annotators.
We find that crowdsourcing can be highly similar to domain adaptation, and thus recent advances in cross-domain methods can be applied to crowdsourcing almost directly.
We investigate both unsupervised and supervised crowdsourcing learning, assuming that no or only small-scale expert annotations are available.
- Score: 19.379850806513232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowdsourcing is regarded as a prospective solution for effective
supervised learning, aiming to build large-scale annotated training data with
crowd workers. Previous studies focus on reducing the influence of noise in
crowdsourced annotations on supervised models. We take a different view in
this work, regarding all crowdsourced annotations as gold-standard with
respect to the individual annotators. In this way, we find that crowdsourcing
can be highly similar to domain adaptation, and thus recent advances in
cross-domain methods can be applied to crowdsourcing almost directly. Here we
take named entity recognition (NER) as a case study, proposing an
annotator-aware representation learning model inspired by domain adaptation
methods that attempt to capture effective domain-aware features. We
investigate both unsupervised and supervised crowdsourcing learning, assuming
that no or only small-scale expert annotations are available. Experimental
results on a benchmark crowdsourced NER dataset show that our method is highly
effective, leading to new state-of-the-art performance. In addition, under the
supervised setting, we achieve impressive performance gains with only a very
small number of expert annotations.
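The abstract does not spell out the model architecture, so the following is only a minimal sketch of the core idea of annotator-aware representation learning, in which each crowd worker is treated as a separate "domain": an annotator embedding conditions a shared sequence encoder, and each crowdsourced sentence is trained against its own annotator's labels as if they were gold. The BiLSTM encoder, all layer sizes, and the concatenation-based conditioning are illustrative assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn

class AnnotatorAwareTagger(nn.Module):
    """Toy annotator-aware sequence tagger: the annotator id plays the role of a domain id."""
    def __init__(self, vocab_size, num_annotators, num_labels,
                 emb_dim=100, ann_dim=50, hidden_dim=200):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        # One vector per crowd worker; index 0 is reserved for "no annotator" at test time.
        self.ann_emb = nn.Embedding(num_annotators + 1, ann_dim)
        self.encoder = nn.LSTM(emb_dim + ann_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, token_ids, annotator_ids):
        # token_ids: (batch, seq_len); annotator_ids: (batch,)
        words = self.word_emb(token_ids)
        ann = self.ann_emb(annotator_ids).unsqueeze(1).expand(-1, words.size(1), -1)
        hidden, _ = self.encoder(torch.cat([words, ann], dim=-1))
        return self.classifier(hidden)  # per-token label scores

# Each crowdsourced sentence is trained on its own annotator's labels,
# i.e. those labels are treated as gold for that annotator ("domain").
model = AnnotatorAwareTagger(vocab_size=30000, num_annotators=47, num_labels=9)
tokens = torch.randint(0, 30000, (2, 12))
annotators = torch.tensor([3, 15])
logits = model(tokens, annotators)                      # (2, 12, 9)
labels = torch.randint(0, 9, (2, 12))
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 9), labels.reshape(-1))
```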
Related papers
- Diverse Deep Feature Ensemble Learning for Omni-Domain Generalized Person Re-identification [30.208890289394994]
Person ReID methods experience a significant drop in performance when trained and tested across different datasets.
Our research reveals that domain generalization methods significantly underperform single-domain supervised methods on single dataset benchmarks.
We propose a way to achieve ODG-ReID by creating deep feature diversity with self-ensembles.
arXiv Detail & Related papers (2024-10-11T02:27:11Z)
- Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances [55.37242480995541]
We propose to denoise noisy NER data with guidance from a small set of clean instances.
Along with the main NER model, we train a discriminator model and use its outputs to recalibrate the sample weights (see the sketch after this entry).
Results on public crowdsourcing and distant supervision datasets show that the proposed method can consistently improve performance with a small guidance set.
arXiv Detail & Related papers (2023-10-25T17:23:37Z)
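The entry above only states that a discriminator's outputs recalibrate the sample weights; the weighting formula, function name, and tensor shapes below are illustrative assumptions, a minimal sketch of how such reweighting of a token-level NER loss could look rather than the paper's exact method.

```python
import torch
import torch.nn.functional as F

def reweighted_token_loss(tagger_logits, noisy_labels, clean_prob):
    """Cross-entropy against noisy tags, down-weighted where the discriminator
    thinks a tag is likely to be wrong.

    tagger_logits: (batch, seq_len, num_labels) scores from the main NER model
    noisy_labels:  (batch, seq_len) crowdsourced / distantly supervised tags
    clean_prob:    (batch, seq_len) discriminator estimate that each tag is clean
    """
    per_token = F.cross_entropy(
        tagger_logits.reshape(-1, tagger_logits.size(-1)),
        noisy_labels.reshape(-1),
        reduction="none",
    ).reshape(noisy_labels.shape)
    # Renormalize so the average weight stays near 1 and the loss scale is stable.
    weights = clean_prob * clean_prob.numel() / clean_prob.sum().clamp_min(1e-8)
    return (weights * per_token).mean()

# Example shapes: batch of 4 sentences, 20 tokens, 9 BIO labels.
logits = torch.randn(4, 20, 9)
labels = torch.randint(0, 9, (4, 20))
clean_prob = torch.rand(4, 20)          # would come from the discriminator
loss = reweighted_token_loss(logits, labels, clean_prob)
```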
- CDFSL-V: Cross-Domain Few-Shot Learning for Videos [58.37446811360741]
Few-shot video action recognition is an effective approach to recognizing new categories with only a few labeled examples.
Existing methods in video action recognition rely on large labeled datasets from the same domain.
We propose a novel cross-domain few-shot video action recognition method that leverages self-supervised learning and curriculum learning.
arXiv Detail & Related papers (2023-09-07T19:44:27Z)
- Prompting Diffusion Representations for Cross-Domain Semantic Segmentation [101.04326113360342]
Diffusion-pretraining achieves extraordinary domain generalization results for semantic segmentation.
We introduce a scene prompt and a prompt randomization strategy to help further disentangle the domain-invariant information when training the segmentation head.
arXiv Detail & Related papers (2023-07-05T09:28:25Z)
- Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve sample efficiency on the challenging NetHack benchmark (a sketch of the contrastive objective follows this entry).
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
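The offline pre-training entry above mentions learning a state representation with noise-contrastive estimation; the following is a minimal InfoNCE-style sketch under the assumption that consecutive states from the offline dataset form positive pairs, with encoder sizes and temperature chosen arbitrarily, not the paper's exact objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateEncoder(nn.Module):
    def __init__(self, obs_dim, repr_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                                 nn.Linear(256, repr_dim))

    def forward(self, obs):
        return self.net(obs)

def info_nce_loss(anchor_repr, positive_repr, temperature=0.1):
    # Cosine-similarity logits between every anchor and every candidate;
    # the matching index on the diagonal is the positive target.
    a = F.normalize(anchor_repr, dim=-1)
    p = F.normalize(positive_repr, dim=-1)
    logits = a @ p.t() / temperature
    targets = torch.arange(a.size(0))
    return F.cross_entropy(logits, targets)

encoder = StateEncoder(obs_dim=64)
obs_t = torch.randn(32, 64)    # states s_t sampled from an offline dataset
obs_t1 = torch.randn(32, 64)   # their successor states s_{t+1} (positive pairs)
loss = info_nce_loss(encoder(obs_t), encoder(obs_t1))
```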
- Feature Diversity Learning with Sample Dropout for Unsupervised Domain Adaptive Person Re-identification [0.0]
This paper proposes a new approach to learning feature representations with better generalization ability by limiting noisy pseudo-labels.
We put forward a new method, referred to as Feature Diversity Learning (FDL), under the classic mutual-teaching architecture.
Experimental results show that our proposed FDL-SD achieves the state-of-the-art performance on multiple benchmark datasets.
arXiv Detail & Related papers (2022-01-25T10:10:48Z)
- Clustering augmented Self-Supervised Learning: An application to Land Cover Mapping [10.720852987343896]
We introduce a new method for land cover mapping by using a clustering based pretext task for self-supervised learning.
We demonstrate the effectiveness of the method on two societally relevant applications.
arXiv Detail & Related papers (2021-08-16T19:35:43Z)
- Multi-Pretext Attention Network for Few-shot Learning with Self-supervision [37.6064643502453]
We propose a novel augmentation-free method for self-supervised learning, which does not rely on any auxiliary sample.
Besides, we propose a Multi-pretext Attention Network (MAN), which exploits a specific attention mechanism to combine traditional augmentation-based methods and our GC.
We evaluate our MAN extensively on miniImageNet and tieredImageNet datasets and the results demonstrate that the proposed method outperforms the state-of-the-art (SOTA) relevant methods.
arXiv Detail & Related papers (2021-03-10T10:48:37Z)
- Self-Supervised Features Improve Open-World Learning [13.880789191591088]
We present a unifying open-world framework combining Incremental Learning, Out-of-Distribution detection and Open-World learning.
Under an unsupervised feature representation, we categorize the problem of detecting unknowns as either Out-of-Label-space or Out-of-Distribution detection.
The incremental learning component of our pipeline is a zero-exemplar online model that performs comparably to the state of the art on the ImageNet-100 protocol.
arXiv Detail & Related papers (2021-02-15T21:03:05Z)
- Can Semantic Labels Assist Self-Supervised Visual Representation Learning? [194.1681088693248]
We present a new algorithm named Supervised Contrastive Adjustment in Neighborhood (SCAN).
In a series of downstream tasks, SCAN achieves superior performance compared to previous fully-supervised and self-supervised methods.
Our study reveals that semantic labels are useful in assisting self-supervised methods, opening a new direction for the community.
arXiv Detail & Related papers (2020-11-17T13:25:00Z)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.