Modeling sequential annotations for sequence labeling with crowds
- URL: http://arxiv.org/abs/2209.09430v1
- Date: Tue, 20 Sep 2022 02:51:23 GMT
- Title: Modeling sequential annotations for sequence labeling with crowds
- Authors: Xiaolei Lu, Tommy W.S. Chow
- Abstract summary: Crowd sequential annotations can be an efficient and cost-effective way to build large datasets for sequence labeling.
We propose Modeling sequential annotation for sequence labeling with crowds (SA-SLC).
A valid label sequence inference (VLSE) method is proposed to derive valid ground-truth label sequences from crowd sequential annotations.
- Score: 8.239028141030621
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowd sequential annotation can be an efficient and cost-effective way to
build large datasets for sequence labeling. Unlike tagging independent instances,
in crowd sequential annotation the quality of a label sequence depends on each
annotator's expertise in capturing the internal dependencies among the tokens in
the sequence. In this paper, we propose Modeling sequential annotation for
sequence labeling with crowds (SA-SLC). First, a conditional probabilistic model
is developed to jointly model the sequential data and the annotators' expertise,
in which a categorical distribution is introduced to estimate each annotator's
reliability in capturing local and non-local label dependencies. To accelerate
the marginalization of the proposed model, a valid label sequence inference
(VLSE) method is proposed to derive valid ground-truth label sequences from the
crowd sequential annotations. VLSE derives possible ground-truth labels at the
token level and further prunes invalid sub-paths during forward inference for
label sequence decoding, which reduces the number of candidate label sequences
and improves the quality of the possible ground-truth label sequences.
Experimental results on several natural language processing sequence labeling
tasks show the effectiveness of the proposed model.
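The pruning idea behind VLSE can be illustrated with a small sketch. The code below is a hypothetical simplification, not the paper's exact algorithm: it assumes a BIO tagging scheme, keeps only token-level labels proposed by at least one annotator, and then enumerates candidate sequences left to right while discarding sub-paths that violate BIO transition validity. All function names and the example data are illustrative assumptions.

```python
# Illustrative sketch of token-wise candidate derivation plus forward
# pruning, in the spirit of VLSE. The BIO validity rule and all names
# are assumptions for illustration, not the paper's implementation.

def token_candidates(crowd_labels):
    """For each token, keep only the labels that at least one
    annotator proposed (token-wise possible ground truths)."""
    return [sorted(set(votes)) for votes in crowd_labels]

def valid_transition(prev, cur):
    """BIO constraint: an I-X tag may only follow B-X or I-X of
    the same entity type X."""
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], "I-" + cur[2:])
    return True

def vlse_candidates(crowd_labels):
    """Enumerate candidate label sequences, pruning invalid
    sub-paths left to right (forward inference)."""
    cands = token_candidates(crowd_labels)
    # Sequence start behaves like following an "O" tag.
    paths = [[lab] for lab in cands[0] if valid_transition("O", lab)]
    for options in cands[1:]:
        paths = [p + [lab] for p in paths for lab in options
                 if valid_transition(p[-1], lab)]
    return paths

# Three annotators tag the two-token phrase "New York":
crowd = [
    ["B-LOC", "B-LOC", "B-PER"],   # token "New"
    ["I-LOC", "I-LOC", "I-PER"],   # token "York"
]
# Of the 4 combinations, only the 2 type-consistent paths survive:
# ['B-LOC', 'I-LOC'] and ['B-PER', 'I-PER'].
for seq in vlse_candidates(crowd):
    print(seq)
```

Pruning here cuts the candidate set from the full cross product of per-token votes down to only type-consistent sequences, which is the effect the abstract attributes to VLSE: fewer, higher-quality candidate label sequences to marginalize over.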
Related papers
- Inaccurate Label Distribution Learning with Dependency Noise [52.08553913094809]
We introduce the Dependent Noise-based Inaccurate Label Distribution Learning (DN-ILDL) framework to tackle the challenges posed by noise in label distribution learning.
We show that DN-ILDL effectively addresses the ILDL problem and outperforms existing LDL methods.
arXiv Detail & Related papers (2024-05-26T07:58:07Z)
- Perception and Semantic Aware Regularization for Sequential Confidence Calibration [12.265757315192497]
We propose a Perception and Semantic aware Sequence Regularization framework.
We introduce a semantic context-free recognition and a language model to acquire similar sequences with high perceptive similarities and semantic correlation.
Experiments on canonical sequence recognition tasks, including scene text and speech recognition, demonstrate that our method sets novel state-of-the-art results.
arXiv Detail & Related papers (2023-05-31T02:16:29Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision [75.1860418333995]
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently.
The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources as labeling functions.
Existing statistical label models typically rely only on the outputs of the LFs, ignoring instance features when modeling the underlying generative process.
arXiv Detail & Related papers (2022-10-06T07:28:53Z)
- Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models [105.4590533269863]
We propose AutoSeq, a fully automatic prompting method.
We adopt natural language prompts on sequence-to-sequence models.
Our method reveals the potential of sequence-to-sequence models in few-shot learning.
arXiv Detail & Related papers (2022-09-20T01:35:04Z)
- Partial sequence labeling with structured Gaussian Processes [8.239028141030621]
We propose structured Gaussian Processes for partial sequence labeling.
It encodes uncertainty in the prediction and requires no extra effort for model selection and hyperparameter learning.
It is evaluated on several sequence labeling tasks and the experimental results show the effectiveness of the proposed model.
arXiv Detail & Related papers (2022-09-20T00:56:49Z)
- A Label Dependence-aware Sequence Generation Model for Multi-level Implicit Discourse Relation Recognition [31.179555215952306]
Implicit discourse relation recognition is a challenging but crucial task in discourse analysis.
We propose a Label Dependence-aware Sequence Generation Model (LDSGM) for it.
We develop a mutual-learning-enhanced training method to exploit the label dependence in a bottom-up direction.
arXiv Detail & Related papers (2021-12-22T09:14:03Z)
- Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning [6.1538971100140145]
We introduce a novel approach with multi-task learning to enhance label correlation feedback.
We propose two auxiliary label co-occurrence prediction tasks to enhance label correlation learning.
arXiv Detail & Related papers (2021-06-06T12:26:14Z)
- Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.