Sparse Conditional Hidden Markov Model for Weakly Supervised Named Entity Recognition
- URL: http://arxiv.org/abs/2205.14228v1
- Date: Fri, 27 May 2022 20:47:30 GMT
- Title: Sparse Conditional Hidden Markov Model for Weakly Supervised Named Entity Recognition
- Authors: Yinghao Li, Le Song, Chao Zhang
- Abstract summary: We propose the sparse conditional hidden Markov model (Sparse-CHMM) to evaluate noisy labeling functions.
Sparse-CHMM is optimized through unsupervised learning with a three-stage training pipeline.
It achieves a 3.01 average F1 score improvement on five comprehensive datasets.
- Score: 68.68300358332156
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weakly supervised named entity recognition methods train label models to aggregate the token annotations of multiple noisy labeling functions (LFs) without seeing any manually annotated labels. To work well, the label model needs to contextually identify and emphasize well-performing LFs while down-weighting the under-performers. However, evaluating the LFs is challenging due to the lack of ground-truth labels. To address this issue, we propose the sparse conditional hidden Markov model (Sparse-CHMM). Instead of predicting the entire emission matrix as other HMM-based methods do, Sparse-CHMM focuses on estimating its diagonal elements, which are treated as the reliability scores of the LFs. The sparse scores are then expanded to the full-fledged emission matrix with pre-defined expansion functions. We also augment the emission matrix with weighted XOR scores, which track the probabilities of an LF observing incorrect entities. Sparse-CHMM is optimized through unsupervised learning with a three-stage training pipeline that reduces the training difficulty and prevents the model from falling into local optima. Compared with the baselines in the Wrench benchmark, Sparse-CHMM achieves an average F1-score improvement of 3.01 points on five comprehensive datasets. Experiments show that each component of Sparse-CHMM is effective, and the estimated LF reliabilities strongly correlate with true LF F1 scores.
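The abstract describes the core construction concretely enough to sketch: per-LF reliability scores form the diagonal of an emission matrix, a pre-defined expansion function fills in the off-diagonal mass, and weighted XOR scores add probability on incorrect entity labels. Below is a minimal NumPy sketch of that construction for a single LF; the uniform off-diagonal expansion, the `o_idx` outside-tag convention, the blending weight, and the name `build_emission` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def build_emission(reliability, xor_scores, o_idx=0, xor_weight=0.1):
    """Expand sparse diagonal reliability scores into a full emission
    matrix for one labeling function (illustrative sketch).

    reliability : (L,) array; reliability[i] ~ P(LF emits label i | true label i)
    xor_scores  : (L,) array; probability the LF observes an *incorrect*
                  entity when the true label is i
    o_idx       : index of the outside ("O") tag (assumed convention)
    xor_weight  : blending weight for the XOR augmentation (assumed value)
    """
    n = reliability.shape[0]
    emission = np.zeros((n, n))
    for i in range(n):
        # The sparse scores sit on the diagonal.
        emission[i, i] = reliability[i]
        # Expansion function: spread the remaining mass uniformly over the
        # other labels (one simple choice of pre-defined expansion).
        emission[i, np.arange(n) != i] = (1.0 - reliability[i]) / (n - 1)
        # Weighted XOR augmentation: extra mass on incorrect *entity*
        # labels, i.e. everything except the true label and the O tag.
        wrong = [j for j in range(n) if j != i and j != o_idx]
        if wrong:
            emission[i, wrong] += xor_weight * xor_scores[i] / len(wrong)
    # Renormalize so every row stays a valid probability distribution.
    return emission / emission.sum(axis=1, keepdims=True)

# Toy usage: 3 labels (O, PER, LOC) with made-up scores.
rel = np.array([0.9, 0.7, 0.6])
xor = np.array([0.05, 0.2, 0.1])
print(build_emission(rel, xor).round(3))
```

In the full model these quantities are predicted contextually per token and learned end-to-end through the three-stage pipeline; the sketch shows only the static expand-and-augment step.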
Related papers
- Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation [8.046705062670096]
Lossless speculative decoding accelerates inference for a target large language model.
We propose FSPAD (Feature Sampling and Partial Alignment Distillation for Lossless Speculative Decoding) to boost speculative decoding.
Our experiments include both greedy and non-greedy decoding on the largest and smallest models from the Vicuna and LLaMA3-Instruct series.
arXiv Detail & Related papers (2024-08-28T06:28:01Z)
- EOL: Transductive Few-Shot Open-Set Recognition by Enhancing Outlier Logits [16.081748213657825]
In Few-Shot Learning, models are trained to recognise unseen objects from a query set, given a few labelled examples from a support set.
In this work, we explore the more nuanced and practical challenge of Open-Set Few-Shot Recognition.
arXiv Detail & Related papers (2024-08-04T15:00:22Z)
- Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles of subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that human annotators prefer SQC-Score over the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)
- Self-Evaluation Improves Selective Generation in Large Language Models [54.003992911447696]
We reformulate open-ended generation tasks into token-level prediction tasks.
We instruct an LLM to self-evaluate its answers.
We benchmark a range of scoring methods based on self-evaluation.
arXiv Detail & Related papers (2023-12-14T19:09:22Z)
- Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
arXiv Detail & Related papers (2022-11-13T17:57:07Z)
- ULF: Unsupervised Labeling Function Correction using Cross-Validation for Weak Supervision [5.566060402907773]
Weak supervision (WS) is a cost-effective alternative to manual data labeling.
We introduce a new algorithm ULF for Unsupervised Labeling Function correction.
ULF refines the allocation of LFs to classes by re-estimating this assignment on highly reliable cross-validated samples.
arXiv Detail & Related papers (2022-04-14T10:29:01Z)
- Learning to Robustly Aggregate Labeling Functions for Semi-supervised Data Programming [14.639568384768042]
A critical bottleneck in supervised machine learning is the need for large amounts of labeled data.
In this work, we propose an LF-based reweighting framework to address these limitations.
Our algorithm learns a joint model on the (same) labeled dataset used for LF induction along with any unlabeled data in a semi-supervised manner.
arXiv Detail & Related papers (2021-09-23T14:42:46Z)
- OpenMatch: Open-set Consistency Regularization for Semi-supervised Learning with Outliers [71.08167292329028]
We propose a novel Open-set Semi-Supervised Learning (OSSL) approach called OpenMatch.
OpenMatch unifies FixMatch with novelty detection based on one-vs-all (OVA) classifiers.
It achieves state-of-the-art performance on three datasets, and even outperforms a fully supervised model in detecting outliers unseen in unlabeled data on CIFAR10.
arXiv Detail & Related papers (2021-05-28T23:57:15Z)
- BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition [57.2201011783393]
The conditional hidden Markov model (CHMM) predicts token-wise transition and emission probabilities from the BERT embeddings of the input tokens. It then fine-tunes a BERT-based NER model with the labels inferred by CHMM (a minimal sketch of the token-wise parameterization appears after this list).
arXiv Detail & Related papers (2021-05-26T21:18:48Z)
- Semi-supervised Learning Meets Factorization: Learning to Recommend with Chain Graph Model [16.007141894770054]
The latent factor model (LFM) has been drawing much attention in recommender systems due to its good performance and scalability.
Semi-supervised learning (SSL) provides an effective way to alleviate the label (i.e., rating) sparsity problem.
We propose a novel probabilistic chain graph model (CGM) to marry SSL with LFM.
arXiv Detail & Related papers (2020-03-05T06:34:53Z)
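As context for the BERTifying CHMM entry above (the direct predecessor of Sparse-CHMM), here is a minimal sketch of a token-wise HMM parameterization: small heads map each token's BERT embedding to per-position transition and emission matrices. The single linear heads, the single shared emission matrix, and the class name `TokenwiseHMMHeads` are simplifying assumptions; CHMM itself models emissions for every labeling function.

```python
import torch
import torch.nn as nn

class TokenwiseHMMHeads(nn.Module):
    """Predict per-token transition and emission matrices from contextual
    (e.g. BERT) embeddings, as summarized in the CHMM entry above.
    Single linear heads are an illustrative simplification."""

    def __init__(self, emb_dim: int, n_states: int):
        super().__init__()
        self.n_states = n_states
        self.transition_head = nn.Linear(emb_dim, n_states * n_states)
        self.emission_head = nn.Linear(emb_dim, n_states * n_states)

    def forward(self, embs: torch.Tensor):
        # embs: (batch, seq_len, emb_dim) token embeddings.
        b, t, _ = embs.shape
        shape = (b, t, self.n_states, self.n_states)
        # Row-wise softmax keeps every predicted matrix row-stochastic.
        trans = self.transition_head(embs).view(shape).softmax(dim=-1)
        emis = self.emission_head(embs).view(shape).softmax(dim=-1)
        return trans, emis

# Toy usage: BERT-base embeddings (768-d), 5 BIO states, batch of 2, length 8.
heads = TokenwiseHMMHeads(emb_dim=768, n_states=5)
trans, emis = heads(torch.randn(2, 8, 768))
print(trans.shape, emis.shape)  # torch.Size([2, 8, 5, 5]) twice
```

Because each predicted matrix is row-stochastic, the outputs can be fed directly into a forward-backward pass over the observed LF labels to infer the latent true-label sequence.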