Weakly Supervised Named Entity Tagging with Learnable Logical Rules
- URL: http://arxiv.org/abs/2107.02282v1
- Date: Mon, 5 Jul 2021 21:32:19 GMT
- Title: Weakly Supervised Named Entity Tagging with Learnable Logical Rules
- Authors: Jiacheng Li, Haibo Ding, Jingbo Shang, Julian McAuley, Zhe Feng
- Abstract summary: We study the problem of building entity tagging systems by using a few rules as weak supervision.
We propose a novel method TALLOR that bootstraps high-quality logical rules to train a neural tagger in a fully automated manner.
- Score: 20.90760075357411
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of building entity tagging systems by using a few rules
as weak supervision. Previous methods mostly focus on disambiguating entity
types based on contexts and expert-provided rules, while assuming entity spans
are given. In this work, we propose a novel method TALLOR that bootstraps
high-quality logical rules to train a neural tagger in a fully automated
manner. Specifically, we introduce compound rules that are composed from simple
rules to increase the precision of boundary detection and generate more diverse
pseudo labels. We further design a dynamic label selection strategy to ensure
pseudo label quality and therefore avoid overfitting the neural tagger.
Experiments on three datasets demonstrate that our method outperforms other
weakly supervised methods and even rivals a state-of-the-art distantly
supervised tagger with a lexicon of over 2,000 terms when starting from only 20
simple rules. Our method can serve as a tool for rapidly building taggers in
emerging domains and tasks. Case studies show that learned rules can
potentially explain the predicted entities.
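The abstract says compound rules are composed from simple rules to make boundary detection more precise. The following is a minimal sketch of that idea, not TALLOR's implementation. It assumes a compound rule is a logical conjunction of simple rules over a candidate token span. The names `Rule`, `pre_ngram`, `post_ngram`, `conjunction`, and `tag`, the example sentence, and the `Disease` label are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Span = Tuple[int, int]  # half-open token span [start, end)
Predicate = Callable[[Sequence[str], Span], bool]


@dataclass(frozen=True)
class Rule:
    """A labeling rule: if `matches` holds for a span, assign `label`."""
    name: str
    label: str
    matches: Predicate


def pre_ngram(ngram: str, label: str) -> Rule:
    """Simple rule (hypothetical): the tokens right before the span equal `ngram`."""
    target = ngram.lower().split()

    def check(tokens: Sequence[str], span: Span) -> bool:
        start = span[0] - len(target)
        return start >= 0 and [t.lower() for t in tokens[start:span[0]]] == target

    return Rule(f"PreNgram={ngram}", label, check)


def post_ngram(ngram: str, label: str) -> Rule:
    """Simple rule (hypothetical): the tokens right after the span equal `ngram`."""
    target = ngram.lower().split()

    def check(tokens: Sequence[str], span: Span) -> bool:
        end = span[1] + len(target)
        return [t.lower() for t in tokens[span[1]:end]] == target

    return Rule(f"PostNgram={ngram}", label, check)


def conjunction(*rules: Rule) -> Rule:
    """Compound rule (assumed form): fires only if every simple rule fires."""
    if len({r.label for r in rules}) != 1:
        raise ValueError("all simple rules must share one label")
    return Rule(
        " AND ".join(r.name for r in rules),
        rules[0].label,
        lambda tokens, span: all(r.matches(tokens, span) for r in rules),
    )


def tag(tokens: Sequence[str], rules: Sequence[Rule], max_len: int = 3) -> Dict[Span, str]:
    """Apply rules to every candidate span of up to `max_len` tokens."""
    labels: Dict[Span, str] = {}
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            for rule in rules:
                if rule.matches(tokens, (start, end)):
                    labels[(start, end)] = rule.label
    return labels


tokens = "He was diagnosed with Lyme disease last year".split()
pre = pre_ngram("diagnosed with", "Disease")
post = post_ngram("last year", "Disease")
compound = conjunction(pre, post)

simple_labels = tag(tokens, [pre])         # three spans, all starting at token 4
compound_labels = tag(tokens, [compound])  # {(4, 6): 'Disease'}, i.e. "Lyme disease"
```

In this toy case the left-context rule alone cannot tell where the entity ends, so it labels three overlapping spans. Adding the right-context rule leaves only the span "Lyme disease".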
Related papers
- RulePrompt: Weakly Supervised Text Classification with Prompting PLMs and Self-Iterative Logical Rules [30.239044569301534]
Weakly supervised text classification (WSTC) has attracted increasing attention due to its applicability in classifying a mass of texts.
We propose a prompting PLM-based approach named RulePrompt for the WSTC task, consisting of a rule mining module and a rule-enhanced pseudo label generation module.
Our approach yields interpretable category rules, proving its advantage in disambiguating easily-confused categories.
arXiv Detail & Related papers (2024-03-05T12:50:36Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Losses over Labels: Weakly Supervised Learning via Direct Loss Construction [71.11337906077483]
Programmatic weak supervision is a growing paradigm within machine learning.
We propose Losses over Labels (LoL), which creates losses directly from heuristics without going through the intermediate step of a label.
We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks.
arXiv Detail & Related papers (2022-12-13T22:29:14Z)
- Distilling Task-specific Logical Rules from Large Pre-trained Models [24.66436804853525]
We develop a novel framework to distill task-specific logical rules from large pre-trained models.
Specifically, we borrow recent prompt-based language models as the knowledge expert to yield initial seed rules.
Experiments on three public named entity tagging benchmarks demonstrate the effectiveness of our proposed framework.
arXiv Detail & Related papers (2022-10-06T09:12:18Z)
- A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels [96.56299163691979]
This paper focuses on a new weakly-supervised salient object detection (SOD) task under hybrid labels.
To address the issues of label noise and quantity imbalance in this task, we design a new pipeline framework with three sophisticated training strategies.
Experiments on five SOD benchmarks show that our method achieves competitive performance against weakly-supervised/unsupervised methods.
arXiv Detail & Related papers (2022-09-07T06:45:39Z)
- Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework [75.79736930414715]
We present a hierarchical multi-label representation learning framework that can leverage all available labels and preserve the hierarchical relationship between classes.
We introduce novel hierarchy preserving losses, which jointly apply a hierarchical penalty to the contrastive loss, and enforce the hierarchy constraint.
arXiv Detail & Related papers (2022-04-27T21:41:44Z)
- Fine-Grained Visual Entailment [51.66881737644983]
We propose an extension of this task, where the goal is to predict the logical relationship of fine-grained knowledge elements within a piece of text to an image.
Unlike prior work, our method is inherently explainable and makes logical predictions at different levels of granularity.
We evaluate our method on a new dataset of manually annotated knowledge elements and show that our method achieves 68.18% accuracy at this challenging task.
arXiv Detail & Related papers (2022-03-29T16:09:38Z)
- GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition [8.352789684571704]
We propose GLaRA, a graph-based labeling rule augmentation framework, to learn new labeling rules from unlabeled data.
We apply the augmented rules on unlabeled data to generate weak labels and train a NER model using the weakly labeled data.
arXiv Detail & Related papers (2021-04-13T14:20:58Z)
- Self-Training with Weak Supervision [32.68342091430266]
State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks.
Weak supervision in the form of domain-specific rules has been shown to be useful in such settings.
We develop a weak supervision framework (ASTRA) that leverages all the available data for a given task.
arXiv Detail & Related papers (2021-04-12T14:45:04Z)
- Towards Learning Instantiated Logical Rules from Knowledge Graphs [20.251630903853016]
We present GPFL, a probabilistic rule learner optimized to mine instantiated first-order logic rules from knowledge graphs.
GPFL utilizes a novel two-stage rule generation mechanism that first generalizes extracted paths into templates that are acyclic abstract rules.
We reveal the presence of overfitting rules, their impact on the predictive performance, and the effectiveness of a simple validation method filtering out overfitting rules.
arXiv Detail & Related papers (2020-03-13T00:32:46Z)
- Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
This is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z)