Named Entity Recognition without Labelled Data: A Weak Supervision
Approach
- URL: http://arxiv.org/abs/2004.14723v1
- Date: Thu, 30 Apr 2020 12:29:55 GMT
- Title: Named Entity Recognition without Labelled Data: A Weak Supervision
Approach
- Authors: Pierre Lison, Aliaksandr Hubin, Jeremy Barnes, and Samia Touileb
- Abstract summary: This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision.
The approach relies on a broad spectrum of labelling functions to automatically annotate texts from the target domain.
A sequence labelling model can finally be trained on the basis of this unified annotation.
- Score: 23.05371427663683
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Named Entity Recognition (NER) performance often degrades rapidly when
applied to target domains that differ from the texts observed during training.
When in-domain labelled data is available, transfer learning techniques can be
used to adapt existing NER models to the target domain. But what should one do
when there is no hand-labelled data for the target domain? This paper presents
a simple but powerful approach to learn NER models in the absence of labelled
data through weak supervision. The approach relies on a broad spectrum of
labelling functions to automatically annotate texts from the target domain.
These annotations are then merged together using a hidden Markov model which
captures the varying accuracies and confusions of the labelling functions. A
sequence labelling model can finally be trained on the basis of this unified
annotation. We evaluate the approach on two English datasets (CoNLL 2003 and
news articles from Reuters and Bloomberg) and demonstrate an improvement of
about 7 percentage points in entity-level $F_1$ scores compared to an
out-of-domain neural NER model.
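The pipeline in the abstract (several labelling functions annotate a text, then their outputs are merged into one annotation) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the gazetteer, the title list, the three labelling functions, and the `aggregate` helper are hypothetical, and the merge step uses a simple per-token majority vote as a stand-in for the paper's hidden Markov model, which additionally estimates each function's accuracies and confusions.

```python
from collections import Counter

# Hypothetical resources; a real system would use far larger ones.
ORG_GAZETTEER = {"Reuters", "Bloomberg"}
PERSON_TITLES = {"Mr.", "Ms.", "Dr."}


def lf_gazetteer(tokens):
    """Vote ORG for tokens found in the gazetteer, abstain (None) otherwise."""
    return ["ORG" if t in ORG_GAZETTEER else None for t in tokens]


def lf_title(tokens):
    """Vote PER for a capitalised token that directly follows a title."""
    labels = [None] * len(tokens)
    for i in range(1, len(tokens)):
        if tokens[i - 1] in PERSON_TITLES and tokens[i][:1].isupper():
            labels[i] = "PER"
    return labels


def lf_lowercase(tokens):
    """Vote O (not an entity) for fully lowercase tokens."""
    return ["O" if t.islower() else None for t in tokens]


LABELLING_FUNCTIONS = [lf_gazetteer, lf_title, lf_lowercase]


def aggregate(tokens, labelling_functions=LABELLING_FUNCTIONS):
    """Merge labelling-function outputs by per-token majority vote.

    This is NOT the HMM aggregation described in the abstract; it is a
    simpler stand-in that ignores how reliable each function is.
    Tokens on which every function abstains default to "O".
    """
    votes_per_lf = [lf(tokens) for lf in labelling_functions]
    merged = []
    for votes in zip(*votes_per_lf):
        counts = Counter(v for v in votes if v is not None)
        merged.append(counts.most_common(1)[0][0] if counts else "O")
    return merged


tokens = ["Dr.", "Smith", "joined", "Reuters", "in", "Oslo"]
print(list(zip(tokens, aggregate(tokens))))
```

In this toy example "Oslo" receives no votes and falls back to `O`, which shows why broad coverage across many labelling functions matters. The merged labels would then serve as training targets for a sequence labelling model.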
Related papers
- Cross-domain Named Entity Recognition via Graph Matching [25.237288970802425]
Cross-domain NER is a practical yet challenging problem because of data scarcity in real-world scenarios.
We model the label relationship as a probability distribution and construct label graphs in both source and target label spaces.
By representing label relationships as graphs, we formulate cross-domain NER as a graph matching problem.
arXiv Detail & Related papers (2024-08-02T02:31:54Z)
- SpanProto: A Two-stage Span-based Prototypical Network for Few-shot Named Entity Recognition [45.012327072558975]
Few-shot Named Entity Recognition (NER) aims to identify named entities with very little annotated data.
We propose a seminal span-based prototypical network (SpanProto) that tackles few-shot NER via a two-stage approach.
In the span extraction stage, we transform the sequential tags into a global boundary matrix, enabling the model to focus on the explicit boundary information.
For mention classification, we leverage prototypical learning to capture the semantic representations for each labeled span and make the model better adapt to novel-class entities.
arXiv Detail & Related papers (2022-10-17T12:59:33Z)
- Robust Target Training for Multi-Source Domain Adaptation [110.77704026569499]
We propose a novel Bi-level Optimization based Robust Target Training (BORT$^2$) method for multi-source domain adaptation (MSDA).
Our proposed method achieves state-of-the-art performance on three MSDA benchmarks, including the large-scale DomainNet dataset.
arXiv Detail & Related papers (2022-10-04T15:20:01Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation [85.6961770631173]
In semi-supervised domain adaptation, a few labeled samples per class in the target domain guide features of the remaining target samples to aggregate around them.
We propose a novel approach called Cross-domain Adaptive Clustering to address this problem.
arXiv Detail & Related papers (2021-04-19T16:07:32Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Unsupervised Domain Adaptation for Person Re-Identification through Source-Guided Pseudo-Labeling [2.449909275410288]
Person Re-Identification (re-ID) aims at retrieving images of the same person taken by different cameras.
Unsupervised Domain Adaptation (UDA) is an interesting research direction for this challenge as it avoids a costly annotation of the target data.
We introduce a framework which relies on a two-branch architecture optimizing classification and triplet loss based metric learning in source and target domains.
arXiv Detail & Related papers (2020-09-20T14:54:42Z)
- Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification [64.37745443119942]
This paper jointly enforces visual and temporal consistency in the combination of a local one-hot classification and a global multi-class classification.
Experimental results on three large-scale ReID datasets demonstrate the superiority of the proposed method in both unsupervised and unsupervised domain adaptive ReID tasks.
arXiv Detail & Related papers (2020-07-21T14:31:27Z)
- Zero-Resource Cross-Domain Named Entity Recognition [68.83177074227598]
Existing models for cross-domain named entity recognition rely on large unlabeled corpora or labeled NER training data in target domains.
We propose a cross-domain NER model that does not use any external resources.
arXiv Detail & Related papers (2020-02-14T09:04:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.