Improving Self-training for Cross-lingual Named Entity Recognition with
Contrastive and Prototype Learning
- URL: http://arxiv.org/abs/2305.13628v2
- Date: Sun, 4 Jun 2023 16:32:41 GMT
- Title: Improving Self-training for Cross-lingual Named Entity Recognition with
Contrastive and Prototype Learning
- Authors: Ran Zhou, Xin Li, Lidong Bing, Erik Cambria, Chunyan Miao
- Abstract summary: In cross-lingual named entity recognition, self-training is commonly used to bridge the linguistic gap.
In this work, we aim to improve self-training for cross-lingual NER by combining representation learning and pseudo label refinement.
Our proposed method, namely ContProto, mainly comprises two components: (1) contrastive self-training and (2) prototype-based pseudo-labeling.
- Score: 80.08139343603956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In cross-lingual named entity recognition (NER), self-training is commonly
used to bridge the linguistic gap by training on pseudo-labeled target-language
data. However, due to sub-optimal performance on target languages, the pseudo
labels are often noisy and limit the overall performance. In this work, we aim
to improve self-training for cross-lingual NER by combining representation
learning and pseudo label refinement in one coherent framework. Our proposed
method, namely ContProto, mainly comprises two components: (1) contrastive
self-training and (2) prototype-based pseudo-labeling. Our contrastive
self-training facilitates span classification by separating clusters of
different classes, and enhances cross-lingual transferability by producing
closely-aligned representations between the source and target language.
Meanwhile, prototype-based pseudo-labeling effectively improves the accuracy of
pseudo labels during training. We evaluate ContProto on multiple transfer
pairs, and experimental results show our method brings in substantial
improvements over current state-of-the-art methods.
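To make the two components more concrete, here is a minimal PyTorch-style sketch of how prototype-based pseudo-label refinement and a supervised contrastive span loss could be combined. The function names, the momentum and temperature values, and the exact update rules below are illustrative assumptions, not the paper's verbatim formulation.

```python
import torch
import torch.nn.functional as F


def update_prototypes(prototypes, span_reprs, labels, momentum=0.99):
    """Keep one prototype per entity class as an exponential moving average
    of the L2-normalised span representations currently assigned to it."""
    span_reprs = F.normalize(span_reprs, dim=-1)
    for c in labels.unique():
        class_mean = span_reprs[labels == c].mean(dim=0)
        prototypes[c] = F.normalize(
            momentum * prototypes[c] + (1.0 - momentum) * class_mean, dim=-1)
    return prototypes


def refine_pseudo_labels(prototypes, span_reprs):
    """Relabel each target-language span with the class of its most similar
    prototype (cosine similarity), refining noisy pseudo labels on the fly."""
    sims = F.normalize(span_reprs, dim=-1) @ prototypes.t()  # (n_spans, n_classes)
    return sims.argmax(dim=-1)


def contrastive_span_loss(span_reprs, labels, temperature=0.1):
    """Supervised contrastive loss over source and pseudo-labelled target
    spans: same-class spans are pulled together, other classes pushed apart."""
    z = F.normalize(span_reprs, dim=-1)
    sims = z @ z.t() / temperature
    self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    # Log-softmax over all other spans (the anchor itself is excluded).
    log_prob = sims - torch.logsumexp(
        sims.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True)
    pos_per_anchor = pos_mask.sum(dim=1).clamp(min=1)
    return -(log_prob * pos_mask.float()).sum(dim=1).div(pos_per_anchor).mean()
```

In such a setup, the prototypes would be updated each step from labelled source spans and confident target spans, the refined pseudo labels would supervise the target-language data, and the contrastive term would tighten class clusters so that source and target representations stay aligned.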
Related papers
- mCL-NER: Cross-Lingual Named Entity Recognition via Multi-view Contrastive Learning [54.523172171533645]
Cross-lingual named entity recognition (CrossNER) suffers from uneven performance across languages, stemming from the scarcity of multilingual corpora.
We propose Multi-view Contrastive Learning for Cross-lingual Named Entity Recognition (mCL-NER).
Our experiments on the XTREME benchmark, spanning 40 languages, demonstrate the superiority of mCL-NER over prior data-driven and model-based approaches.
arXiv Detail & Related papers (2023-08-17T16:02:29Z)
- VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, sequence-to-sequence alignment is induced to maximize the similarity of parallel pairs and minimize that of non-parallel pairs.
Token-to-token alignment is integrated to draw synonymous tokens, mined from a thesaurus dictionary, closer together while separating them from the other unpaired tokens in a bilingual instance.
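As an illustration of the sequence-level alignment idea, the snippet below sketches an InfoNCE-style contrastive objective over parallel sentence pairs: each source sentence treats its translation as the positive and the other in-batch sentences as negatives. The pooling, temperature, and symmetric formulation are assumptions, not VECO 2.0's exact objective.

```python
import torch
import torch.nn.functional as F


def sequence_alignment_loss(src_repr, tgt_repr, temperature=0.05):
    """InfoNCE-style alignment: src_repr[i] and tgt_repr[i] encode the i-th
    parallel sentence pair; all other in-batch sentences act as negatives."""
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    logits = src @ tgt.t() / temperature              # (batch, batch) similarities
    targets = torch.arange(src.size(0), device=src.device)
    # Symmetric loss: each source matches its own translation and vice versa.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```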
arXiv Detail & Related papers (2023-04-17T12:23:41Z)
- ConNER: Consistency Training for Cross-lingual Named Entity Recognition [96.84391089120847]
Cross-lingual named entity recognition suffers from data scarcity in the target languages.
We propose ConNER as a novel consistency training framework for cross-lingual NER.
arXiv Detail & Related papers (2022-11-17T07:57:54Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
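To make the projection step concrete, here is a hedged sketch of marker-based label projection: entities in the tagged sequence are wrapped in markers, the marked sentence is translated by a multilingual model (represented here by a hypothetical `translate` callable), and entity surface forms and labels are read back from the markers in the target-language output. The marker format and translation interface are illustrative assumptions rather than CROP's exact design.

```python
import re


def mark_entities(tokens, spans):
    """Wrap each labelled span in markers, e.g. <PER> ... </PER>.
    `spans` is a list of (start, end, label) over `tokens`, end exclusive."""
    out = list(tokens)
    for start, end, label in sorted(spans, reverse=True):
        out[start:end] = [f"<{label}>"] + out[start:end] + [f"</{label}>"]
    return " ".join(out)


def extract_entities(marked_sentence):
    """Recover (surface form, label) pairs from a marker-annotated sentence."""
    pattern = re.compile(r"<(\w+)>\s*(.*?)\s*</\1>")
    return [(m.group(2), m.group(1)) for m in pattern.finditer(marked_sentence)]


# Example (the `translate` callable is a hypothetical multilingual MT model):
# marked = mark_entities(["Barack", "Obama", "visited", "Berlin"],
#                        [(0, 2, "PER"), (3, 4, "LOC")])
# target_entities = extract_entities(translate(marked, tgt_lang="de"))
```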
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks.
Despite this success, we empirically observe a training objective gap between the pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for cross-lingual sequence labeling (xSL), named Cross-lingual Language Informative Span Masking (CLISM), to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage consistency between representations of input parallel sequences.
arXiv Detail & Related papers (2022-04-11T15:55:20Z)
- PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection [42.75316070378037]
We propose Noisy Pseudo box Learning (NPL), which includes Prediction-guided Label Assignment (PLA) and Positive-proposal Consistency Voting (PCV).
On the COCO benchmark, our method, PSEudo labeling and COnsistency training (PseCo), outperforms the SOTA (Soft Teacher) by 2.0, 1.8, and 2.0 points under 1%, 5%, and 10% labelling ratios, respectively.
arXiv Detail & Related papers (2022-03-30T13:59:22Z)