Constrained Decoding for Cross-lingual Label Projection
- URL: http://arxiv.org/abs/2402.03131v1
- Date: Mon, 5 Feb 2024 15:57:32 GMT
- Title: Constrained Decoding for Cross-lingual Label Projection
- Authors: Duong Minh Le, Yang Chen, Alan Ritter, Wei Xu
- Abstract summary: Cross-lingual transfer using multilingual LLMs has become a popular learning paradigm for low-resource languages with no labeled training data.
However, for NLP tasks that involve fine-grained predictions on words and phrases, the performance of zero-shot cross-lingual transfer learning lags far behind supervised fine-tuning methods.
- Score: 27.567195418950966
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero-shot cross-lingual transfer utilizing multilingual LLMs has become a
popular learning paradigm for low-resource languages with no labeled training
data. However, for NLP tasks that involve fine-grained predictions on words and
phrases, the performance of zero-shot cross-lingual transfer learning lags far
behind supervised fine-tuning methods. Therefore, it is common to exploit
translation and label projection to further improve the performance by (1)
translating training data that is available in a high-resource language (e.g.,
English) together with the gold labels into low-resource languages, and/or (2)
translating test data in low-resource languages to a high-resource language to
run inference on, then projecting the predicted span-level labels back onto the
original test data. However, state-of-the-art marker-based label projection
methods suffer from translation quality degradation due to the extra label
markers injected in the input to the translation model. In this work, we
explore a new direction that leverages constrained decoding for label
projection to overcome the aforementioned issues. Our new method not only
preserves the quality of the translated texts but is also versatile enough to be
applied to both the translate-training-data and translate-test-data strategies.
This versatility is crucial as our experiments reveal that translating test
data can lead to a considerable boost in performance compared to translating
only training data. We evaluate on two cross-lingual transfer tasks, namely
Named Entity Recognition and Event Argument Extraction, spanning 20 languages.
The results demonstrate that our approach outperforms the state-of-the-art
marker-based method by a large margin and also shows better performance than
other label projection methods that rely on external word alignment.
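As a rough illustration of the general mechanism the paper builds on: constrained decoding lets a translation model keep generating fluent output while being required to include specific token sequences, without injecting anything into the source sentence. The sketch below is not the paper's algorithm; it only shows generic constrained beam search in Hugging Face Transformers (`force_words_ids`), with an arbitrary MT checkpoint and an arbitrary forced phrase chosen for illustration.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Illustrative choices only: any seq2seq MT checkpoint works here.
model_name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

src = "The summit was held in Geneva last week."
inputs = tokenizer(src, return_tensors="pt")

# Require the German rendering of the entity ("Genf") to appear in the output,
# while the rest of the translation is decoded freely.
force_words_ids = tokenizer(["Genf"], add_special_tokens=False).input_ids

outputs = model.generate(
    **inputs,
    num_beams=5,                      # constrained decoding needs beam search
    force_words_ids=force_words_ids,  # each listed phrase must occur in the hypothesis
    max_new_tokens=60,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

The contrast with marker-based projection is that the source sentence is left untouched, which is the property the abstract highlights as preserving translation quality.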
Related papers
- Contextual Label Projection for Cross-Lingual Structured Prediction [103.55999471155104]
CLaP translates text to the target language and performs contextual translation on the labels using the translated text as the context.
We benchmark CLaP with other label projection techniques on zero-shot cross-lingual transfer across 39 languages.
arXiv Detail & Related papers (2023-09-16T10:27:28Z)
- T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification [50.675552118811]
Cross-lingual text classification is typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest.
We propose revisiting the classic "translate-and-test" pipeline to neatly separate the translation and classification stages.
arXiv Detail & Related papers (2023-06-08T07:33:22Z)
- Frustratingly Easy Label Projection for Cross-lingual Transfer [25.398772204761215]
A few efforts have utilized a simple mark-then-translate method to jointly perform translation and projection.
We present an empirical study across 57 languages and three tasks (QA, NER, and Event Extraction) to evaluate the effectiveness and limitations of both methods.
Our optimized version of mark-then-translate, which we call EasyProject, is easily applied to many languages and works surprisingly well, outperforming the more complex word alignment-based methods.
arXiv Detail & Related papers (2022-11-28T18:11:48Z)
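For contrast with the constrained-decoding sketch above, the mark-then-translate recipe described in the EasyProject entry can be sketched roughly as follows. This is a minimal, hypothetical illustration (the marker convention, MT checkpoint, and example spans are arbitrary choices, not the exact EasyProject setup): gold spans are wrapped in bracket markers, the marked sentence is translated, and surviving bracket pairs are read off the output.

```python
import re
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # illustrative MT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# English training sentence with gold entity spans (char start, char end, label).
text = "Barack Obama visited Berlin in 2013."
spans = [(0, 12, "PER"), (21, 27, "LOC")]

# 1) Mark: wrap each gold span in bracket markers (inserted right-to-left so
#    earlier character offsets stay valid).
marked = text
for start, end, _ in sorted(spans, reverse=True):
    marked = marked[:start] + "[ " + marked[start:end] + " ]" + marked[end:]

# 2) Translate the marked sentence and hope the markers survive; this injected
#    markup is exactly what can degrade translation quality.
batch = tokenizer(marked, return_tensors="pt")
outputs = model.generate(**batch, num_beams=5, max_new_tokens=60)
translated = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# 3) Project: pair surviving bracket pairs with the original labels, in order.
#    If the MT model drops or moves a marker, the projection breaks.
projected = [
    (match.group(1).strip(), label)
    for match, (_, _, label) in zip(re.finditer(r"\[\s*(.*?)\s*\]", translated), spans)
]
print(translated)
print(projected)
```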
- Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings [10.871587311621974]
We experimentally demonstrate that high capacity multilingual language models applied in a zero-shot setting consistently outperform data-based cross-lingual transfer approaches.
A detailed analysis of our results suggests that this might be due to important differences in language use.
Our results also indicate that data-based cross-lingual transfer approaches remain a competitive option when high-capacity multilingual language models are not available.
arXiv Detail & Related papers (2022-10-23T05:37:35Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- A Dual-Contrastive Framework for Low-Resource Cross-Lingual Named Entity Recognition [5.030581940990434]
Cross-lingual Named Entity Recognition (NER) has recently become a research hotspot because it can alleviate the data-hungry problem for low-resource languages.
In this paper, we describe our novel dual-contrastive framework ConCNER for cross-lingual NER under the scenario of limited source-language labeled data.
arXiv Detail & Related papers (2022-04-02T07:59:13Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- On the Language Coverage Bias for Neural Machine Translation [81.81456880770762]
Language coverage bias is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
By carefully designing experiments, we provide comprehensive analyses of the language coverage bias in the training data.
We propose two simple and effective approaches to alleviate the language coverage bias problem.
arXiv Detail & Related papers (2021-06-07T01:55:34Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We also propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for the translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
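The KL-divergence self-teaching loss mentioned in the FILTER entry above can be written down generically as a KL term between a model's predictions on target-language translations and auto-generated soft pseudo-labels. The sketch below is a hedged, generic formulation of that idea, not FILTER's actual training code; the function name and tensor shapes are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def self_teaching_kl_loss(student_logits: torch.Tensor,
                          soft_pseudo_labels: torch.Tensor,
                          temperature: float = 1.0) -> torch.Tensor:
    """KL(pseudo || student), averaged over the batch.

    student_logits: [batch, num_labels] logits on translated target-language text.
    soft_pseudo_labels: [batch, num_labels] probabilities auto-generated by a
    teacher pass (e.g., the same model run on the source-language input).
    """
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_probs, soft_pseudo_labels, reduction="batchmean")

# Tiny usage example with random tensors.
logits = torch.randn(4, 7)                     # e.g., 7 label classes
pseudo = F.softmax(torch.randn(4, 7), dim=-1)  # soft teacher distribution
print(self_teaching_kl_loss(logits, pseudo).item())
```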
- UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data [28.8970132244542]
We propose a novel approach termed UniTrans to Unify both model and data Transfer for cross-lingual NER.
We evaluate our proposed UniTrans over 4 target languages on benchmark datasets.
arXiv Detail & Related papers (2020-07-15T13:46:50Z)