Cross-Lingual Semantic Role Labeling with High-Quality Translated
Training Corpus
- URL: http://arxiv.org/abs/2004.06295v2
- Date: Thu, 7 May 2020 03:43:42 GMT
- Title: Cross-Lingual Semantic Role Labeling with High-Quality Translated
Training Corpus
- Authors: Hao Fei and Meishan Zhang and Donghong Ji
- Abstract summary: Cross-lingual semantic role labeling is one promising way to address the problem.
We propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages.
Experimental results on Universal Proposition Bank show that the translation-based method is highly effective.
- Score: 41.031187560839555
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Much research effort has been devoted to semantic role labeling (SRL), which is
crucial for natural language understanding. Supervised approaches achieve
impressive performance when large-scale corpora are available for
resource-rich languages such as English, but for low-resource languages with
no annotated SRL dataset it remains challenging to obtain competitive
performance. Cross-lingual SRL is one promising way to address this problem,
and it has achieved great advances with the help of model transfer and
annotation projection. In this paper, we propose a novel alternative based on
corpus translation, constructing high-quality training datasets for the target
languages from the source gold-standard SRL annotations. Experimental results
on the Universal Proposition Bank show that the translation-based method is
highly effective, and the automatic pseudo datasets significantly improve
target-language SRL performance.
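The corpus-translation recipe in the abstract boils down to three steps: translate each gold-annotated source sentence, word-align source and target, and carry the predicate and role labels across the alignment, keeping only translations that preserve enough of the annotation. The sketch below illustrates that projection step; the BIO label scheme, the alignment format, and the 0.8 retention threshold in the filter are illustrative assumptions rather than the authors' exact settings.

```python
from typing import List, Tuple


def project_srl_labels(
    src_labels: List[str],              # BIO-style role labels per source token, e.g. "B-A0"
    alignments: List[Tuple[int, int]],  # (src_idx, tgt_idx) word-alignment pairs
    tgt_len: int,                       # number of tokens in the translated sentence
) -> List[str]:
    """Carry per-token SRL labels from a source sentence onto its translation."""
    tgt_labels = ["O"] * tgt_len
    for src_idx, tgt_idx in alignments:
        if src_labels[src_idx] != "O":
            tgt_labels[tgt_idx] = src_labels[src_idx]
    return tgt_labels


def keep_instance(src_labels: List[str], tgt_labels: List[str]) -> bool:
    """Quality filter: drop sentences that lose too many labeled tokens in projection."""
    src_roles = sum(label != "O" for label in src_labels)
    tgt_roles = sum(label != "O" for label in tgt_labels)
    return src_roles == 0 or tgt_roles / src_roles >= 0.8  # threshold is an assumption


# Toy example: English source with gold labels, word-aligned German translation.
src_labels = ["B-A0", "I-A0", "B-V", "B-A1", "I-A1"]       # "The cat chased the mouse"
tgt_tokens = ["Die", "Katze", "jagte", "die", "Maus"]
alignments = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]

tgt_labels = project_srl_labels(src_labels, alignments, len(tgt_tokens))
if keep_instance(src_labels, tgt_labels):
    print(list(zip(tgt_tokens, tgt_labels)))
```

In practice the translations would come from an MT system and the alignments from an automatic word aligner, which is where filtering for high-quality pseudo instances becomes important.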
Related papers
- Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing [6.074150063191985]
Cross-Lingual Back-Parsing is a novel data augmentation methodology designed to enhance cross-lingual transfer for semantic parsing.
Our methodology effectively performs cross-lingual data augmentation in challenging zero-resource settings.
arXiv Detail & Related papers (2024-10-01T08:53:38Z)
- SSP: Self-Supervised Prompting for Cross-Lingual Transfer to Low-Resource Languages using Large Language Models [23.522223369054437]
Self-Supervised Prompting (SSP) is a novel ICL approach tailored for the zero-labelled cross-lingual transfer (0-CLT) setting.
SSP is based on the key observation that LLMs output more accurate labels if in-context exemplars are from the target language.
SSP strongly outperforms existing SOTA fine-tuned and prompting-based baselines in the 0-CLT setup.
arXiv Detail & Related papers (2024-06-27T04:21:59Z)
- Cross-Lingual Transfer Robustness to Lower-Resource Languages on Adversarial Datasets [4.653113033432781]
Cross-lingual transfer capabilities of Multilingual Language Models (MLLMs) are investigated.
Our research provides valuable insights into cross-lingual transfer and its implications for NLP applications.
arXiv Detail & Related papers (2024-03-29T08:47:15Z)
- Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer [92.80671770992572]
Cross-lingual transfer is a central task in multilingual NLP.
Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data.
We propose a simple yet effective method, SALT, to improve the zero-shot cross-lingual transfer.
arXiv Detail & Related papers (2023-09-19T19:30:56Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence (a sketch of this tag-and-translate idea appears after this list).
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset [18.389328059694037]
In this work, we propose a method to automatically construct an SRL corpus that is parallel in four languages: English, French, German, Spanish, with unified predicate and role annotations that are fully comparable across languages.
We include human-validated test sets that we use to measure the projection quality, and show that projection is denser and more precise than a strong baseline.
Finally, we train different SOTA models on our novel corpus for mono- and multilingual SRL, showing that the multilingual annotations improve performance especially for the weaker languages.
arXiv Detail & Related papers (2020-10-05T13:34:20Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language (a sketch of such a loss appears after this list).
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- Cross-lingual Semantic Role Labeling with Model Transfer [49.85316125365497]
Cross-lingual semantic role labeling can be achieved by model transfer with the help of universal features.
We propose an end-to-end SRL model that incorporates a variety of universal features and transfer methods.
arXiv Detail & Related papers (2020-08-24T09:37:45Z)
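The CROP entry above describes a tag-then-translate projection: gold entity spans are wrapped in markers, the marked sentence is translated as a labeled sequence, and the entities are read back out on the target side. Below is a minimal sketch of that idea; the bracket marker format, the hard-coded stand-in for the translation model, and the regex extraction are assumptions for illustration, not CROP's actual components.

```python
import re
from typing import List, Tuple


def translate_tagged(tagged: str, tgt_lang: str) -> str:
    """Hypothetical stand-in for a multilingual labeled sequence translation model.
    CROP trains such a model; here a single example is hard-coded for the demo."""
    lookup = {
        "[PER] Obama [/PER] visited [LOC] Berlin [/LOC] .":
            "[PER] Obama [/PER] besuchte [LOC] Berlin [/LOC] .",
    }
    return lookup.get(tagged, tagged)


def tag_source(tokens: List[str], spans: List[Tuple[int, int, str]]) -> str:
    """Insert entity markers around gold NER spans in the source sentence."""
    out = []
    for i, tok in enumerate(tokens):
        out.extend(f"[{label}]" for start, _, label in spans if i == start)
        out.append(tok)
        out.extend(f"[/{label}]" for _, end, label in spans if i == end)
    return " ".join(out)


def extract_entities(tagged: str) -> List[Tuple[str, str]]:
    """Read projected entities back out of the translated tagged sequence."""
    return [(m.group(2).strip(), m.group(1))
            for m in re.finditer(r"\[(\w+)\](.*?)\[/\1\]", tagged)]


src_tokens = ["Obama", "visited", "Berlin", "."]
src_spans = [(0, 0, "PER"), (2, 2, "LOC")]
tagged_src = tag_source(src_tokens, src_spans)
tagged_tgt = translate_tagged(tagged_src, "de")
print(extract_entities(tagged_tgt))  # [('Obama', 'PER'), ('Berlin', 'LOC')]
```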
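The FILTER entry mentions a KL-divergence self-teaching loss over auto-generated soft pseudo-labels for translated target-language text. A minimal sketch of such a loss follows, assuming a classification head and using the translated-input predictions (detached) as the teacher distribution; this illustrates the general technique, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def self_teaching_loss(target_logits: torch.Tensor,
                       translation_logits: torch.Tensor) -> torch.Tensor:
    """KL divergence between soft pseudo-labels from the translated input (teacher)
    and the predictions on the target-language input (student)."""
    pseudo_labels = F.softmax(translation_logits.detach(), dim=-1)  # teacher, no gradient
    student_log_probs = F.log_softmax(target_logits, dim=-1)
    return F.kl_div(student_log_probs, pseudo_labels, reduction="batchmean")


# Toy usage with random logits over 3 classes for a batch of 4 examples.
torch.manual_seed(0)
loss = self_teaching_loss(torch.randn(4, 3), torch.randn(4, 3))
print(loss.item())
```

In training, this term would be added to the supervised loss on the source-language data so that target-language predictions are pulled toward the translation-based soft labels.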