Multilingual Generative Language Models for Zero-Shot Cross-Lingual
Event Argument Extraction
- URL: http://arxiv.org/abs/2203.08308v1
- Date: Tue, 15 Mar 2022 23:00:32 GMT
- Title: Multilingual Generative Language Models for Zero-Shot Cross-Lingual
Event Argument Extraction
- Authors: Kuan-Hao Huang, I-Hung Hsu, Premkumar Natarajan, Kai-Wei Chang, Nanyun
Peng
- Abstract summary: We present a study on leveraging multilingual pre-trained generative language models for zero-shot cross-lingual event argument extraction (EAE).
By formulating EAE as a language generation task, our method effectively encodes event structures and captures the dependencies between arguments.
Our proposed model finetunes multilingual pre-trained generative language models to generate sentences that fill in the language-agnostic template with arguments extracted from the input passage.
- Score: 80.61458287741131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a study on leveraging multilingual pre-trained generative language
models for zero-shot cross-lingual event argument extraction (EAE). By
formulating EAE as a language generation task, our method effectively encodes
event structures and captures the dependencies between arguments. We design
language-agnostic templates to represent the event argument structures, which
are compatible with any language, hence facilitating the cross-lingual
transfer. Our proposed model finetunes multilingual pre-trained generative
language models to generate sentences that fill in the language-agnostic
template with arguments extracted from the input passage. The model is trained
on source languages and is then directly applied to target languages for event
argument extraction. Experiments demonstrate that the proposed model
outperforms the current state-of-the-art models on zero-shot cross-lingual EAE.
Comprehensive studies and error analyses are presented to better understand the
advantages and the current limitations of using generative language models for
zero-shot cross-lingual transfer EAE.
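To make the formulation concrete, here is a minimal sketch of the template-filling generation setup, assuming a stock multilingual seq2seq checkpoint from Hugging Face; the prompt layout, placeholder tokens, and example template are illustrative assumptions rather than the paper's exact design.

```python
# A minimal sketch of the generation-based EAE formulation described in the
# abstract, NOT the authors' released code: checkpoint, prompt layout, and
# placeholder tokens are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/mbart-large-50"          # any multilingual seq2seq LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Role slots are placeholder tokens rather than English words, so the same
# template can be paired with a passage in any language.
slot_tokens = ["<arg1>", "<arg2>", "<arg3>"]
tokenizer.add_tokens(slot_tokens)
model.resize_token_embeddings(len(tokenizer))

passage = "The prime minister met the foreign delegation in Paris on Monday."
trigger = "met"                                  # event trigger (assumed given)
template = "<arg1> <arg2> <arg3>"                # participant, participant, place

source = f"{passage} </s> {trigger} </s> {template}"
inputs = tokenizer(source, return_tensors="pt")

# Training would supervise the decoder to emit the template with slots replaced
# by gold argument spans (e.g. "prime minister delegation Paris"); at test time
# we generate and parse the filled slots back out of the output string.
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because the template carries no language-specific words, the same input-output format can be reused unchanged when the trained model is applied to passages in a target language.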
Related papers
- Accelerating Multilingual Language Model for Excessively Tokenized Languages [3.5570874721859016]
Tokenizers in large language models (LLMs) often fragment text into character- or Unicode-level tokens for languages written in non-Roman scripts.
We introduce a simple yet effective framework to accelerate text generation in such languages.
arXiv Detail & Related papers (2024-01-19T12:26:57Z)
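The over-fragmentation problem that motivates this work is easy to observe with a stock English-centric tokenizer; the snippet below only illustrates the problem and is not the paper's acceleration framework.

```python
# Illustrates the over-fragmentation problem (not the paper's method):
# an English-centric byte-level BPE vocabulary spends far more tokens per
# character on non-Roman scripts than on English.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # English-centric BPE
samples = {
    "English": "The meeting was postponed until next week.",
    "Korean": "회의가 다음 주로 연기되었습니다.",
}
for lang, text in samples.items():
    n_tokens = len(tokenizer(text, add_special_tokens=False)["input_ids"])
    print(f"{lang:8s} {n_tokens:3d} tokens for {len(text)} characters "
          f"({n_tokens / len(text):.2f} tokens/char)")
```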
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Modeling Sequential Sentence Relation to Improve Cross-lingual Dense Retrieval [87.11836738011007]
We propose a multilingual language model called the masked sentence model (MSM).
MSM consists of a sentence encoder to generate the sentence representations, and a document encoder applied to a sequence of sentence vectors from a document.
To train the model, we propose a masked sentence prediction task, which masks and predicts the sentence vector via a hierarchical contrastive loss with sampled negatives.
arXiv Detail & Related papers (2023-02-03T09:54:27Z)
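As a rough illustration of the masked sentence prediction idea, the sketch below assumes pre-computed sentence vectors and simplifies the hierarchical contrastive loss to a plain contrastive objective; dimensions, encoders, and the negative-sampling scheme are stand-ins.

```python
# Rough sketch of masked sentence prediction with a contrastive loss, assuming
# pre-computed sentence vectors; not the MSM implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

d = 768
doc_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=8, batch_first=True), num_layers=2
)
mask_vec = nn.Parameter(torch.randn(d))

# sent_vecs: sentence representations of one document from the sentence encoder.
sent_vecs = torch.randn(1, 10, d)          # (batch, num_sentences, dim)
masked_pos = 4
inputs = sent_vecs.clone()
inputs[0, masked_pos] = mask_vec           # mask one sentence vector

pred = doc_encoder(inputs)[0, masked_pos]  # document encoder predicts the masked slot

# Contrastive objective: the true sentence vector is the positive; sentence
# vectors sampled from other documents act as negatives.
positive = sent_vecs[0, masked_pos]
negatives = torch.randn(32, d)             # stand-in for sampled negatives
candidates = torch.cat([positive.unsqueeze(0), negatives], dim=0)
logits = F.cosine_similarity(pred.unsqueeze(0), candidates) / 0.05
loss = F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))
print(loss.item())
```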
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
arXiv Detail & Related papers (2022-05-24T03:35:00Z)
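Probing a series of checkpoints typically means freezing each one, extracting representations, and fitting a light classifier per linguistic task; the sketch below uses a single public checkpoint and toy labels as stand-ins for the actual checkpoint series and probing suite.

```python
# Illustrative probing loop (checkpoint list and labels are toy stand-ins):
# freeze each pretraining checkpoint, extract sentence representations, and
# fit a linear probe for a linguistic task.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

checkpoints = ["xlm-roberta-base"]   # in practice: intermediate pretraining checkpoints
texts = ["The cat sat on the mat.", "Sie hat das Buch gelesen."]
labels = [0, 1]                      # toy task labels

for ckpt in checkpoints:
    tokenizer = AutoTokenizer.from_pretrained(ckpt)
    model = AutoModel.from_pretrained(ckpt).eval()
    with torch.no_grad():
        enc = tokenizer(texts, padding=True, return_tensors="pt")
        # mean-pool the last hidden layer as a fixed sentence representation
        feats = model(**enc).last_hidden_state.mean(dim=1).numpy()
    probe = LogisticRegression(max_iter=1000).fit(feats, labels)
    print(ckpt, "probe accuracy:", probe.score(feats, labels))
```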
- Language Model Priming for Cross-Lingual Event Extraction [1.8734449181723827]
We present a novel, language-agnostic approach to "priming" language models for the task of event extraction.
We show that by enabling the language model to better compensate for the deficits of sparse and noisy training data, our approach improves both trigger and argument detection and classification significantly over the state of the art in a zero-shot cross-lingual setting.
arXiv Detail & Related papers (2021-09-25T15:19:32Z)
- Language Models are Few-shot Multilingual Learners [66.11011385895195]
We evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages.
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones.
arXiv Detail & Related papers (2021-09-16T03:08:22Z)
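A sketch of the few-shot setup: English demonstrations in the prompt, a non-English query, and the generated continuation read off as the label. The model name and prompt format are placeholders, not the configurations evaluated in the paper.

```python
# Sketch of few-shot in-context classification across languages: English
# demonstrations, a non-English (German) query. Model name and prompt format
# are placeholder assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"   # placeholder multilingual causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = (
    "Review: The movie was fantastic. Sentiment: positive\n"
    "Review: I wasted two hours of my life. Sentiment: negative\n"
    "Review: Der Film war wunderbar. Sentiment:"
)
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=2, do_sample=False)
# Decode only the continuation and read it off as the predicted label.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```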
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
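A generic illustration of mutual-information-based feature decomposition (a MINE-style bound, not necessarily the estimator used in the paper): the pooled representation is split by two projection heads, and a small critic scores the dependence between the two parts; training would have the critic maximize the bound while the projections minimize it.

```python
# Generic illustration of MI-based feature decomposition (Donsker-Varadhan /
# MINE-style bound); the projections, critic, and batch are toy stand-ins.
import math
import torch
import torch.nn as nn

d = 768
proj_invariant = nn.Linear(d, 256)   # intended to keep domain-invariant content
proj_specific = nn.Linear(d, 256)    # intended to absorb domain-specific content
critic = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 1))

h = torch.randn(32, d)               # pooled encoder representations for a batch
z_inv, z_spec = proj_invariant(h), proj_specific(h)

# DV bound on MI: E_joint[T] - log E_marginal[exp(T)], where marginal pairs
# are formed by shuffling z_spec within the batch.
joint = critic(torch.cat([z_inv, z_spec], dim=-1))
marginal = critic(torch.cat([z_inv, z_spec[torch.randperm(32)]], dim=-1))
mi_estimate = joint.mean() - (torch.logsumexp(marginal, dim=0) - math.log(32)).squeeze()
print(mi_estimate.item())
```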
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the accuracy of this information and accepts no responsibility for any consequences of its use.