MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information
Extraction
- URL: http://arxiv.org/abs/2308.06552v2
- Date: Wed, 20 Sep 2023 14:37:38 GMT
- Title: MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information
Extraction
- Authors: Tongliang Li, Zixiang Wang, Linzheng Chai, Jian Yang, Jiaqi Bai, Yuwei
Yin, Jiaheng Liu, Hongcheng Guo, Liqun Yang, Hebboul Zine el-abidine, Zhoujun
Li
- Abstract summary: Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages.
Previous work uses a shared cross-lingual pre-trained model to handle the different languages but underuses the potential of the language-specific representation.
We propose an effective multi-stage tuning framework called MT4CrossIE, designed for enhancing cross-lingual open information extraction.
- Score: 38.88339164947934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-lingual open information extraction aims to extract structured
information from raw text across multiple languages. Previous work uses a
shared cross-lingual pre-trained model to handle the different languages but
underuses the potential of the language-specific representation. In this paper,
we propose an effective multi-stage tuning framework called MT4CrossIE,
designed for enhancing cross-lingual open information extraction by injecting
language-specific knowledge into the shared model. Specifically, the
cross-lingual pre-trained model is first tuned in a shared semantic space (e.g., the embedding matrix) while the encoder is kept fixed, and the remaining components are then optimized in the second stage. After sufficient training, we freeze the pre-trained model and tune multiple extra low-rank language-specific modules using
mixture-of-LoRAs for model-based cross-lingual transfer. In addition, we
leverage two-stage prompting to encourage the large language model (LLM) to
annotate the multi-lingual raw data for data-based cross-lingual transfer. The
model is trained with multi-lingual objectives on our proposed dataset
OpenIE4++ by combining the model-based and data-based transfer techniques.
Experimental results on various benchmarks emphasize the importance of
aggregating multiple plug-and-play language-specific modules and demonstrate the effectiveness of MT4CrossIE in cross-lingual OIE (code: https://github.com/CSJianYang/Multilingual-Multimodal-NLP).
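To make the model-based transfer concrete, the sketch below shows one way a mixture-of-LoRAs can sit on top of a frozen shared layer: each language gets a low-rank adapter, and the adapters' outputs are aggregated by learned weights. The class name, softmax gating, rank, and scaling are assumptions for illustration, not the released MT4CrossIE code.

```python
# Minimal sketch of a mixture-of-LoRAs layer (illustrative only; names,
# gating scheme, and hyperparameters are assumptions, not the paper's code).
import torch
import torch.nn as nn


class MixtureOfLoRALinear(nn.Module):
    """A frozen shared linear layer plus per-language low-rank adapters."""

    def __init__(self, base: nn.Linear, num_languages: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the shared model weights
            p.requires_grad = False
        d_in, d_out = base.in_features, base.out_features
        # One (A, B) low-rank pair per language-specific module.
        self.A = nn.Parameter(torch.randn(num_languages, rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_languages, d_out, rank))
        self.scaling = alpha / rank
        # Learned weights for aggregating the language-specific modules.
        self.gate = nn.Parameter(torch.zeros(num_languages))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_in)
        out = self.base(x)
        weights = torch.softmax(self.gate, dim=0)  # (num_languages,)
        for lang in range(self.A.shape[0]):
            delta = (x @ self.A[lang].T) @ self.B[lang].T  # low-rank update
            out = out + weights[lang] * self.scaling * delta
        return out


# Usage: wrap a projection inside a frozen multilingual encoder.
layer = MixtureOfLoRALinear(nn.Linear(768, 768), num_languages=4)
hidden = torch.randn(2, 16, 768)
print(layer(hidden).shape)  # torch.Size([2, 16, 768])
```

Consistent with the abstract, only the adapters and the aggregation weights receive gradients here; the shared pre-trained weights stay frozen.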
Related papers
- Distilling Efficient Language-Specific Models for Cross-Lingual Transfer [75.32131584449786] (arXiv 2023-06-02)
Massively multilingual Transformers (MMTs) are widely used for cross-lingual transfer learning.
MMTs' language coverage makes them unnecessarily expensive to deploy in terms of model size, inference time, energy, and hardware cost.
We propose to extract compressed, language-specific models from MMTs which retain the capacity of the original MMTs for cross-lingual transfer.
- Multilingual Multimodal Learning with Machine Translated Text [27.7207234512674] (arXiv 2022-10-24)
We investigate whether machine translating English multimodal data can be an effective proxy for the lack of readily available multilingual data.
We propose two metrics for automatically removing unreliable translations from the resulting datasets.
In experiments on five tasks across 20 languages in the IGLUE benchmark, we show that translated data can provide a useful signal for multilingual multimodal learning.
- Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training [21.017471684853987] (arXiv 2022-06-01)
We introduce Cross-View Language Modeling, a simple and effective pre-training framework that unifies cross-lingual and cross-modal pre-training.
Our approach is motivated by a key observation that cross-lingual and cross-modal pre-training share the same goal of aligning two different views of the same object into a common semantic space.
CLM is the first multi-lingual multi-modal pre-trained model that surpasses the translate-test performance of representative English vision-language models by zero-shot cross-lingual transfer.
- Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training [21.934439663979663] (arXiv 2022-02-28)
A multi-branch multilingual language model (MBLM) is built on multilingual pre-trained language models (MPLMs).
The method transfers knowledge from high-performance monolingual models with a teacher-student framework (a generic version of this distillation step is sketched below).
Results on two cross-lingual classification tasks show that, with only the task's supervised data used, our method improves both the supervised and zero-shot performance of MPLMs.
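The teacher-student transfer in the entry above can be pictured with a generic knowledge-distillation step, sketched below in PyTorch; the blended cross-entropy/KL loss, temperature, and weighting are standard choices assumed here, not the paper's exact recipe.

```python
# Generic teacher-student distillation step (a sketch of the standard recipe;
# the temperature and loss weighting are assumptions, not the paper's values).
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with KL divergence to the teacher's soft targets."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * hard + (1.0 - alpha) * soft


# Toy usage: 4 examples, 3 classes.
student_logits = torch.randn(4, 3, requires_grad=True)
teacher_logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```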
- A Multilingual Bag-of-Entities Model for Zero-Shot Cross-Lingual Text Classification [16.684856745734944] (arXiv 2021-10-15)
We present a multilingual bag-of-entities model that boosts the performance of zero-shot cross-lingual text classification.
It leverages the multilingual nature of Wikidata: entities in multiple languages representing the same concept are defined with a unique identifier.
A model trained on entity features in a resource-rich language can thus be directly applied to other languages.
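The entry above turns on one observation: mentions in different languages can be linked to the same language-independent Wikidata identifier, so entity features built in one language transfer to others. A toy sketch under that assumption follows; the mention-to-QID dictionary and entity vocabulary are hypothetical placeholders, not the paper's entity linker.

```python
# Toy bag-of-entities featurizer keyed on language-independent Wikidata QIDs.
# The mention dictionary is a hypothetical placeholder, not the paper's linker.
from collections import Counter

import numpy as np

# Mentions in different languages resolving to the same Wikidata identifiers.
MENTION_TO_QID = {
    "germany": "Q183", "deutschland": "Q183", "allemagne": "Q183",
    "football": "Q2736", "fussball": "Q2736",
}
QID_VOCAB = {"Q183": 0, "Q2736": 1}  # entity vocabulary shared across languages


def bag_of_entities(text: str) -> np.ndarray:
    """Count linked entities and return a fixed-size feature vector."""
    counts = Counter(
        MENTION_TO_QID[token]
        for token in text.lower().split()
        if token in MENTION_TO_QID
    )
    vec = np.zeros(len(QID_VOCAB), dtype=np.float32)
    for qid, n in counts.items():
        vec[QID_VOCAB[qid]] = n
    return vec


# The same feature vector is produced for an English and a German sentence,
# which is what makes zero-shot cross-lingual transfer possible.
print(bag_of_entities("Germany loves football"))      # [1. 1.]
print(bag_of_entities("Deutschland liebt Fussball"))  # [1. 1.]
```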
- Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking [84.50302759362698] (arXiv 2021-09-28)
We enhance the transfer learning process by intermediate fine-tuning of pretrained multilingual models.
We use parallel and conversational movie subtitles datasets to design cross-lingual intermediate tasks.
We achieve impressive improvements (> 20% on goal accuracy) on the parallel MultiWoZ dataset and Multilingual WoZ dataset.
- UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training [52.852163987208826] (arXiv 2021-04-01)
UC2 is the first machine translation-augmented framework for cross-lingual cross-modal representation learning.
We propose two novel pre-training tasks, namely Masked Region-to-Token Modeling (MRTM) and Visual Translation Language Modeling (VTLM)
Our proposed framework achieves new state-of-the-art on diverse non-English benchmarks while maintaining comparable performance to monolingual pre-trained models on English tasks.
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085] (arXiv 2020-10-27)
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831] (arXiv 2020-10-18)
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597] (arXiv 2020-09-10)
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language (a toy version is sketched below).
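One reading of that self-teaching loss, sketched as code: predictions on the source-language text become soft pseudo-labels, and predictions on the target-language translation are pulled toward them with a KL-divergence term. The detach-based pseudo-labeling below is an assumption for illustration, not FILTER's released implementation.

```python
# Sketch of a KL-divergence self-teaching loss on soft pseudo-labels
# (an illustration of the idea in the summary, not FILTER's actual code).
import torch
import torch.nn.functional as F


def self_teaching_loss(source_logits: torch.Tensor, target_logits: torch.Tensor) -> torch.Tensor:
    """Pull target-language predictions toward soft pseudo-labels from the source side."""
    pseudo_labels = F.softmax(source_logits.detach(), dim=-1)  # treated as fixed targets
    log_probs = F.log_softmax(target_logits, dim=-1)
    return F.kl_div(log_probs, pseudo_labels, reduction="batchmean")


# Toy usage: logits for the same 4 examples seen in the source language
# and in their target-language translations.
source_logits = torch.randn(4, 3)
target_logits = torch.randn(4, 3, requires_grad=True)
loss = self_teaching_loss(source_logits, target_logits)
loss.backward()
```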