SkillNet-X: A Multilingual Multitask Model with Sparsely Activated
Skills
- URL: http://arxiv.org/abs/2306.16176v1
- Date: Wed, 28 Jun 2023 12:53:30 GMT
- Title: SkillNet-X: A Multilingual Multitask Model with Sparsely Activated
Skills
- Authors: Zhangyin Feng, Yong Dai, Fan Zhang, Duyu Tang, Xiaocheng Feng,
Shuangzhi Wu, Bing Qin, Yunbo Cao and Shuming Shi
- Abstract summary: This paper proposes a general multilingual multitask model, named SkillNet-X.
We define several language-specific skills and task-specific skills, each of which corresponds to a skill module.
We evaluate SkillNet-X on eleven natural language understanding datasets in four languages.
- Score: 51.74947795895178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional multitask learning methods can generally exploit common
knowledge only task-wise or language-wise, losing either cross-language or
cross-task knowledge. This paper proposes a general multilingual multitask
model, named SkillNet-X, which enables a single model to tackle many different
tasks from different languages. To this end, we define several
language-specific skills and task-specific skills, each of which corresponds to
a skill module. SkillNet-X sparsely activates parts of the skill modules which
are relevant either to the target task or the target language. Acting as
knowledge transit hubs, skill modules are capable of absorbing task-related
knowledge and language-related knowledge consecutively. Based on Transformer,
we modify the multi-head attention layer and the feed-forward network layer to
accommodate skill modules. We evaluate SkillNet-X on eleven natural language
understanding datasets in four languages. Results show that SkillNet-X performs
better than task-specific baselines and two multitask learning baselines (i.e., a
dense joint model and a Mixture-of-Experts model). Furthermore, skill
pre-training further improves the performance of SkillNet-X on almost all
datasets. To investigate the generalization of our model, we conduct
experiments on two new tasks and find that SkillNet-X significantly outperforms
baselines.
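The abstract describes the core mechanism only at a high level: each language and each task owns a skill module, and only the modules relevant to the current (task, language) pair are activated inside modified attention and feed-forward layers. The snippet below is a minimal sketch of that sparse-activation idea for a feed-forward sublayer; the per-skill FFN blocks, the skill names, and the mean aggregation over active skills are illustrative assumptions, not the authors' exact design.

```python
# Illustrative sketch only: SkillNet-X attaches a skill module to each language and
# each task and sparsely activates the ones relevant to the current input. The
# granularity (one FFN block per skill) and the aggregation (mean over active
# skills) below are assumptions for clarity, not the paper's exact parameterization.
import torch
import torch.nn as nn


class SkillFFN(nn.Module):
    """Feed-forward sublayer with sparsely activated skill modules."""

    def __init__(self, d_model: int, d_ff: int, skills: list[str]):
        super().__init__()
        # One feed-forward block per skill (e.g. "task_nli", "lang_zh").
        self.skills = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for name in skills
        })

    def forward(self, x: torch.Tensor, active_skills: list[str]) -> torch.Tensor:
        # Only the modules named in `active_skills` are run; the rest stay idle,
        # which is what "sparsely activated" means here.
        outputs = [self.skills[name](x) for name in active_skills]
        return torch.stack(outputs, dim=0).mean(dim=0)


# Usage: for a Chinese NLI example, activate the NLI task skill and the Chinese
# language skill (plus a shared "general" skill, a common choice in such designs).
layer = SkillFFN(d_model=768, d_ff=3072,
                 skills=["general", "task_nli", "task_ner", "lang_en", "lang_zh"])
hidden = torch.randn(2, 16, 768)  # (batch, sequence, d_model)
out = layer(hidden, active_skills=["general", "task_nli", "lang_zh"])
print(out.shape)  # torch.Size([2, 16, 768])
```

Unlike the dense joint baseline (all parameters shared across every task and language) or a standard Mixture-of-Experts layer (experts chosen by a learned router), the active set in this sketch is fixed deterministically by the target task and language, which matches the abstract's description of skill selection.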
Related papers
- FonMTL: Towards Multitask Learning for the Fon Language [1.9370453715137865]
We present the first exploratory approach to multitask learning for enhancing model capabilities in Natural Language Processing for the Fon language.
We leverage two language model heads as encoders to build shared representations of the inputs, and we use linear layer blocks for per-task classification.
Our results on the NER and POS tasks for Fon, show competitive (or better) performances compared to several multilingual pretrained language models finetuned on single tasks.
arXiv Detail & Related papers (2023-08-28T03:26:21Z) - One Model, Multiple Tasks: Pathways for Natural Language Understanding [34.58880663537492]
This paper presents a Pathways approach to handle many tasks at once.
Unlike prevailing single-purpose models that overspecialize in individual tasks and learn from scratch when extended to new tasks, our approach is general-purpose, with the ability to stitch together existing skills to learn new tasks more effectively.
arXiv Detail & Related papers (2022-03-07T11:48:09Z) - Prix-LM: Pretraining for Multilingual Knowledge Base Construction [59.02868906044296]
We propose a unified framework, Prix-LM, for multilingual knowledge base construction and completion.
We leverage two types of knowledge, monolingual triples and cross-lingual links, extracted from existing multilingual KBs.
Experiments on standard entity-related tasks, such as link prediction in multiple languages, cross-lingual entity linking and bilingual lexicon induction, demonstrate its effectiveness.
arXiv Detail & Related papers (2021-10-16T02:08:46Z) - XLM-K: Improving Cross-Lingual Language Model Pre-Training with
Multilingual Knowledge [31.765178013933134]
Cross-lingual pre-training has achieved great success using monolingual and bilingual plain text corpora.
We propose XLM-K, a cross-lingual language model incorporating multilingual knowledge in pre-training.
arXiv Detail & Related papers (2021-09-26T11:46:20Z) - XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation [93.80733419450225]
This paper analyzes the current state of cross-lingual transfer learning.
We extend XTREME to XTREME-R, which consists of an improved set of ten natural language understanding tasks.
arXiv Detail & Related papers (2021-04-15T12:26:12Z) - Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study multilingual language models to understand their capability and adaptability in the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z) - Meta-Learning for Effective Multi-task and Multilingual Modelling [23.53779501937046]
We propose a meta-learning approach to learn the interactions between both tasks and languages.
We present experiments on five different tasks and six different languages from the XTREME multilingual benchmark dataset.
arXiv Detail & Related papers (2021-01-25T19:30:26Z) - CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot
Cross-Lingual NLP [68.2650714613869]
We propose a data augmentation framework to generate multi-lingual code-switching data to fine-tune mBERT.
Compared with existing work, our method does not rely on bilingual sentences for training and requires only one training process for multiple target languages. (A minimal sketch of this substitution-based augmentation idea appears after this list.)
arXiv Detail & Related papers (2020-06-11T13:15:59Z) - XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating
Cross-lingual Generalization [128.37244072182506]
XTREME (Cross-lingual TRansfer Evaluation of Multilingual Encoders) is a benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z) - Zero-Shot Cross-Lingual Transfer with Meta Learning [45.29398184889296]
We consider the setting of training models on multiple languages at the same time, when little or no data is available for languages other than English.
We show that this challenging setup can be approached using meta-learning.
We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks.
arXiv Detail & Related papers (2020-03-05T16:07:32Z)
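As referenced in the CoSDA-ML entry above, dictionary-based code-switching augmentation substitutes target-language translations into source sentences so that a single fine-tuning run can serve multiple target languages. The sketch below illustrates that substitution idea only; the toy dictionary, the substitution rate, and the function name are assumptions for illustration, not the released CoSDA-ML implementation.

```python
# Minimal sketch of dictionary-based code-switching augmentation in the spirit of
# CoSDA-ML: randomly replace source-language tokens with translations drawn from
# bilingual dictionaries. The toy dictionary and the 0.3 substitution rate are
# illustrative assumptions, not values from the paper.
import random

# word -> {language: translation}; in practice these come from large bilingual dictionaries.
TOY_DICT = {
    "good": {"es": "bueno", "fr": "bon"},
    "morning": {"es": "mañana", "fr": "matin"},
    "friend": {"es": "amigo", "fr": "ami"},
}


def code_switch(tokens: list[str], languages: list[str], rate: float = 0.3, seed: int = 0) -> list[str]:
    """Replace each dictionary word with a random target-language translation
    with probability `rate`, producing mixed-language training text."""
    rng = random.Random(seed)
    switched = []
    for tok in tokens:
        entry = TOY_DICT.get(tok.lower())
        if entry and rng.random() < rate:
            lang = rng.choice(languages)
            switched.append(entry.get(lang, tok))
        else:
            switched.append(tok)
    return switched


print(code_switch("good morning my friend".split(), languages=["es", "fr"]))
# Output depends on the substitution rate and seed; some tokens are swapped
# for Spanish or French translations while the rest stay in English.
```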
This list is automatically generated from the titles and abstracts of the papers on this site.