Multitask Learning for Cross-Lingual Transfer of Semantic Dependencies
- URL: http://arxiv.org/abs/2004.14961v1
- Date: Thu, 30 Apr 2020 17:09:51 GMT
- Title: Multitask Learning for Cross-Lingual Transfer of Semantic Dependencies
- Authors: Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab
- Abstract summary: We develop broad-coverage semantic dependency parsers for languages with no semantically annotated resources.
We leverage a multitask learning framework coupled with an annotation projection method.
We show that our best multitask model improves the labeled F1 score over the single-task baseline by 1.8 in the in-domain SemEval data.
- Score: 21.503766432869437
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We describe a method for developing broad-coverage semantic dependency
parsers for languages for which no semantically annotated resource is
available. We leverage a multitask learning framework coupled with an
annotation projection method. We transfer supervised semantic dependency parse
annotations from a rich-resource language to a low-resource language through
parallel data, and train a semantic parser on projected data. We make use of
supervised syntactic parsing as an auxiliary task in a multitask learning
framework, and show that with different multitask learning settings, we
consistently improve over the single-task baseline. In the setting in which
English is the source, and Czech is the target language, our best multitask
model improves the labeled F1 score over the single-task baseline by 1.8 in the
in-domain SemEval data (Oepen et al., 2015), as well as 2.5 in the
out-of-domain test set. Moreover, we observe that syntactic and semantic
dependency direction match is an important factor in improving the results.
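The multitask setup described in the abstract, a shared encoder with a main semantic head and an auxiliary syntactic head trained under one combined loss, can be illustrated with a minimal NumPy sketch. All names, dimensions, and the auxiliary weight below are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; all dimensions here are illustrative.
n_tokens, d_in, d_hidden = 8, 16, 32
n_sem_labels, n_syn_labels = 5, 4

# Shared encoder parameters and one output head per task.
W_shared = rng.normal(scale=0.1, size=(d_in, d_hidden))
W_sem = rng.normal(scale=0.1, size=(d_hidden, n_sem_labels))   # semantic head (main task)
W_syn = rng.normal(scale=0.1, size=(d_hidden, n_syn_labels))   # syntactic head (auxiliary task)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

x = rng.normal(size=(n_tokens, d_in))   # token features
h = np.tanh(x @ W_shared)               # shared representation used by both heads

p_sem = softmax(h @ W_sem)              # per-token semantic label distribution
p_syn = softmax(h @ W_syn)              # per-token syntactic label distribution

# Random gold labels stand in for projected semantic annotations and
# supervised syntactic annotations.
y_sem = rng.integers(0, n_sem_labels, size=n_tokens)
y_syn = rng.integers(0, n_syn_labels, size=n_tokens)

# Multitask objective: main-task loss plus a weighted auxiliary-task loss.
aux_weight = 0.5
loss_sem = -np.log(p_sem[np.arange(n_tokens), y_sem]).mean()
loss_syn = -np.log(p_syn[np.arange(n_tokens), y_syn]).mean()
loss = loss_sem + aux_weight * loss_syn
```

Because both heads read the same representation `h`, gradients from the supervised syntactic task shape the encoder that the semantic parser relies on, which is the mechanism the multitask framework exploits.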
Related papers
- CUNI Submission to MRL 2023 Shared Task on Multi-lingual Multi-task Information Retrieval [5.97515243922116]
We present the Charles University system for the MRL2023 Shared Task on Multi-lingual Multi-task Information Retrieval.
The goal of the shared task was to develop systems for named entity recognition and question answering in several under-represented languages.
Our solutions to both subtasks rely on the translate-test approach.
arXiv Detail & Related papers (2023-10-25T10:22:49Z)
- FonMTL: Towards Multitask Learning for the Fon Language [1.9370453715137865]
We present the first explorative approach to multitask learning, for model capabilities enhancement in Natural Language Processing for the Fon language.
We leverage two language model heads as encoders to build shared representations for the inputs, and we use linear layers blocks for classification relative to each task.
Our results on the NER and POS tasks for Fon, show competitive (or better) performances compared to several multilingual pretrained language models finetuned on single tasks.
arXiv Detail & Related papers (2023-08-28T03:26:21Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
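The Optimal Transport idea named in this entry, measuring (and then minimizing) the divergence between two distributions, can be sketched with a textbook entropic Sinkhorn iteration. The distributions, cost matrix, and regularization strength below are illustrative assumptions and not taken from the paper:

```python
import numpy as np

def sinkhorn(a, b, cost, eps=0.1, n_iter=200):
    """Entropic optimal transport between discrete distributions a and b."""
    K = np.exp(-cost / eps)          # Gibbs kernel from the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)            # rescale columns toward marginal b
        u = a / (K @ v)              # rescale rows toward marginal a
    plan = u[:, None] * K * v[None, :]
    return plan, float((plan * cost).sum())  # transport plan and OT cost

# Two toy distributions (e.g., latent-variable posteriors in two languages).
a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
plan, ot_cost = sinkhorn(a, b, cost)
```

The returned OT cost is the divergence one would drive toward zero during training; the exact cost for this pair is 0.25 (moving 0.25 of the mass across the unit-cost cells), and the entropic approximation lands close to it.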
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Multilingual Word Sense Disambiguation with Unified Sense Representation [55.3061179361177]
We propose building knowledge-based and supervision-based Multilingual Word Sense Disambiguation (MWSD) systems.
We build unified sense representations for multiple languages and address the annotation scarcity problem for MWSD by transferring annotations from rich-sourced languages to poorer ones.
Evaluations on the SemEval-13 and SemEval-15 datasets demonstrate the effectiveness of our methodology.
arXiv Detail & Related papers (2022-10-14T01:24:03Z)
- Incorporating Linguistic Knowledge for Abstractive Multi-document Summarization [20.572283625521784]
We develop a neural network based abstractive multi-document summarization (MDS) model.
We process the dependency information into the linguistic-guided attention mechanism.
With the help of linguistic signals, sentence-level relations can be correctly captured.
arXiv Detail & Related papers (2021-09-23T08:13:35Z)
- Structured Prediction as Translation between Augmented Natural Languages [109.50236248762877]
We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks.
Instead of tackling the problem by training task-specific discriminative models, we frame it as a translation task between augmented natural languages.
Our approach can match or outperform task-specific models on all tasks, and in particular, achieves new state-of-the-art results on joint entity and relation extraction.
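The "augmented natural language" framing in this entry, serializing structured annotations as inline markup so a seq2seq model can emit them as text, can be sketched with a small encoder for entity spans. The bracket format and example spans below are illustrative assumptions, not the paper's exact markup:

```python
def augment(tokens, entities):
    """Serialize entity spans as inline markup over a token list.

    entities: list of (start, end, label) spans, end exclusive.
    """
    out = []
    for i, tok in enumerate(tokens):
        for start, end, label in entities:
            if i == start:
                out.append("[")          # open a span before its first token
        out.append(tok)
        for start, end, label in entities:
            if i == end - 1:
                out.append(f"| {label} ]")  # close the span, naming its label
    return " ".join(out)

augmented = augment(["Tolkien", "wrote", "The", "Hobbit"],
                    [(0, 1, "person"), (2, 4, "book")])
# augmented == "[ Tolkien | person ] wrote [ The Hobbit | book ]"
```

Training then pairs the plain token sequence with its augmented string, turning structured prediction into text-to-text translation.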
arXiv Detail & Related papers (2021-01-14T18:32:21Z)
- Cross-lingual Dependency Parsing as Domain Adaptation [48.69930912510414]
Cross-lingual transfer learning is as essential as in-domain learning.
We exploit a pre-training task that extracts universal features without supervision.
We combine traditional self-training with the two pre-training tasks.
arXiv Detail & Related papers (2020-12-24T08:14:36Z)
- Multilingual Irony Detection with Dependency Syntax and Neural Models [61.32653485523036]
The paper focuses on the contribution of syntactic knowledge, exploiting linguistic resources where syntax is annotated according to the Universal Dependencies scheme.
The results suggest that fine-grained dependency-based syntactic information is informative for the detection of irony.
arXiv Detail & Related papers (2020-11-11T11:22:05Z)
- Hierarchical Multi Task Learning with Subword Contextual Embeddings for Languages with Rich Morphology [5.5217350574838875]
Morphological information is important for many sequence labeling tasks in Natural Language Processing (NLP).
We propose using subword contextual embeddings to capture morphological information for languages with rich morphology.
Our model outperforms previous state-of-the-art models on both tasks for the Turkish language.
arXiv Detail & Related papers (2020-04-25T22:55:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.