Hierarchical Multi Task Learning with Subword Contextual Embeddings for Languages with Rich Morphology
- URL: http://arxiv.org/abs/2004.12247v1
- Date: Sat, 25 Apr 2020 22:55:56 GMT
- Title: Hierarchical Multi Task Learning with Subword Contextual Embeddings for Languages with Rich Morphology
- Authors: Arda Akdemir and Tetsuo Shibuya and Tunga Güngör
- Abstract summary: Morphological information is important for many sequence labeling tasks in Natural Language Processing (NLP).
We propose using subword contextual embeddings to capture morphological information for languages with rich morphology.
Our model outperforms previous state-of-the-art models on both tasks for the Turkish language.
- Score: 5.5217350574838875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Morphological information is important for many sequence labeling tasks in
Natural Language Processing (NLP). Yet, existing approaches rely heavily on
manual annotations or external software to capture this information. In this
study, we propose using subword contextual embeddings to capture the
morphological information for languages with rich morphology. In addition, we
incorporate these embeddings in a hierarchical multi-task setting which, to the
best of our knowledge, has not been employed before. Evaluated on Dependency Parsing
(DEP) and Named Entity Recognition (NER) tasks, which are shown to benefit
greatly from morphological information, our final model outperforms previous
state-of-the-art models on both tasks for the Turkish language. Moreover, we
show net improvements of 18.86% and 4.61% F-1 over the previously proposed
multi-task learner in the same setting for the DEP and the NER tasks,
respectively. Empirical results for five different MTL settings show that
incorporating subword contextual embeddings brings significant improvements for
both tasks. In addition, we observe that multi-task learning consistently
improves the performance of the DEP component.
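The abstract describes the approach only at a high level. As a concrete illustration, below is a minimal PyTorch sketch of a hierarchical multi-task tagger in which a subword contextual encoder feeds a NER head at a lower block and a dependency-arc scorer at a deeper block. The stand-in Transformer encoder, the module names, and all sizes are illustrative assumptions, not the authors' implementation; in practice a pretrained BERT-style model would supply the subword contextual embeddings.

import torch
import torch.nn as nn


class HierarchicalMTLSketch(nn.Module):
    """Hypothetical sketch: shared subword encoder, NER supervised at a lower
    block, dependency (DEP) arc scoring supervised at a deeper block."""

    def __init__(self, vocab_size=32000, dim=256, n_ner_labels=9):
        super().__init__()
        # Stand-in for a pretrained subword contextual encoder (a multilingual
        # BERT-style model would be used in practice).
        self.subword_emb = nn.Embedding(vocab_size, dim)
        self.lower = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True),
            num_layers=2)
        self.upper = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True),
            num_layers=2)
        self.ner_head = nn.Linear(dim, n_ner_labels)   # per-subword NER logits
        self.arc_dep = nn.Linear(dim, dim)             # dependent projection
        self.arc_head = nn.Linear(dim, dim)            # head-word projection

    def forward(self, subword_ids):
        x = self.subword_emb(subword_ids)              # (batch, seq, dim)
        low = self.lower(x)                            # shared lower block
        ner_logits = self.ner_head(low)                # NER reads the lower block
        high = self.upper(low)                         # DEP builds on top of it
        dep, head = self.arc_dep(high), self.arc_head(high)
        # arc_scores[b, i, j] = score of token j being the syntactic head of token i
        arc_scores = torch.einsum("bid,bjd->bij", dep, head)
        return ner_logits, arc_scores


if __name__ == "__main__":
    model = HierarchicalMTLSketch()
    toy_ids = torch.randint(0, 32000, (2, 12))         # toy batch of subword ids
    ner_logits, arc_scores = model(toy_ids)
    print(ner_logits.shape, arc_scores.shape)          # (2, 12, 9) (2, 12, 12)

Training such a model would jointly minimize a weighted sum of a token-level cross-entropy for NER and an arc cross-entropy for DEP; the paper's exact encoder, loss weighting, and decoding are not reproduced here.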
Related papers
- Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models [22.676688441884465]
Fine-tuning pre-trained large language models (LLMs) on a diverse array of tasks has become a common approach for building models.
This study investigates the task-specific information encoded in pre-trained LLMs and the effects of instruction tuning on their representations.
arXiv Detail & Related papers (2024-10-25T23:38:28Z)
- VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models [76.94378391979228]
We introduce a new, more demanding task known as Interleaved Image-Text Comprehension (IITC).
This task challenges models to discern and disregard superfluous elements in both images and text to accurately answer questions.
In support of this task, we further craft a new VEGA dataset, tailored for the IITC task on scientific content, and devise a subtask, Image-Text Association (ITA).
arXiv Detail & Related papers (2024-06-14T17:59:40Z)
- Multi-Task Learning for Front-End Text Processing in TTS [15.62497569424995]
We propose a multi-task learning (MTL) model for jointly performing three tasks that are commonly solved in a text-to-speech front-end.
Our framework utilizes a tree-like structure with a trunk that learns shared representations, followed by separate task-specific heads (a generic sketch of this trunk-and-heads pattern appears after this list).
arXiv Detail & Related papers (2024-01-12T02:13:21Z)
- Rethinking and Improving Multi-task Learning for End-to-end Speech Translation [51.713683037303035]
We investigate the consistency between different tasks, considering different times and modules.
We find that the textual encoder primarily facilitates cross-modal conversion, but the presence of noise in speech impedes the consistency between text and speech representations.
We propose an improved multi-task learning (IMTL) approach for the ST task, which bridges the modal gap by mitigating the difference in length and representation.
arXiv Detail & Related papers (2023-11-07T08:48:46Z)
- Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5 [50.574918785575655]
We compare sequential fine-tuning with a model for multi-task learning in the context of boosting performance on two tasks.
Our results show that while sequential multi-task learning can be tuned to be good at the first of two target tasks, it performs less well on the second and additionally struggles with overfitting.
arXiv Detail & Related papers (2022-10-31T13:26:08Z)
- Visualizing the Relationship Between Encoded Linguistic Information and Task Performance [53.223789395577796]
We study the dynamic relationship between the encoded linguistic information and task performance from the viewpoint of Pareto Optimality.
We conduct experiments on two popular NLP tasks, i.e., machine translation and language modeling, and investigate the relationship between several kinds of linguistic information and task performance.
Our empirical findings suggest that some syntactic information is helpful for NLP tasks whereas encoding more syntactic information does not necessarily lead to better performance.
arXiv Detail & Related papers (2022-03-29T19:03:10Z)
- Incorporating Linguistic Knowledge for Abstractive Multi-document Summarization [20.572283625521784]
We develop a neural network based abstractive multi-document summarization (MDS) model.
We incorporate dependency information into a linguistic-guided attention mechanism.
With the help of linguistic signals, sentence-level relations can be correctly captured.
arXiv Detail & Related papers (2021-09-23T08:13:35Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA, applied in the pre-training phase, to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
- Multitask Learning for Cross-Lingual Transfer of Semantic Dependencies [21.503766432869437]
We develop broad-coverage semantic dependency parsers for languages with no semantically annotated resources.
We leverage a multitask learning framework coupled with an annotation projection method.
We show that our best multitask model improves the labeled F1 score over the single-task baseline by 1.8 points on the in-domain SemEval data.
arXiv Detail & Related papers (2020-04-30T17:09:51Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results w.r.t. performance, computations and/or memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
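Several of the related papers above, such as the TTS front-end work, rely on the common "shared trunk plus task-specific heads" multi-task pattern. The snippet below is a generic, hypothetical PyTorch sketch of that pattern; the module names, sizes, and task set are illustrative assumptions and do not come from any of the cited papers.

import torch
import torch.nn as nn


class TrunkWithHeads(nn.Module):
    """Hypothetical sketch of the shared-trunk / task-specific-heads MTL pattern."""

    def __init__(self, dim=128, head_sizes=None):
        super().__init__()
        head_sizes = head_sizes or {"task_a": 5, "task_b": 3}   # illustrative tasks
        self.trunk = nn.Sequential(                             # shared representation
            nn.Linear(dim, dim), nn.ReLU(),
            nn.Linear(dim, dim), nn.ReLU())
        self.heads = nn.ModuleDict(                             # one output head per task
            {name: nn.Linear(dim, out) for name, out in head_sizes.items()})

    def forward(self, features):
        shared = self.trunk(features)
        return {name: head(shared) for name, head in self.heads.items()}


if __name__ == "__main__":
    model = TrunkWithHeads()
    outputs = model(torch.randn(4, 128))                        # toy feature batch
    print({name: tuple(out.shape) for name, out in outputs.items()})

During training each head typically contributes its own loss, and the per-task losses are combined with weights that must be tuned or learned.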
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.