Characterization of effects of transfer learning across domains and languages
- URL: http://arxiv.org/abs/2210.01091v1
- Date: Mon, 3 Oct 2022 17:17:07 GMT
- Title: Characterization of effects of transfer learning across domains and languages
- Authors: Sovesh Mohapatra
- Abstract summary: Transfer learning (TL) from pre-trained neural language models has emerged as a powerful technique over the years.
We investigate how TL affects the performance of popular pre-trained models on three natural language processing (NLP) tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With ever-expanding datasets of domains, tasks and languages, transfer
learning (TL) from pre-trained neural language models has emerged as a powerful
technique over the years. Much research has shown the effectiveness of transfer
learning across different domains and tasks. However, it remains unclear when a
transfer will have a positive or a negative impact on model performance. To
address this uncertainty, we investigate how TL affects the performance of
popular pre-trained models such as BERT, RoBERTa and XLNet on three natural
language processing (NLP) tasks. We believe this work offers guidance on when
and what to transfer with respect to domains, multilingual datasets and various
NLP tasks.
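To make the setup concrete, below is a minimal, hypothetical sketch of this kind of transfer: loading a pre-trained encoder and fine-tuning it on a small downstream classification task. It assumes the Hugging Face transformers library and PyTorch; the checkpoint name, toy examples and hyperparameters are illustrative placeholders, not details taken from the paper.
```python
# Minimal sketch of transfer learning from a pre-trained language model
# (assumption: Hugging Face transformers + PyTorch; checkpoint, data and
# hyperparameters are illustrative, not taken from the paper).
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "bert-base-uncased"  # could equally be "roberta-base" or "xlnet-base-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# The encoder weights are transferred; the classification head is newly initialised.
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy labelled examples standing in for a target-domain dataset.
texts = ["the film was a delight", "the plot made no sense"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few epochs of fine-tuning (the transfer step)
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    predictions = model(**batch).logits.argmax(dim=-1)
print(predictions.tolist())
```
Only the classification head is newly initialised here; the encoder weights carry over from pre-training, and it is the effect of that carried-over knowledge, positive or negative, that the paper sets out to characterise across domains and languages.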
Related papers
- Defining Boundaries: The Impact of Domain Specification on Cross-Language and Cross-Domain Transfer in Machine Translation [0.44601285466405083]
Cross-lingual transfer learning offers a promising solution for neural machine translation (NMT).
This paper focuses on the impact of domain specification and linguistic factors on transfer effectiveness.
We evaluate multiple target languages, including Portuguese, Italian, French, Czech, Polish, and Greek.
arXiv Detail & Related papers (2024-08-21T18:28:48Z)
- Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training [56.74440457571821]
We analyze tasks covering syntax, semantics and reasoning, across 2M pre-training steps and five seeds.
We identify critical learning phases across tasks and time, during which subspaces emerge, share information, and later disentangle to specialize.
Our findings have implications for model interpretability, multi-task learning, and learning from limited data.
arXiv Detail & Related papers (2023-10-25T09:09:55Z)
- Analysing Cross-Lingual Transfer in Low-Resourced African Named Entity Recognition [0.10641561702689348]
We investigate the properties of cross-lingual transfer learning between ten low-resourced languages.
We find that models that perform well on a single language often do so at the expense of generalising to others.
The amount of data overlap between the source and target datasets is a better predictor of transfer performance than either the geographical or genetic distance between the languages.
arXiv Detail & Related papers (2023-09-11T08:56:47Z)
- Cross-lingual Transferring of Pre-trained Contextualized Language Models [73.97131976850424]
We propose a novel cross-lingual model transferring framework for PrLMs: TreLM.
To handle the symbol order and sequence length differences between languages, we propose an intermediate "TRILayer" structure.
We show the proposed framework significantly outperforms language models trained from scratch with limited data in both performance and efficiency.
arXiv Detail & Related papers (2021-07-27T06:51:13Z)
- What is being transferred in transfer learning? [51.6991244438545]
We show that when training from pre-trained weights, the model stays in the same basin in the loss landscape, and different instances of such a model are similar in feature space and close in parameter space.
arXiv Detail & Related papers (2020-08-26T17:23:40Z)
- Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z)
- From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers [62.637055980148816]
Massively multilingual transformers pretrained with language modeling objectives have become a de facto default transfer paradigm for NLP.
We show that cross-lingual transfer via massively multilingual transformers is substantially less effective in resource-lean scenarios and for distant languages.
arXiv Detail & Related papers (2020-05-01T22:04:58Z)
- Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition [2.320417845168326]
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models.
The technique of pre-training on one task and then retraining on a new one is called transfer learning.
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
arXiv Detail & Related papers (2020-01-02T14:18:25Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of transfer learning techniques for NLP with a unified framework that converts every text-based language problem into a text-to-text format; a minimal sketch of this framing follows the list below.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)
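As referenced in the last entry, here is a minimal sketch of the text-to-text framing: different tasks are distinguished only by a textual task prefix, and a single model maps input text to generated output text. It assumes the Hugging Face transformers library and the public t5-small checkpoint; the prompts are illustrative and not taken from any of the papers above.
```python
# Minimal sketch of the unified text-to-text format: each task is expressed as
# "prefix: input text" -> generated output text, using one encoder-decoder model.
# Assumption: Hugging Face transformers with the public "t5-small" checkpoint
# (the T5 tokenizer additionally requires the sentencepiece package).
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

prompts = [
    # Translation, cast as text-to-text via a task prefix.
    "translate English to German: The house is wonderful.",
    # Summarisation, handled by the same model with a different prefix.
    "summarize: Transfer learning pre-trains a model on a data-rich task and "
    "fine-tunes it on a downstream task, and it has become a standard "
    "technique in natural language processing.",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```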
This list is automatically generated from the titles and abstracts of the papers in this site.