Transfer Learning for British Sign Language Modelling
- URL: http://arxiv.org/abs/2006.02144v1
- Date: Wed, 3 Jun 2020 10:13:29 GMT
- Title: Transfer Learning for British Sign Language Modelling
- Authors: Boris Mocialov, Graham Turner, Helen Hastie
- Abstract summary: Research in minority languages, including sign languages, is hampered by the severe lack of data.
This has led to work on transfer learning methods, whereby a model developed for one language is reused as the starting point for a model on a second language.
In this paper, we examine two transfer learning techniques of fine-tuning and layer substitution for language modelling of British Sign Language.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Automatic speech recognition and spoken dialogue systems have made great
advances through the use of deep machine learning methods. This is partly due
to greater computing power but also through the large amount of data available
in common languages, such as English. Conversely, research in minority
languages, including sign languages, is hampered by the severe lack of data.
This has led to work on transfer learning methods, whereby a model developed
for one language is reused as the starting point for a model on a second
language, which is less resourced. In this paper, we examine two transfer
learning techniques of fine-tuning and layer substitution for language
modelling of British Sign Language. Our results show improvement in perplexity
when using transfer learning with standard stacked LSTM models, trained
initially using a large corpus for standard English from the Penn Treebank
corpus.
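The pipeline described in the abstract (pretrain a stacked LSTM language model on Penn Treebank English, then adapt it to low-resource BSL data by fine-tuning or by layer substitution) can be illustrated with a minimal PyTorch sketch. This is not the authors' code: the model sizes, vocabularies, and dummy batches are placeholders, and "layer substitution" is read here as swapping the vocabulary-specific embedding and output layers while freezing the pretrained LSTM stack, which is only one plausible interpretation of the technique.
```python
# A minimal sketch (assumed, not the authors' released code) of the transfer
# pipeline in the abstract: pretrain a stacked LSTM language model on a large
# English corpus (Penn Treebank), then adapt it to low-resource BSL data either
# by fine-tuning or by substituting the vocabulary-specific layers.
import torch
import torch.nn as nn

EMB, HID, LAYERS = 200, 200, 2            # assumed sizes, not from the paper
SRC_VOCAB, TGT_VOCAB = 10_000, 2_000      # placeholder vocabulary sizes


class StackedLSTMLM(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, EMB)
        self.lstm = nn.LSTM(EMB, HID, LAYERS, batch_first=True)
        self.out = nn.Linear(HID, vocab_size)

    def forward(self, tokens):                     # tokens: (batch, seq_len)
        hidden, _ = self.lstm(self.embed(tokens))  # (batch, seq_len, HID)
        return self.out(hidden)                    # logits over the vocabulary


def lm_loss(model, tokens):
    """Next-token cross-entropy: inputs are tokens[:, :-1], targets tokens[:, 1:]."""
    logits = model(tokens[:, :-1])
    return nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1)
    )


# 1) Pretrain on the source language (dummy batch standing in for Penn Treebank).
model = StackedLSTMLM(SRC_VOCAB)
opt = torch.optim.SGD(model.parameters(), lr=1.0)
src_batch = torch.randint(0, SRC_VOCAB, (32, 35))
opt.zero_grad()
lm_loss(model, src_batch).backward()
opt.step()

# 2a) Fine-tuning: keep every pretrained weight and continue training on the
#     target data (assumes the target reuses the source vocabulary indexing).
tgt_batch = torch.randint(0, TGT_VOCAB, (32, 35))  # dummy BSL batch
opt.zero_grad()
lm_loss(model, tgt_batch).backward()
opt.step()

# 2b) Layer substitution (one plausible reading): replace the vocabulary-specific
#     embedding and output layers for the target language and freeze the
#     pretrained recurrent stack, so only the new layers are trained.
model.embed = nn.Embedding(TGT_VOCAB, EMB)
model.out = nn.Linear(HID, TGT_VOCAB)
for p in model.lstm.parameters():
    p.requires_grad = False
opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=1.0)
opt.zero_grad()
lm_loss(model, tgt_batch).backward()
opt.step()

# Perplexity (the metric reported in the paper) is exp(cross-entropy).
print(torch.exp(lm_loss(model, tgt_batch)).item())
```
In a real setup the dummy batches would be replaced by tokenised Penn Treebank text and BSL sequences, and perplexity would be compared on held-out target data with and without transfer.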
Related papers
- LEIA: Facilitating Cross-lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation [21.980770995466134]
We introduce LEIA, a language adaptation tuning method that utilizes Wikipedia entity names aligned across languages.
This method involves augmenting the target language corpus with English entity names and training the model using left-to-right language modeling.
arXiv Detail & Related papers (2024-02-18T07:24:34Z)
- Continual Learning Under Language Shift [6.0783165755651325]
We study the pros and cons of updating a language model when new data comes from new languages.
We investigate how forward and backward transfer effects depend on pre-training order and characteristics of languages.
arXiv Detail & Related papers (2023-11-02T12:54:50Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning [0.7612676127275795]
Most Transformer language models are pretrained on English text.
As model sizes grow, the performance gap between English and other languages increases even further.
We introduce a cross-lingual and progressive transfer learning approach, called CLP-Transfer.
arXiv Detail & Related papers (2023-01-23T18:56:12Z)
- Language Contamination Explains the Cross-lingual Capabilities of English Pretrained Models [79.38278330678965]
We find that common English pretraining corpora contain significant amounts of non-English text.
This leads to hundreds of millions of foreign language tokens in large-scale datasets.
We then demonstrate that even these small percentages of non-English data facilitate cross-lingual transfer for models trained on them.
arXiv Detail & Related papers (2022-04-17T23:56:54Z)
- WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models [3.6878069324996616]
We introduce a method -- called WECHSEL -- to transfer English models to new languages.
We use WECHSEL to transfer GPT-2 and RoBERTa models to 4 other languages.
arXiv Detail & Related papers (2021-12-13T12:26:02Z)
- Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
- Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation [127.81351683335143]
Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that shortcomings in this setting arise because the representations are not sufficiently aligned.
In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
arXiv Detail & Related papers (2021-03-18T21:17:58Z)
- When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models [2.457872341625575]
Transfer learning based on pretraining language models on a large amount of raw data has become a new norm to reach state-of-the-art performance in NLP.
We show that such models behave in multiple ways on unseen languages.
arXiv Detail & Related papers (2020-10-24T10:15:03Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models with varying amounts of target language data.
Our usage scenario is interactive correction with nearly zero amounts of training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Meta-Transfer Learning for Code-Switched Speech Recognition [72.84247387728999]
We propose a new learning method, meta-transfer learning, to transfer learn on a code-switched speech recognition system in a low-resource setting.
Our model learns to recognize individual languages, and transfer them so as to better recognize mixed-language speech by conditioning the optimization on the code-switching data.
arXiv Detail & Related papers (2020-04-29T14:27:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.