Learning Natural Language Generation from Scratch
- URL: http://arxiv.org/abs/2109.09371v1
- Date: Mon, 20 Sep 2021 08:46:51 GMT
- Title: Learning Natural Language Generation from Scratch
- Authors: Alice Martin Donati (X-DEP-MATHAPP), Guillaume Quispe, Charles Ollion,
Sylvain Le Corff, Florian Strub, Olivier Pietquin
- Abstract summary: This paper introduces TRUncated ReinForcement Learning for Language (TrufLL).
It is an original approach to train conditional language models from scratch by only using reinforcement learning (RL).
- Score: 25.984828046001013
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces TRUncated ReinForcement Learning for Language (TrufLL),
an original approach to train conditional language models from scratch by only
using reinforcement learning (RL). As RL methods fail to scale to large
action spaces, we dynamically truncate the vocabulary space using a generic
language model. TrufLL thus makes it possible to train a language agent by solely
interacting with its environment, without any task-specific prior knowledge; it
is only guided by a task-agnostic language model. Interestingly, this approach
avoids the dependency on labelled datasets and inherently reduces pre-trained
policy flaws such as language or exposure biases. We evaluate TrufLL on two
visual question generation tasks, for which we report positive results on both
performance and language metrics, which we then corroborate with a human
evaluation. To our knowledge, it is the first approach that successfully learns
a language generation policy (almost) from scratch.
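The core mechanism is dynamic truncation of the action space: at each decoding step, a generic, task-agnostic language model scores the full vocabulary, and the RL policy only chooses among the tokens that the language model deems plausible. The sketch below is a minimal illustration of this idea, assuming a simple top-k truncation; the function names, the value of k, and the use of PyTorch are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def truncated_action_set(lm_logits: torch.Tensor, k: int = 50) -> torch.Tensor:
    """Keep the k tokens judged most likely by the generic (task-agnostic) LM.

    lm_logits: next-token logits of shape (vocab_size,).
    Returns the token ids forming the truncated action set for the RL policy.
    """
    return torch.topk(lm_logits, k).indices


def policy_step(policy_logits: torch.Tensor, allowed: torch.Tensor) -> int:
    """Sample a token id from the RL policy, restricted to the allowed set."""
    restricted = policy_logits[allowed]               # policy logits over the truncated set
    probs = F.softmax(restricted, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)  # index within the truncated set
    return int(allowed[choice])                       # map back to a full-vocabulary token id


if __name__ == "__main__":
    vocab_size = 10_000
    # Stand-ins for the generic LM's and the policy's next-token logits.
    lm_logits = torch.randn(vocab_size)
    policy_logits = torch.randn(vocab_size)

    allowed = truncated_action_set(lm_logits, k=50)
    token_id = policy_step(policy_logits, allowed)
    print(f"policy picked token {token_id} from {len(allowed)} allowed tokens")
```

Restricting each step to a few dozen candidate tokens instead of the full vocabulary keeps the RL action space tractable, while the generic language model, rather than labelled task data, decides which tokens remain available.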
Related papers
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z) - Offline RL for Natural Language Generation with Implicit Language Q
Learning [87.76695816348027]
Large language models can be inconsistent when it comes to completing user-specified tasks.
We propose a novel RL method, ILQL, that combines the flexible utility framework of RL with the capabilities of supervised learning.
In addition to empirically validating ILQL, we present a detailed empirical analysis of situations where offline RL can be useful in natural language generation settings.
arXiv Detail & Related papers (2022-06-05T18:38:42Z) - Persian Natural Language Inference: A Meta-learning approach [6.832341432995628]
This paper proposes a meta-learning approach for inferring natural language in Persian.
We evaluate the proposed method using four languages and an auxiliary task.
arXiv Detail & Related papers (2022-05-18T06:51:58Z) - Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis [87.75833205560406]
This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system.
It does not require pooled data from all languages altogether, and thus alleviates the storage and computation burden.
arXiv Detail & Related papers (2021-10-09T07:00:38Z) - Simple and Effective Zero-shot Cross-lingual Phoneme Recognition [46.76787843369816]
This paper extends previous work on zero-shot cross-lingual transfer learning by fine-tuning a multilingually pretrained wav2vec 2.0 model to transcribe unseen languages.
Experiments show that this simple method significantly outperforms prior work which introduced task-specific architectures.
arXiv Detail & Related papers (2021-09-23T22:50:32Z) - Improving the Lexical Ability of Pretrained Language Models for
Unsupervised Neural Machine Translation [127.81351683335143]
Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that performance suffers when these representations are not sufficiently aligned across languages.
In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
arXiv Detail & Related papers (2021-03-18T21:17:58Z) - Pre-Training a Language Model Without Human Language [74.11825654535895]
We study how the intrinsic nature of pre-training data contributes to the fine-tuned downstream performance.
We find that models pre-trained on unstructured data beat those trained directly from scratch on downstream tasks.
To our great astonishment, we uncover that pre-training on certain non-human language data gives GLUE performance close to that of models pre-trained on another non-English language.
arXiv Detail & Related papers (2020-12-22T13:38:06Z) - Learning Spoken Language Representations with Neural Lattice Language
Modeling [39.50831917042577]
We propose a framework that trains neural lattice language models to provide contextualized representations for spoken language understanding tasks.
The proposed two-stage pre-training approach reduces the demands of speech data and has better efficiency.
arXiv Detail & Related papers (2020-07-06T10:38:03Z) - Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models
via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models to downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
arXiv Detail & Related papers (2020-04-29T14:07:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.