Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
- URL: http://arxiv.org/abs/2110.06273v1
- Date: Tue, 12 Oct 2021 18:46:43 GMT
- Title: Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
- Authors: Tosin Adewumi, Nosheen Abid, Maryam Pahlavan, Rickard Brännvall, Sana Sabah Sabry, Foteini Liwicki and Marcus Liwicki
- Abstract summary: State-of-the-art models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English.
This work investigates, by an empirical study, the potential for transfer learning of such models to the Swedish language.
- Score: 1.6111818380407035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Building open-domain conversational systems (or chatbots) that produce
convincing responses is a recognized challenge. Recent state-of-the-art (SoTA)
transformer-based models for the generation of natural language dialogue have
demonstrated impressive performance in simulating human-like, single-turn
conversations in English. This work investigates, by an empirical study, the
potential for transfer learning of such models to the Swedish language. DialoGPT,
an English language pre-trained model, is adapted by training on three
different Swedish language conversational datasets obtained from publicly
available sources. Perplexity score (an automated intrinsic language model
metric) and human evaluation surveys were used to assess the performance of
the fine-tuned models, with results that indicate that the capacity for
transfer learning can be exploited with considerable success. Human evaluators
asked to score the simulated dialogue judged over 57% of the chatbot responses
to be human-like for the model trained on the largest (Swedish) dataset. We
provide the demos and model checkpoints of our English and Swedish chatbots on
the HuggingFace platform for public use.
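
As an illustration of the workflow the abstract describes, the sketch below uses the HuggingFace Transformers library to generate a single-turn reply with a DialoGPT-style checkpoint and to compute perplexity as the exponential of the mean token cross-entropy. The base model ID "microsoft/DialoGPT-medium" is the real English model; the Swedish prompt, the sample sentence, and the idea of swapping in one of the authors' fine-tuned checkpoints are placeholders, since the exact checkpoint IDs are not listed here.

# Minimal sketch, assuming the HuggingFace Transformers and PyTorch APIs.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Real English base model; one of the authors' Swedish checkpoints would be
# substituted here (their exact model IDs are not given in this summary).
model_name = "microsoft/DialoGPT-medium"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# DialoGPT represents a conversation as turns joined by the EOS token.
prompt = "Hej, hur mår du?" + tokenizer.eos_token  # placeholder Swedish turn
input_ids = tokenizer.encode(prompt, return_tensors="pt")
reply_ids = model.generate(
    input_ids,
    max_length=128,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    top_p=0.92,
)
# Decode only the newly generated tokens (the model's reply).
print(tokenizer.decode(reply_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))

# Perplexity of a held-out utterance: exp(mean token cross-entropy).
text = "Det var en gång en katt som hette Nils."  # placeholder sentence
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    loss = model(input_ids=enc.input_ids, labels=enc.input_ids).loss
print(f"perplexity: {torch.exp(loss).item():.2f}")
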
Related papers
- Evaluating Large Language Models with Human Feedback: Establishing a Swedish Benchmark [0.0]
Large language models (LLMs) have demonstrated significant capabilities across numerous applications.
This study introduces a comprehensive human benchmark to assess the efficacy of prominent LLMs in understanding and generating Swedish language texts.
arXiv Detail & Related papers (2024-05-22T21:22:51Z)
- Can Language Models Learn to Listen? [96.01685069483025]
We present a framework for generating appropriate facial responses from a listener in dyadic social interactions based on the speaker's words.
Our approach autoregressively predicts a listener's response: a sequence of facial gestures, quantized using a VQ-VAE.
We show that our generated listener motion is fluent and reflective of language semantics through quantitative metrics and a qualitative user study.
arXiv Detail & Related papers (2023-08-21T17:59:02Z)
- Chain of Hindsight Aligns Language Models with Feedback [62.68665658130472]
We propose a novel technique, Chain of Hindsight, that is easy to optimize and can learn from any form of feedback, regardless of its polarity.
We convert all types of feedback into sequences of sentences, which are then used to fine-tune the model.
By doing so, the model is trained to generate outputs based on feedback, while learning to identify and correct negative attributes or errors.
arXiv Detail & Related papers (2023-02-06T10:28:16Z)
- Human Language Modeling [20.66485974271458]
We propose a hierarchical extension to the language modeling problem whereby a human level exists to connect sequences of documents (e.g., a user's social media messages).
We introduce HaRT, a large-scale transformer model for the HuLM task, pre-trained on data from approximately 100,000 social media users.
Results on all tasks meet or surpass the current state-of-the-art.
arXiv Detail & Related papers (2022-05-10T19:11:12Z)
- Training Language Models with Natural Language Feedback [51.36137482891037]
We learn from language feedback on model outputs using a three-step learning algorithm.
In synthetic experiments, we first evaluate whether language models accurately incorporate feedback to produce refinements.
Using only 100 samples of human-written feedback, our learning algorithm fine-tunes a GPT-3 model to roughly human-level summarization.
arXiv Detail & Related papers (2022-04-29T15:06:58Z)
- Building a Swedish Open-Domain Conversational Language Model [0.0]
We present ongoing work on evaluating what is, to our knowledge, the first large generative language model trained to converse in Swedish.
We conduct a human evaluation pilot study that indicates the model is often able to respond to conversations in both a human-like and informative manner.
arXiv Detail & Related papers (2021-04-12T08:18:48Z)
- A Visuospatial Dataset for Naturalistic Verb Learning [18.654373173232205]
We introduce a new dataset for training and evaluating grounded language models.
Our data is collected within a virtual reality environment and is designed to emulate the quality of language data to which a pre-verbal child is likely to have access.
We use the collected data to compare several distributional semantics models for verb learning.
arXiv Detail & Related papers (2020-10-28T20:47:13Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work shows a comparison of a neural model and character language models with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero initial training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- The Adapter-Bot: All-In-One Controllable Conversational Model [66.48164003532484]
We propose a dialogue model that uses a fixed backbone model such as DialoGPT and triggers on-demand dialogue skills via different adapters.
Depending on the skills, the model is able to process multiple knowledge types, such as text, tables, and empathetic responses.
We evaluate our model using automatic evaluation by comparing it with existing state-of-the-art conversational models.
arXiv Detail & Related papers (2020-08-28T10:59:31Z)
- XPersona: Evaluating Multilingual Personalized Chatbot [76.00426517401894]
We propose a multi-lingual extension of Persona-Chat, namely XPersona.
Our dataset includes persona conversations in six different languages other than English for building and evaluating multilingual personalized agents.
arXiv Detail & Related papers (2020-03-17T07:52:08Z)