Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation
- URL: http://arxiv.org/abs/2406.18460v1
- Date: Wed, 26 Jun 2024 16:10:53 GMT
- Title: Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation
- Authors: Ahmed Njifenjou, Virgile Sucal, Bassam Jabaian, Fabrice Lefèvre
- Abstract summary: Large Language Models (LLMs) are able to answer user queries, but in a one-way Q&A format rather than a true conversation.
Fine-tuning on dedicated datasets is the usual way to adapt their style and improve their conversational ability, but it is expensive and usually available for only a few languages.
In this study, we explore role-play zero-shot prompting as an efficient and cost-effective solution for open-domain conversation.
- Score: 1.7436854281619139
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, various methods have been proposed to create open-domain conversational agents with Large Language Models (LLMs). These models are able to answer user queries, but in a one-way Q&A format rather than a true conversation. Fine-tuning on particular datasets is the usual way to modify their style to increase conversational ability, but this is expensive and usually available for only a few languages. In this study, we explore role-play zero-shot prompting as an efficient and cost-effective solution for open-domain conversation, using capable multilingual LLMs (Beeching et al., 2023) trained to obey instructions. We design a prompting system that, when combined with an instruction-following model - here Vicuna (Chiang et al., 2023) - produces conversational agents that match and even surpass fine-tuned models in human evaluation in French on two different tasks.
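As a rough illustration of the approach, the sketch below builds a role-play zero-shot prompt in Vicuna's USER/ASSISTANT conversation format. The persona wording and the `generate` stub are illustrative assumptions, not the paper's actual prompting system.

```python
# Minimal sketch of role-play zero-shot prompting in a Vicuna-style
# conversation template. Persona text and the `generate` stub are
# illustrative assumptions.

SYSTEM = (
    "You are Camille, a friendly French speaker who loves cinema. "
    "Stay in character and keep the conversation going with follow-up "
    "questions instead of answering like a Q&A assistant."
)

def build_prompt(history: list[tuple[str, str]], user_msg: str) -> str:
    """Render the persona and dialogue history as one Vicuna-style prompt."""
    lines = [SYSTEM]
    for user, assistant in history:
        lines.append(f"USER: {user}")
        lines.append(f"ASSISTANT: {assistant}")
    lines.append(f"USER: {user_msg}")
    lines.append("ASSISTANT:")  # the model completes from here
    return "\n".join(lines)

def generate(prompt: str) -> str:
    """Stand-in for a call to an instruction-following LLM such as Vicuna."""
    raise NotImplementedError("plug in your model or API client here")

history: list[tuple[str, str]] = []
print(build_prompt(history, "Salut ! Tu as vu un bon film récemment ?"))
```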
Related papers
- Open-Source Large Language Models as Multilingual Crowdworkers: Synthesizing Open-Domain Dialogues in Several Languages With No Examples in Targets and No Machine Translation [1.7436854281619139]
We introduce a pipeline for generating Open-Domain Dialogue data in multiple Target Languages using Large Language Models.
To enhance the openness of generated dialogues and mimic real-life scenarios, we add the notion of speech events, corresponding to the type of conversation the speakers are involved in.
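A minimal sketch of what one step of such a pipeline could look like: a single instruction asks an LLM to write a dialogue directly in the target language, conditioned on a speech event. The template wording and event list are assumptions; only the pipeline idea comes from the abstract.

```python
# Sketch of a synthesis prompt for an open-domain dialogue in a target
# language, conditioned on a "speech event". Wording is an assumption.

SPEECH_EVENTS = ["small talk", "planning a trip", "giving advice", "debate"]

def synthesis_prompt(target_language: str, speech_event: str, n_turns: int = 8) -> str:
    return (
        f"Write a natural {n_turns}-turn dialogue entirely in {target_language} "
        f"between two speakers, A and B. The conversation is a '{speech_event}' "
        "speech event. Do not translate from English; write directly in the "
        "target language. Format each turn as 'A: ...' or 'B: ...'."
    )

print(synthesis_prompt("French", SPEECH_EVENTS[1]))
```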
arXiv Detail & Related papers (2025-03-05T12:52:14Z)
- Modeling Real-Time Interactive Conversations as Timed Diarized Transcripts [11.067252960486272]
We present a simple yet general method to simulate real-time interactive conversations using pretrained language models.
We demonstrate the promise of this method with two case studies: instant messenger dialogues and spoken conversations.
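A minimal sketch of a timed, diarized transcript representation, assuming simple field names; the core idea from the abstract is that each utterance carries a speaker label and a timestamp, so a pretrained LM can be prompted to continue the transcript.

```python
# Sketch of a timed, diarized transcript. Field names are assumptions.

from dataclasses import dataclass

@dataclass
class Utterance:
    t_seconds: float   # offset from conversation start
    speaker: str
    text: str

def render(transcript: list[Utterance]) -> str:
    """Serialize the transcript as plain text an LM can continue."""
    return "\n".join(f"[{u.t_seconds:06.1f}] {u.speaker}: {u.text}" for u in transcript)

convo = [
    Utterance(0.0, "Alice", "hey, you around?"),
    Utterance(4.2, "Bob", "yep, what's up?"),
]
print(render(convo))
# Prompting an LM with this text plus a new "[0012.5] Alice:" header asks
# it to generate the next message at that point in (simulated) real time.
```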
arXiv Detail & Related papers (2024-05-21T21:14:31Z)
- PolyLM: An Open Source Polyglot Large Language Model [57.64420154135178]
We present PolyLM, a multilingual large language model (LLM) trained on 640 billion (B) tokens, available in two model sizes: 1.7B and 13B.
To enhance its multilingual capabilities, we 1) integrate bilingual data into training data; and 2) adopt a curriculum learning strategy that increases the proportion of non-English data from 30% in the first stage to 60% in the final stage during pre-training.
Further, we propose a multilingual self-instruct method which automatically generates 132.7K diverse multilingual instructions for model fine-tuning.
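A small sketch of the staged data mixture described above, with the 30% and 60% non-English shares taken from the abstract; the two-pool sampler itself is an illustrative assumption.

```python
# Sketch of a staged curriculum over the pre-training data mixture.
# Stage -> share of non-English examples, per the abstract (30% -> 60%).

import random

STAGE_RATIO = {"first": 0.30, "final": 0.60}

def sample_batch(english_pool, non_english_pool, batch_size, stage="first"):
    """Draw a batch whose non-English share follows the stage schedule."""
    n_non_en = round(batch_size * STAGE_RATIO[stage])
    return (random.sample(non_english_pool, n_non_en)
            + random.sample(english_pool, batch_size - n_non_en))

en = [f"en_{i}" for i in range(100)]
xx = [f"xx_{i}" for i in range(100)]
batch = sample_batch(en, xx, 10, stage="final")
print(sum(d.startswith("xx") for d in batch))  # 6 of 10 non-English
```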
arXiv Detail & Related papers (2023-07-12T09:00:37Z)
- Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models [39.80729604768669]
We evaluate the ability of language models to act as one or more characters in multi-party conversations.
We find that our new dataset, MultiLIGHT, can help bring significant improvements in the group setting.
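A minimal sketch of prompting a single LM to speak as one named character in a group conversation, in the spirit of the multi-party setting; the speaker-tag format and character names are assumptions.

```python
# Sketch of multi-party prompting: the LM is asked for one named
# character's next message given the group-chat history.

def multiparty_prompt(history: list[str], next_speaker: str) -> str:
    header = (
        "The following is a group conversation between Ana, Bo, and Kim. "
        f"Continue it with {next_speaker}'s next message only."
    )
    return header + "\n" + "\n".join(history) + f"\n{next_speaker}:"

history = ["Ana: should we meet Friday?", "Bo: Friday works for me."]
print(multiparty_prompt(history, "Kim"))
```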
arXiv Detail & Related papers (2023-04-26T21:41:17Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel, large-scale multilingual conversation dataset for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
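A minimal prompt-tuning sketch in PyTorch, assuming a generic frozen backbone: a small set of trainable prompt vectors is prepended to the input embeddings, which is the general mechanism behind such alignment prompts. The dimensions and initialization here are illustrative, not the paper's configuration.

```python
# Sketch of soft prompt-tuning: only the prompt vectors are trainable;
# the backbone that produced `token_embeds` stays frozen.

import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    def __init__(self, n_prompt_tokens: int = 20, d_model: int = 768):
        super().__init__()
        # The only trainable parameters: one vector per prompt token.
        self.prompt = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, d_model) from a frozen backbone.
        batch = token_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, token_embeds], dim=1)

sp = SoftPrompt()
x = torch.randn(2, 10, 768)   # stand-in for frozen input embeddings
print(sp(x).shape)            # torch.Size([2, 30, 768])
```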
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Prompting for a conversation: How to control a dialog model? [9.268682116424518]
Dialog models are trained on large amounts of text, yet their responses need to be limited to the desired scope and style of a dialog agent.
Because the datasets used to achieve the former contain language that is not compatible with the latter, pre-trained dialog models are fine-tuned on smaller curated datasets.
In this paper we investigate if prompting can mitigate the above trade-off.
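A toy sketch of the prompting alternative: textual control prefixes select the desired scope and style without any fine-tuning. The prefix wording and attribute keys are assumptions, not taken from the paper.

```python
# Sketch of scope/style control via prompt prefixes instead of fine-tuning.

CONTROL_PREFIXES = {
    ("cooking", "formal"): "You are a courteous culinary assistant. Only discuss cooking.",
    ("cooking", "casual"): "You're a laid-back home cook. Keep it to food talk.",
}

def controlled_prompt(scope: str, style: str, user_msg: str) -> str:
    prefix = CONTROL_PREFIXES[(scope, style)]
    return f"{prefix}\nUser: {user_msg}\nAssistant:"

print(controlled_prompt("cooking", "casual", "Any tips for crispy fries?"))
```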
arXiv Detail & Related papers (2022-09-22T14:59:55Z)
- Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt [98.26682501616024]
We propose UniPrompt, a novel model that uses a single unified prompt for all languages.
The unified prompt is computed by a multilingual PLM to produce language-independent representations.
Our method significantly outperforms strong baselines across different languages.
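A rough sketch of the idea, assuming a tiny stand-in encoder: one shared template is passed through a multilingual encoder to compute prompt vectors that are then reused unchanged for inputs in any language.

```python
# Sketch of a unified, model-computed prompt: the encoder below is a
# small stand-in for a multilingual PLM.

import torch
import torch.nn as nn

d_model = 256
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=1,
)

template_embeds = torch.randn(1, 8, d_model)   # one shared template, 8 tokens
unified_prompt = encoder(template_embeds)      # language-independent prompt vectors

def with_prompt(input_embeds: torch.Tensor) -> torch.Tensor:
    """Prepend the same computed prompt to inputs from any language."""
    prompt = unified_prompt.expand(input_embeds.size(0), -1, -1)
    return torch.cat([prompt, input_embeds], dim=1)

print(with_prompt(torch.randn(2, 12, d_model)).shape)  # torch.Size([2, 20, 256])
```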
arXiv Detail & Related papers (2022-02-23T11:57:52Z)
- Plug-and-Play Conversational Models [62.77150879036442]
We introduce an approach that requires no extra computation at decoding time and no fine-tuning of a large language model.
We demonstrate, through extensive automatic and human evaluation, a high degree of control over the generated conversational responses with regard to multiple desired attributes.
arXiv Detail & Related papers (2020-10-09T03:17:51Z)
- The Adapter-Bot: All-In-One Controllable Conversational Model [66.48164003532484]
We propose a dialogue model that uses a fixed backbone model, such as DialoGPT, and triggers on-demand dialogue skills via different adapters.
Depending on the skill, the model can process multiple knowledge types, such as text and tables, and produce empathetic responses.
We evaluate our model with automatic metrics, comparing it against existing state-of-the-art conversational models.
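A toy sketch of on-demand skill routing over a fixed backbone, in the spirit of the adapter design above; the skill names, router rule, and adapter stubs are all illustrative assumptions.

```python
# Sketch of skill routing: a router picks a skill and the matching
# adapter produces the response; a real system would learn both.

from typing import Callable

def weather_adapter(context: str) -> str:
    return "It looks sunny this afternoon."  # stub for a table/API-grounded skill

def empathy_adapter(context: str) -> str:
    return "That sounds really tough. I'm sorry you're dealing with it."

SKILLS: dict[str, Callable[[str], str]] = {
    "weather": weather_adapter,
    "empathy": empathy_adapter,
}

def route(user_msg: str) -> str:
    """Toy router: a trained skill classifier would go here."""
    return "weather" if "weather" in user_msg.lower() else "empathy"

msg = "What's the weather like today?"
print(SKILLS[route(msg)](msg))
```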
arXiv Detail & Related papers (2020-08-28T10:59:31Z)
- Efficient Deployment of Conversational Natural Language Interfaces over Databases [45.52672694140881]
We propose a novel method for accelerating the collection of training datasets used to develop natural-language-to-query-language machine learning models.
Our system generates conversational multi-turn data, where multiple turns define a dialogue session.
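A minimal sketch of the kind of record such a system would produce, assuming SQL targets and simple field names: a session is a list of turns, each pairing a user utterance with the query it should map to.

```python
# Sketch of a multi-turn NL-to-query training record. Field names and
# the SQL targets are assumptions.

from dataclasses import dataclass

@dataclass
class Turn:
    utterance: str   # what the user said
    query: str       # the query it should map to

session = [
    Turn("show me flights to Lyon", "SELECT * FROM flights WHERE dest = 'LYS'"),
    Turn("only the morning ones",
         "SELECT * FROM flights WHERE dest = 'LYS' AND dep_hour < 12"),
]
for i, turn in enumerate(session, 1):
    print(f"turn {i}: {turn.utterance!r} -> {turn.query}")
```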
arXiv Detail & Related papers (2020-05-31T19:16:27Z)
- XPersona: Evaluating Multilingual Personalized Chatbot [76.00426517401894]
We propose a multilingual extension of Persona-Chat, namely XPersona.
Our dataset includes persona conversations in six different languages other than English for building and evaluating multilingual personalized agents.
arXiv Detail & Related papers (2020-03-17T07:52:08Z)