A Mixture-of-Expert Approach to RL-based Dialogue Management
- URL: http://arxiv.org/abs/2206.00059v1
- Date: Tue, 31 May 2022 19:00:41 GMT
- Title: A Mixture-of-Expert Approach to RL-based Dialogue Management
- Authors: Yinlam Chow and Aza Tulepbergenov and Ofir Nachum and MoonKyung Ryu and Mohammad Ghavamzadeh and Craig Boutilier
- Abstract summary: We use reinforcement learning to develop a dialogue agent that avoids being short-sighted (outputting generic utterances) and maximizes overall user satisfaction.
Most existing RL approaches to DM train the agent at the word level and thus have to deal with a combinatorially complex action space even for a medium-size vocabulary.
We develop an RL-based DM using a novel mixture-of-expert language model (MoE-LM) that consists of (i) an LM capable of learning diverse semantics for conversation histories, (ii) a number of specialized LMs (or experts) capable of generating utterances corresponding to a particular attribute or personality, and (iii) an RL-based DM that performs dialogue planning with the utterances generated by the experts.
- Score: 56.08449336469477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite recent advancements in language models (LMs), their application to dialogue management (DM) problems and ability to carry on rich conversations remain a challenge. We use reinforcement learning (RL) to develop a dialogue agent that avoids being short-sighted (outputting generic utterances) and maximizes overall user satisfaction. Most existing RL approaches to DM train the agent at the word level and thus have to deal with a combinatorially complex action space even for a medium-size vocabulary. As a result, they struggle to produce a successful and engaging dialogue even if they are warm-started with a pre-trained LM. To address this issue, we develop an RL-based DM using a novel mixture-of-expert language model (MoE-LM) that consists of (i) an LM capable of learning diverse semantics for conversation histories, (ii) a number of specialized LMs (or experts) capable of generating utterances corresponding to a particular attribute or personality, and (iii) an RL-based DM that performs dialogue planning with the utterances generated by the experts. Our MoE approach provides greater flexibility to generate sensible utterances with different intents and allows RL to focus on conversation-level DM. We compare it with SOTA baselines on open-domain dialogues and demonstrate its effectiveness both in terms of the diversity and sensibility of the generated utterances and the overall DM performance.
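To make the pipeline concrete, here is a minimal sketch of the utterance-level decision loop the abstract describes. Everything below (the class names, the three expert attributes, and the random Q-value stub) is an illustrative assumption, not the paper's actual implementation.

```python
import random
from typing import List

class ExpertLM:
    """Stand-in for one specialized expert LM tuned to a single
    attribute or personality (component (ii) in the abstract)."""
    def __init__(self, attribute: str):
        self.attribute = attribute

    def generate(self, history: List[str]) -> str:
        # A real expert would decode from a fine-tuned LM conditioned on
        # the shared semantic representation (component (i)).
        return f"[{self.attribute}] reply to: {history[-1]}"

class DialogueManager:
    """RL-based DM (component (iii)): its action space is the small set
    of expert-generated candidate utterances, not individual words."""
    def q_value(self, history: List[str], utterance: str) -> float:
        # Placeholder for a learned Q-function over (history, utterance).
        return random.random()

    def act(self, history: List[str], candidates: List[str]) -> str:
        return max(candidates, key=lambda u: self.q_value(history, u))

experts = [ExpertLM(a) for a in ("empathetic", "inquisitive", "humorous")]
dm = DialogueManager()
history = ["User: I had a rough day at work."]
candidates = [e.generate(history) for e in experts]
print(dm.act(history, candidates))
```

Because the DM scores only a handful of candidate utterances per turn, RL can plan at the conversation level rather than over token-by-token generation.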
Related papers
- Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion.
We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations.
Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
- StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation [43.29667566560533]
We introduce StyleEval, a stylized dialogue dataset with 38 styles, built by leveraging the generative power of Large Language Models (LLMs).
We propose StyleChat, a stylized dialogue framework that uses a recitation-augmented memory strategy and multi-task style learning to promote generalization.
arXiv Detail & Related papers (2024-03-18T03:26:18Z)
- LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs).
Our benchmark consists of 8 different language tasks that require multiple rounds of language interaction, spanning open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z)
- Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z)
- BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues [72.65163468440434]
This report provides a preliminary evaluation of existing large language models for human-style multi-turn chatting.
We prompt large language models (LLMs) to generate a full multi-turn dialogue based on the ChatSEED, utterance by utterance.
We find that GPT-4 can generate human-style multi-turn dialogues of impressive quality, significantly outperforming its counterparts.
arXiv Detail & Related papers (2023-10-20T16:53:51Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks (a rough sketch follows this entry).
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
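As a rough sketch of how such a prompt might be assembled (the function and wording below are hypothetical illustrations, not the paper's actual template), the model is first asked to explain each utterance and only then to carry out the task:

```python
from typing import List

def self_explanation_prompt(dialogue: List[str], task: str) -> str:
    """Build a zero-shot prompt that asks the model to explain every
    utterance before executing the downstream task (hypothetical wording)."""
    turns = "\n".join(dialogue)
    return (
        "Dialogue:\n"
        f"{turns}\n\n"
        "First, explain the meaning and intent of each utterance above, "
        "one by one. Then, using your explanations, complete this task: "
        f"{task}"
    )

print(self_explanation_prompt(
    ["User: I need to change my flight.", "Agent: Sure, which booking?"],
    "Summarize the user's goal.",
))
```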
- Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management [36.254564021059515]
Reinforcement learning (RL) has shown great promise for developing dialogue management (DM) agents that are non-myopic.
We develop a variety of RL algorithms, specialized to dialogue planning, that leverage recent Mixture-of-Expert Language Models (MoE-LMs).
By exploiting MoE-LM structure, our methods significantly reduce the size of the action space and improve the efficacy of RL-based DM (see the back-of-the-envelope comparison after this entry).
arXiv Detail & Related papers (2023-02-21T18:02:20Z)
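To see why choosing among expert utterances shrinks the action space, here is a back-of-the-envelope comparison; the vocabulary size, length budget, and expert count are made-up illustrative numbers, not figures from either paper.

```python
# Word-level RL: the agent picks every token, so the number of possible
# utterances grows combinatorially with length (illustrative numbers only).
vocab_size = 20_000   # medium-size vocabulary (assumed)
max_tokens = 20       # utterance length budget (assumed)
word_level_actions = vocab_size ** max_tokens

# Utterance-level RL with a MoE-LM: the DM only chooses among the
# candidate utterances proposed by a handful of experts.
num_experts = 8       # assumed number of specialized experts
moe_actions = num_experts

print(f"word-level:    ~{word_level_actions:.2e} possible action sequences")
print(f"MoE utterance: {moe_actions} candidate actions per turn")
```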
- Integrating Pre-trained Model into Rule-based Dialogue Management [32.90885176553305]
Rule-based dialogue management is still the most popular solution for industrial task-oriented dialogue systems.
Data-driven dialogue systems, usually with end-to-end structures, are popular in academic research.
We propose a method to leverage the strength of both rule-based and data-driven dialogue managers.
arXiv Detail & Related papers (2021-02-17T03:44:22Z)