Few-Shot Structured Policy Learning for Multi-Domain and Multi-Task
Dialogues
- URL: http://arxiv.org/abs/2302.11199v1
- Date: Wed, 22 Feb 2023 08:18:49 GMT
- Authors: Thibault Cordier, Tanguy Urvoy, Fabrice Lefevre, and Lina M. Rojas-Barahona
- Abstract summary: Graph neural networks (GNNs) show a remarkable superiority by reaching a success rate above 80% with only 50 dialogues, when learning from simulated experts.
We suggest concentrating future research efforts on bridging the gap between human data, simulators and automatic evaluators in dialogue frameworks.
- Score: 0.716879432974126
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reinforcement learning has been widely adopted to model dialogue managers in
task-oriented dialogues. However, the user simulators provided by
state-of-the-art dialogue frameworks are only rough approximations of human
behaviour. The ability to learn from a small number of human interactions is
hence crucial, especially on multi-domain and multi-task environments where the
action space is large. We therefore propose to use structured policies to
improve sample efficiency when learning on these kinds of environments. We also
evaluate the impact of learning from human vs simulated experts. Among the
different levels of structure that we tested, the graph neural networks (GNNs)
show a remarkable superiority by reaching a success rate above 80% with only 50
dialogues, when learning from simulated experts. They also show superiority
when learning from human experts, although a performance drop was observed,
indicating a possible difficulty in capturing the variability of human
strategies. We therefore suggest concentrating future research efforts on
bridging the gap between human data, simulators and automatic evaluators in
dialogue frameworks.
Related papers
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
(2024-07-17T08:13:22Z)
- PK-ICR: Persona-Knowledge Interactive Context Retrieval for Grounded Dialogue [21.266410719325208]
Persona and Knowledge Dual Context Identification is a task to identify persona and knowledge jointly for a given dialogue.
We develop a novel grounding retrieval method that utilizes all contexts of dialogue simultaneously.
(2023-02-13T20:27:26Z)
- Opportunities and Challenges in Neural Dialog Tutoring [54.07241332881601]
We rigorously analyze various generative language models on two dialog tutoring datasets for language learning.
We find that although current approaches can model tutoring in constrained learning scenarios, they perform poorly in less constrained scenarios.
Our human quality evaluation shows that both models and ground-truth annotations exhibit low performance in terms of equitable tutoring.
(2023-01-24T11:00:17Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
(2022-07-20T13:37:57Z)
- Retrieval Augmentation Reduces Hallucination in Conversation [49.35235945543833]
We explore the use of neural-retrieval-in-the-loop architectures for knowledge-grounded dialogue.
We show that our best models obtain state-of-the-art performance on two knowledge-grounded conversational tasks.
(2021-04-15T16:24:43Z)
- Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach [84.02388020258141]
We propose a new framework named ENIGMA for estimating human evaluation scores based on off-policy evaluation in reinforcement learning.
ENIGMA only requires a handful of pre-collected experience data, and therefore does not involve human interaction with the target policy during the evaluation.
Our experiments show that ENIGMA significantly outperforms existing methods in terms of correlation with human evaluation scores.
(2021-02-20T03:29:20Z)
- Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems [58.724629408229205]
We demonstrate how traditional supervised learning and a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods.
Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
(2020-09-21T12:04:18Z)
- Adaptive Dialog Policy Learning with Hindsight and User Modeling [10.088347529930129]
We develop the algorithm LHUA, which, for the first time, enables dialog agents to adaptively learn with hindsight from both simulated and real users.
Experimental results suggest that, in success rate and policy quality, LHUA outperforms competitive baselines from the literature.
(2020-05-07T07:43:43Z)
- Learning from Easy to Complex: Adaptive Multi-curricula Learning for Neural Dialogue Generation [40.49175137775255]
Current state-of-the-art neural dialogue systems are mainly data-driven and are trained on human-generated responses.
We propose an adaptive multi-curricula learning framework to schedule a committee of the organized curricula.
(2020-03-02T03:09:28Z)