Interacting with Non-Cooperative User: A New Paradigm for Proactive
Dialogue Policy
- URL: http://arxiv.org/abs/2204.07433v1
- Date: Thu, 7 Apr 2022 14:11:31 GMT
- Title: Interacting with Non-Cooperative User: A New Paradigm for Proactive
Dialogue Policy
- Authors: Wenqiang Lei, Yao Zhang, Feifan Song, Hongru Liang, Jiaxin Mao,
Jiancheng Lv, Zhenglu Yang and Tat-Seng Chua
- Abstract summary: We propose a new solution named I-Pro that can learn Proactive policy in the Interactive setting.
Specifically, we learn the trade-off via a learned goal weight, which consists of four factors.
The experimental results demonstrate I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.
- Score: 83.61404191470126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A proactive dialogue system is able to lead the conversation to a goal topic
and has promising potential in bargaining, persuasion, and negotiation. The current
corpus-based learning approach limits its practical application in real-world
scenarios. To this end, we advance the study of proactive
dialogue policy to a more natural and challenging setting, i.e., interacting
dynamically with users. Further, we call attention to non-cooperative user
behavior: the user talks about off-path topics when he/she is not satisfied
with the previous topics introduced by the agent. We argue that the targets of
reaching the goal topic quickly and maintaining high user satisfaction do
not always converge, because the topics close to the goal and the topics the user
prefers may not be the same. To address this issue, we propose a new solution
named I-Pro that can learn Proactive policy in the Interactive setting.
Specifically, we learn the trade-off via a learned goal weight, which consists
of four factors (dialogue turn, goal completion difficulty, user satisfaction
estimation, and cooperative degree). The experimental results demonstrate that I-Pro
significantly outperforms baselines in terms of effectiveness and
interpretability.
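The trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's formulation: the abstract only names the four factors, so the sigmoid combination, the `[0, 1]` scaling, the coefficient values, and the names `Factors`, `goal_weight`, and `pick_topic` are all assumptions made for illustration.

```python
import math
from dataclasses import astuple, dataclass


@dataclass
class Factors:
    """The four factors named in the abstract.

    Scaling each factor to [0, 1] is an assumption of this sketch.
    """
    dialogue_turn: float
    goal_difficulty: float
    user_satisfaction: float
    cooperative_degree: float


def goal_weight(factors, coeffs, bias=0.0):
    """Map the four factors to a weight in (0, 1).

    A sigmoid over a linear combination is a stand-in here; the
    coefficients would be learned, and the paper may combine the
    factors differently.
    """
    z = bias + sum(c * x for c, x in zip(coeffs, astuple(factors)))
    return 1.0 / (1.0 + math.exp(-z))


def pick_topic(candidates, weight):
    """Choose the topic with the best weighted score.

    candidates maps a topic name to a hypothetical pair
    (closeness_to_goal, user_preference), both in [0, 1].
    """
    def score(topic):
        closeness, preference = candidates[topic]
        return weight * closeness + (1.0 - weight) * preference

    return max(candidates, key=score)


# Hypothetical candidates: one near the goal, one the user likes.
candidates = {"near_goal": (0.9, 0.2), "user_favourite": (0.3, 0.9)}
```

With a high goal weight the sketch prefers the topic near the goal, and with a low one it defers to the user's preferred topic, which is the tension between reaching the goal quickly and keeping the user satisfied that the abstract describes.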
Related papers
- Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training [33.57497419019826]
Action-Based Contrastive Self-Training allows for sample-efficient dialogue policy learning in multi-turn conversation.
ACT demonstrates substantial conversation modeling improvements over standard approaches to supervised fine-tuning and DPO.
arXiv Detail & Related papers (2024-05-31T22:44:48Z)
- An Analysis of User Behaviors for Objectively Evaluating Spoken Dialogue Systems [26.003947740875482]
We investigate the relationship between user behaviors and subjective evaluation scores in social dialogue tasks.
The results reveal that in dialogue tasks where user utterances are primary, like attentive listening and job interview, indicators like the number of utterances and words play a significant role in evaluation.
arXiv Detail & Related papers (2024-01-10T01:02:26Z)
- Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z)
- A Survey on Proactive Dialogue Systems: Problems, Methods, and Prospects [100.75759050696355]
We provide a comprehensive overview of the prominent problems and advanced designs for conversational agents' proactivity in different types of dialogues.
We discuss challenges that meet the real-world application needs but require a greater research focus in the future.
arXiv Detail & Related papers (2023-05-04T11:38:49Z)
- User Satisfaction Estimation with Sequential Dialogue Act Modeling in Goal-oriented Conversational Systems [65.88679683468143]
We propose a novel framework, namely USDA, to incorporate the sequential dynamics of dialogue acts for predicting user satisfaction.
USDA incorporates the sequential transitions of both content and act features in the dialogue to predict the user satisfaction.
Experimental results on four benchmark goal-oriented dialogue datasets show that the proposed method substantially and consistently outperforms existing methods on USE.
arXiv Detail & Related papers (2022-02-07T02:50:07Z)
- What Does The User Want? Information Gain for Hierarchical Dialogue Policy Optimisation [3.1433893853959605]
Optimisation via reinforcement learning (RL) is susceptible to sample inefficiency and instability.
We propose the usage of an intrinsic reward based on information gain to address this issue.
Our algorithm, which we call FeudalGain, achieves state-of-the-art results in most environments of the PyDial framework.
arXiv Detail & Related papers (2021-09-15T07:21:26Z)
- Optimizing Interactive Systems via Data-Driven Objectives [70.3578528542663]
We propose an approach that infers the objective directly from observed user interactions.
These inferences can be made regardless of prior knowledge and across different types of user behavior.
We introduce Interactive System Optimizer (ISO), a novel algorithm that uses these inferred objectives for optimization.
arXiv Detail & Related papers (2020-06-19T20:49:14Z)
- Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation [79.7781436501706]
We propose a structured approach that controls the intended content of system responses by introducing coarse-grained keywords.
We also propose a novel dual discourse-level target-guided strategy to guide conversations to reach their goals smoothly with higher success rate.
arXiv Detail & Related papers (2020-02-04T09:49:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.