A Generative User Simulator with GPT-based Architecture and Goal State
Tracking for Reinforced Multi-Domain Dialog Systems
- URL: http://arxiv.org/abs/2210.08692v2
- Date: Tue, 18 Oct 2022 06:41:39 GMT
- Title: A Generative User Simulator with GPT-based Architecture and Goal State
Tracking for Reinforced Multi-Domain Dialog Systems
- Authors: Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng
- Abstract summary: We propose a generative user simulator (GUS) with GPT-2 based architecture and goal state tracking.
The GUS achieves superior results in all three evaluation tasks.
- Score: 22.249113574918034
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Building user simulators (USs) for reinforcement learning (RL) of
task-oriented dialog systems (DSs) has gained more and more attention, which,
however, still faces several fundamental challenges. First, it is unclear
whether we can leverage pretrained language models to design, for example,
GPT-2 based USs, to catch up and interact with the recently advanced GPT-2
based DSs. Second, an important ingredient in a US is that the user goal can be
effectively incorporated and tracked; but how to flexibly integrate goal state
tracking and develop an end-to-end trainable US for multi-domains has remained
to be a challenge. In this work, we propose a generative user simulator (GUS)
with GPT-2 based architecture and goal state tracking towards addressing the
above two challenges. Extensive experiments are conducted on MultiWOZ2.1.
Different DSs are trained via RL with GUS, the classic agenda-based user
simulator (ABUS) and other ablation simulators respectively, and are compared
for cross-model evaluation, corpus-based evaluation and human evaluation. The
GUS achieves superior results in all three evaluation tasks.
Related papers
- Goal Alignment in LLM-Based User Simulators for Conversational AI [14.771856490513194]
User simulators are essential to conversational AI, enabling scalable agent development and evaluation through simulated interactions.
We introduce User Goal State Tracking (UGST), a novel framework that tracks user goal progression throughout conversations.
We present a three-stage methodology for developing user simulators that can autonomously track goal progression and reason to generate goal-aligned responses.
arXiv Detail & Related papers (2025-07-27T07:07:12Z)
- Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles [37.43150003866563]
User simulators are crucial for replicating human interactions with dialogue systems.
We propose User Simulator with implicit Profiles (USP), a framework that infers implicit user profiles from human-machine conversations.
USP outperforms strong baselines in terms of authenticity and diversity while achieving comparable performance in consistency.
arXiv Detail & Related papers (2025-02-26T09:26:54Z)
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems [2.788542465279969]
This paper introduces DAUS, a Domain-Aware User Simulator.
We fine-tune DAUS on real examples of task-oriented dialogues.
Results on two relevant benchmarks showcase significant improvements in terms of user goal fulfillment.
arXiv Detail & Related papers (2024-02-20T20:57:47Z)
- ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval [123.51277978744677]
We propose Prompting-to-Simulate (ProS) to apply prompt tuning for Universal Cross-Domain Retrieval (UCDR).
ProS employs a two-step process to simulate Content-aware Dynamic Prompts (CaDP) which can impact models to produce generalized features for UCDR.
Our method achieves new state-of-the-art performance without bringing excessive parameters.
arXiv Detail & Related papers (2023-12-19T14:39:11Z)
- User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue [10.336443286833145]
We propose a novel user simulator built using recently developed large pretrained language models (LLMs).
Unlike previous work, which sought to maximize goal success rate (GSR) as the primary metric of simulator performance, our goal is a system which achieves a GSR similar to that observed in human interactions with TOD systems.
arXiv Detail & Related papers (2023-09-23T02:04:57Z)
- End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches, to class-agnostic tracking that performs well also for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z)
- Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture [24.305558215176752]
Online reinforcement learning of a GPT-2 based dialog system (DS) and an end-to-end user simulator (US) has never been explored.
In this paper, we first propose Simplified Generative Architectures (SGA) for DS and US respectively, both based on GPT-2 but using shortened history.
Our DS with the proposed SGA, when only supervised trained, achieves state-of-the-art performance on MultiWOZ2.1 and is more compute-efficient in both training and generation.
arXiv Detail & Related papers (2022-10-13T03:57:17Z)
- Metaphorical User Simulators for Evaluating Task-oriented Dialogue Systems [80.77917437785773]
Task-oriented dialogue systems (TDSs) are assessed mainly in an offline setting or through human evaluation.
We propose a metaphorical user simulator for end-to-end TDS evaluation, where we define a simulator to be metaphorical if it simulates user's analogical thinking in interactions with systems.
We also propose a tester-based evaluation framework to generate variants, i.e., dialogue systems with different capabilities.
arXiv Detail & Related papers (2022-04-02T05:11:03Z)
- Unified Transformer Tracker for Object Tracking [58.65901124158068]
We present the Unified Transformer Tracker (UTT) to address tracking problems in different scenarios with one paradigm.
A track transformer is developed in our UTT to track the target in both Single Object Tracking (SOT) and Multiple Object Tracking (MOT).
arXiv Detail & Related papers (2022-03-29T01:38:49Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Variational Latent-State GPT for Semi-supervised Task-Oriented Dialog Systems [24.667353107453824]
Variational Latent-State GPT model (VLS-GPT) is the first to combine the strengths of the two approaches.
We develop the strategy of sampling-then-forward-computation, which successfully overcomes the memory explosion issue of using GPT in variational learning.
VLS-GPT is shown to significantly outperform both supervised-only and semi-supervised baselines.
arXiv Detail & Related papers (2021-09-09T14:42:29Z)
- Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition [64.06167416127386]
We propose Multi-Agent Dialog Policy Learning, which regards both the system and the user as the dialog agents.
Two agents interact with each other and are jointly learned simultaneously.
Results show that our method can successfully build a system policy and a user policy simultaneously.
arXiv Detail & Related papers (2020-04-08T04:51:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.