Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models
- URL: http://arxiv.org/abs/2404.14772v1
- Date: Tue, 23 Apr 2024 06:23:34 GMT
- Title: Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models
- Authors: Chris Samarinas, Pracha Promthaw, Atharva Nijasure, Hansi Zeng, Julian Killingback, Hamed Zamani
- Abstract summary: SynTOD is a new synthetic data generation approach for developing end-to-end Task-Oriented Dialogue (TOD) systems.
It generates diverse, structured conversations through random walks and response simulation using large language models.
In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling and response relevance.
- Score: 16.94819621353007
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores SynTOD, a new synthetic data generation approach for developing end-to-end Task-Oriented Dialogue (TOD) Systems capable of handling complex tasks such as intent classification, slot filling, conversational question-answering, and retrieval-augmented response generation, without relying on crowdsourcing or real-world data. SynTOD utilizes a state transition graph to define the desired behavior of a TOD system and generates diverse, structured conversations through random walks and response simulation using large language models (LLMs). In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling and response relevance compared to naive single-prompt simulated conversations. We also investigate the end-to-end TOD effectiveness of different base and instruction-tuned LLMs, with and without the constructed synthetic conversations. Finally, we explore how various LLMs can evaluate responses in a TOD system and how well they are correlated with human judgments. Our findings pave the path towards quick development and evaluation of domain-specific TOD systems. We release our datasets, models, and code for research purposes.
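The pipeline the abstract describes (a state transition graph defining allowed TOD behavior, random walks over that graph producing structured conversation skeletons, and an LLM simulating the turns) can be illustrated with a minimal sketch. This is not the authors' released code: the recipe-domain graph, state names, and prompt wording below are illustrative assumptions, and any text-completion callable can stand in for the LLM.
```python
import random

# Hypothetical state transition graph for a recipe-assistant domain:
# each dialogue state maps to the states reachable from it.
TRANSITION_GRAPH = {
    "start": ["search_recipe"],
    "search_recipe": ["show_results", "clarify_query"],
    "clarify_query": ["search_recipe"],
    "show_results": ["select_recipe", "refine_search"],
    "refine_search": ["show_results"],
    "select_recipe": ["answer_question", "end"],
    "answer_question": ["answer_question", "end"],
    "end": [],
}

def random_walk(graph, start="start", max_steps=10):
    """Sample a path of dialogue states via a uniform random walk."""
    path, state = [start], start
    while graph[state] and len(path) < max_steps:
        state = random.choice(graph[state])
        path.append(state)
    return path

def simulate_dialogue(path, llm):
    """Ask an LLM to write the user/system turns for each sampled state."""
    history = []
    for state in path[1:]:  # skip the synthetic "start" state
        context = "\n".join(history) if history else "(empty)"
        prompt = (
            "You are simulating a task-oriented dialogue about recipes.\n"
            f"Conversation so far:\n{context}\n"
            f"Write the next user and system turns for the state '{state}'."
        )
        history.append(llm(prompt))  # `llm` is any text-completion callable
    return history

if __name__ == "__main__":
    path = random_walk(TRANSITION_GRAPH)
    print("sampled state path:", " -> ".join(path))
    # Plug in a real LLM client here; a stub keeps the sketch runnable.
    stub_llm = lambda prompt: f"[simulated turns for a {len(prompt)}-character prompt]"
    for turn in simulate_dialogue(path, stub_llm):
        print(turn)
```
Because the walk is constrained by the graph, every simulated conversation follows a valid task flow, which is what distinguishes this setup from a naive single-prompt simulation.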
Related papers
- DiaSynth: Synthetic Dialogue Generation Framework for Low Resource Dialogue Applications [18.378069426713]
Existing research is constrained by general or niche datasets that lack sufficient scale for training dialogue systems.
We introduce DiaSynth, a synthetic dialogue generation framework capable of generating high-quality, contextually rich dialogues.
We perform our experiments by generating synthetic data using different LLMs and few-shot examples from DialogSum and SAMSum.
arXiv Detail & Related papers (2024-09-25T07:03:31Z)
- LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments [70.91258869156353]
We introduce LangSuitE, a versatile and simulation-free testbed featuring 6 representative embodied tasks in textual embodied worlds.
Compared with previous LLM-based testbeds, LangSuitE offers adaptability to diverse environments without multiple simulation engines.
We devise a novel chain-of-thought (CoT) schema, EmMem, which summarizes embodied states w.r.t. history information.
arXiv Detail & Related papers (2024-06-24T03:36:29Z)
- Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems [2.788542465279969]
This paper introduces DAUS, a Domain-Aware User Simulator.
We fine-tune DAUS on real examples of task-oriented dialogues.
Results on two relevant benchmarks showcase significant improvements in terms of user goal fulfillment.
arXiv Detail & Related papers (2024-02-20T20:57:47Z)
- TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles [27.05310753976961]
We introduce Task-Oriented Automatic Dialogs (TOAD), a novel and scalable TOD dataset.
The TOAD dataset simulates realistic app context interaction and provides a variety of system response style options.
We benchmark TOAD on two response generation tasks, and the results show that modeling more verbose responses or responses without user expression mirroring is more challenging.
arXiv Detail & Related papers (2024-02-15T17:40:02Z)
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues [77.15457469745364]
We propose a novel approach focusing on inferring the TOD-Flow graph from dialogue data annotated with dialog acts.
The inferred TOD-Flow graph can be easily integrated with any dialogue model to improve its prediction performance, transparency, and controllability.
arXiv Detail & Related papers (2023-12-07T20:06:23Z)
- User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue [10.336443286833145]
We propose a novel user simulator built using recently developed large pretrained language models (LLMs).
Unlike previous work, which sought to maximize goal success rate (GSR) as the primary metric of simulator performance, our goal is a system which achieves a GSR similar to that observed in human interactions with TOD systems.
arXiv Detail & Related papers (2023-09-23T02:04:57Z)
- PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities.
We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework.
We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z)
- Using Textual Interface to Align External Knowledge for End-to-End Task-Oriented Dialogue Systems [53.38517204698343]
We propose a novel paradigm that uses a textual interface to align external knowledge and eliminate redundant processes.
We demonstrate our paradigm in practice through MultiWOZ-Remake, including an interactive textual interface built for the MultiWOZ database.
arXiv Detail & Related papers (2023-05-23T05:48:21Z)
- Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator [37.590563896382456]
We propose an interactive evaluation framework for Task-Oriented Dialogue (TOD) systems.
We first build a goal-oriented user simulator based on pre-trained models and then use the user simulator to interact with the dialogue system to generate dialogues.
Experimental results show that RL-based TOD systems trained by our proposed user simulator can achieve nearly 98% inform and success rates.
arXiv Detail & Related papers (2022-10-26T07:41:32Z)
- Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation [53.87485260058957]
We study video-grounded dialogue generation, where a response is generated based on the dialogue context and the associated video.
The primary challenges of this task include the difficulty of integrating video data into pre-trained language models (PLMs).
We propose a multi-agent reinforcement learning method to collaboratively perform reasoning on different modalities.
arXiv Detail & Related papers (2022-10-22T14:45:29Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.