Related papers: Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue

Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue

URL: http://arxiv.org/abs/2402.06967v2
Date: Thu, 30 May 2024 04:57:36 GMT
Title: Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue
Authors: Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiao-Yong Wei,
Abstract summary: We propose an efficient Multi-round Interactive Dialogue Tuning (Midi-Tuning) framework. It models the agent and user individually with two adapters built upon large language models. Our framework performs superior to traditional fine-tuning and harbors the tremendous potential for improving dialogue consistency.
Score: 13.774377524019723
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Tuning language models for dialogue generation has been a prevalent paradigm for building capable dialogue agents. Yet, traditional tuning narrowly views dialogue generation as resembling other language generation tasks, ignoring the role disparities between two speakers and the multi-round interactive process that dialogues ought to be. Such a manner often leads to unsatisfactory chat consistency for the built agent. In this work, we emphasize the interactive, communicative nature of dialogue and argue that it is more feasible to model the speaker roles of agent and user separately, enabling the agent to adhere to its role consistently. With this in mind, we propose an efficient Multi-round Interactive Dialogue Tuning (Midi-Tuning) framework. It models the agent and user individually with two adapters built upon large language models. The adapters make use of respective utterances round by round in alternating order and they are tuned via a round-level memory caching mechanism. Extensive experiments demonstrate that, our framework performs superior to traditional fine-tuning and harbors the tremendous potential for improving dialogue consistency.

Related papers

Aligning Spoken Dialogue Models from User Interactions [55.192134724622235]
We propose a novel preference alignment framework to improve spoken dialogue models on realtime conversations from user interactions.<n>We create a dataset of more than 150,000 preference pairs from raw multi-turn speech conversations annotated with AI feedback.<n>Our findings shed light on the importance of a well-calibrated balance among various dynamics, crucial for natural real-time speech dialogue systems.
arXiv Detail & Related papers (2025-06-26T16:45:20Z)
Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities [93.09944267871163]
FullDuplexBench is a benchmark that systematically evaluates key conversational behaviors. We aim to advance spoken dialogue modeling and encourage the development of more interactive and natural dialogue systems.
arXiv Detail & Related papers (2025-03-06T18:59:16Z)
Multi-User MultiWOZ: Task-Oriented Dialogues among Multiple Users [51.34484827552774]
We release the Multi-User MultiWOZ dataset: task-oriented dialogues among two users and one agent. These dialogues reflect interesting dynamics of collaborative decision-making in task-oriented scenarios. We propose a novel task of multi-user contextual query rewriting: to rewrite a task-oriented chat between two users as a concise task-oriented query.
arXiv Detail & Related papers (2023-10-31T14:12:07Z)
Contextual Data Augmentation for Task-Oriented Dialog Systems [8.085645180329417]
We develop a novel dialog augmentation model that generates a user turn, conditioning on full dialog context. With a new prompt design for language model, and output re-ranking, the dialogs generated from our model can be directly used to train downstream dialog systems.
arXiv Detail & Related papers (2023-10-16T13:22:34Z)
Towards human-like spoken dialogue generation between AI agents from written dialogue [8.4989907582951]
This study proposes CHATS - CHatty Agents Text-to-Speech - a discrete token-based system designed to generate spoken dialogues based on written dialogues. Our system can generate speech for both the speaker side and the listener side simultaneously, using only the transcription from the speaker side.
arXiv Detail & Related papers (2023-10-02T11:03:20Z)
Unified Conversational Models with System-Initiated Transitions between Chit-Chat and Task-Oriented Dialogues [4.714297769572548]
We investigate the potential initiative'' that occurs when there is a change between dialogue modes in one dialogue. We contribute two efficient prompt models which can proactively generate a transition sentence to trigger system-initiated transitions.
arXiv Detail & Related papers (2023-07-04T11:53:23Z)
Revisiting Conversation Discourse for Dialogue Disentanglement [88.3386821205896]
We propose enhancing dialogue disentanglement by taking full advantage of the dialogue discourse characteristics. We develop a structure-aware framework to integrate the rich structural features for better modeling the conversational semantic context. Our work has great potential to facilitate broader multi-party multi-thread dialogue applications.
arXiv Detail & Related papers (2023-06-06T19:17:47Z)
Dialog act guided contextual adapter for personalized speech recognition [9.672512327395435]
Personalization in multi-turn dialogs has been a long standing challenge for end-to-end automatic speech recognition (E2E ASR) models. Recent work on contextual adapters has tackled rare word recognition using user catalogs. We propose a dialog act guided contextual adapter network.
arXiv Detail & Related papers (2023-03-31T05:13:44Z)
Manual-Guided Dialogue for Flexible Conversational Agents [84.46598430403886]
How to build and use dialogue data efficiently, and how to deploy models in different domains at scale can be critical issues in building a task-oriented dialogue system. We propose a novel manual-guided dialogue scheme, where the agent learns the tasks from both dialogue and manuals. Our proposed scheme reduces the dependence of dialogue models on fine-grained domain ontology, and makes them more flexible to adapt to various domains.
arXiv Detail & Related papers (2022-08-16T08:21:12Z)
Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles. In the existing retrieval-based multi-turn dialogue modeling, the pre-trained language models (PrLMs) as encoder represent the dialogues coarsely. We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling. To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
Conversation Learner -- A Machine Teaching Tool for Building Dialog Managers for Task-Oriented Dialog Systems [57.082447660944965]
Conversation Learner is a machine teaching tool for building dialog managers. It enables dialog authors to create a dialog flow using familiar tools, converting the dialog flow into a parametric model. It allows dialog authors to improve the dialog manager over time by leveraging user-system dialog logs as training data.
arXiv Detail & Related papers (2020-04-09T00:10:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.