Related papers: CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues

URL: http://arxiv.org/abs/2404.03820v2
Date: Fri, 21 Jun 2024 13:57:11 GMT
Title: CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
Authors: Makesh Narsimhan Sreedhar, Traian Rebedea, Shaona Ghosh, Jiaqi Zeng, Christopher Parisien,
Abstract summary: The CantTalkAboutThis dataset consists of synthetic dialogues on a wide range of conversation topics from different domains. Fine-tuning language models on this dataset helps make them resilient to deviating from the role assigned. Preliminary observations suggest that training models on this dataset also enhances their performance on fine-grained instruction-following tasks, including safety alignment.
Score: 4.427811636536821
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advancements in instruction-tuning datasets have predominantly focused on specific tasks like mathematical or logical reasoning. There has been a notable gap in data designed for aligning language models to maintain topic relevance in conversations - a critical aspect for deploying chatbots to production. We introduce the CantTalkAboutThis dataset to help language models remain focused on the subject at hand during task-oriented interactions. It consists of synthetic dialogues on a wide range of conversation topics from different domains. These dialogues are interspersed with distractor turns that intentionally divert the chatbot from the predefined topic. Fine-tuning language models on this dataset helps make them resilient to deviating from the role assigned and improves their ability to maintain topical coherence compared to general-purpose instruction-tuned LLMs like GPT-4-turbo and Mixtral-Instruct. Additionally, preliminary observations suggest that training models on this dataset also enhances their performance on fine-grained instruction-following tasks, including safety alignment.

Related papers

Aligning Spoken Dialogue Models from User Interactions [55.192134724622235]
We propose a novel preference alignment framework to improve spoken dialogue models on real-time conversations from user interactions. We create a dataset of more than 150,000 preference pairs from raw multi-turn speech conversations annotated with AI feedback. Our findings shed light on the importance of a well-calibrated balance among various dynamics, crucial for natural real-time speech dialogue systems.
arXiv Detail & Related papers (2025-06-26T16:45:20Z)
Enhancing Talk Moves Analysis in Mathematics Tutoring through Classroom Teaching Discourse [6.1701318546149]
This paper focuses on analyzing mathematics tutoring discourse using talk moves. Scaling the collection, annotation, and analysis of extensive tutoring dialogues to develop machine learning models is a challenging and resource-intensive task.
arXiv Detail & Related papers (2024-12-18T00:13:04Z)
MP2D: An Automated Topic Shift Dialogue Generation Framework Leveraging Knowledge Graphs [15.876075659237722]
Multi-Passage to Dialogue (MP2D) generates question-answering datasets with natural topic transitions. MP2D maps the flow of topics within a dialogue, effectively mirroring the dynamics of human conversation. This study introduces a novel benchmark for topic shift dialogues, TS-WikiDialog.
arXiv Detail & Related papers (2024-03-09T06:28:48Z)
SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization [48.284512017469524]
Multi-turn dialogues are characterized by their extended length and the presence of turn-taking conversations. Traditional language models often overlook the distinct features of these dialogues by treating them as regular text. We propose a speaker-enhanced pre-training method for long dialogue summarization.
arXiv Detail & Related papers (2024-01-31T04:50:00Z)
Multi-Granularity Prompts for Topic Shift Detection in Dialogue [13.739991183173494]
The goal of dialogue topic shift detection is to identify whether the current topic in a conversation has changed or needs to change. Previous work focused on detecting topic shifts using pre-trained models to encode the utterance. We take a prompt-based approach to fully extract topic information from dialogues at multiple granularities, i.e., label, turn, and topic.
arXiv Detail & Related papers (2023-05-23T12:35:49Z)
Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs. We employ domain-adaptive training strategies to help the model adapt to the dialogue domains. Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning [27.92734269206744]
InstructDial is an instruction tuning framework for dialogue. It consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets. Our analysis reveals that InstructDial enables good zero-shot performance on unseen datasets and tasks such as dialogue evaluation and intent detection, and even better performance in a few-shot setting.
arXiv Detail & Related papers (2022-05-25T11:37:06Z)
TOD-DA: Towards Boosting the Robustness of Task-oriented Dialogue Modeling on Spoken Conversations [24.245354500835465]
We propose a novel model-agnostic data augmentation paradigm to boost the robustness of task-oriented dialogue modeling on spoken conversations. Our approach ranked first in both tasks of DSTC10 Track2, a benchmark for task-oriented dialogue modeling on spoken conversations.
arXiv Detail & Related papers (2021-12-23T10:04:25Z)
"How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations. We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling. Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z)
Topic-Aware Multi-turn Dialogue Modeling [91.52820664879432]
This paper presents a novel solution for multi-turn dialogue modeling, which segments and extracts topic-aware utterances in an unsupervised way. Our topic-aware modeling is implemented by a newly proposed unsupervised topic-aware segmentation algorithm and Topic-Aware Dual-attention Matching (TADAM) Network.
arXiv Detail & Related papers (2020-09-26T08:43:06Z)
Structured Attention for Unsupervised Dialogue Structure Induction [110.12561786644122]
We propose to incorporate structured attention layers into a Variational Recurrent Neural Network (VRNN) model with discrete latent states to learn dialogue structure in an unsupervised fashion. Compared to a vanilla VRNN, structured attention enables a model to focus on different parts of the source sentence embeddings while enforcing a structural inductive bias.
arXiv Detail & Related papers (2020-09-17T23:07:03Z)
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling. To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.