Related papers: SYNDICOM: Improving Conversational Commonsense with Error-Injection and Natural Language Feedback

URL: http://arxiv.org/abs/2309.10015v1
Date: Mon, 18 Sep 2023 15:08:48 GMT
Title: SYNDICOM: Improving Conversational Commonsense with Error-Injection and Natural Language Feedback
Authors: Christopher Richardson, Anirudh Sundar, Larry Heck
Abstract summary: We introduce SYNDICOM - a method for improving commonsense in dialogue response generation. The first component is a dataset composed of commonsense dialogues created from a knowledge graph and synthesized into natural language. The second contribution is a two-step procedure: training a model to predict natural language feedback (NLF) for invalid responses, and then training a response generation model conditioned on the predicted NLF.
Score: 3.642278451851518
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Commonsense reasoning is a critical aspect of human communication. Despite recent advances in conversational AI driven by large language models, commonsense reasoning remains a challenging task. In this work, we introduce SYNDICOM - a method for improving commonsense in dialogue response generation. SYNDICOM consists of two components. The first component is a dataset composed of commonsense dialogues created from a knowledge graph and synthesized into natural language. This dataset includes both valid and invalid responses to dialogue contexts, along with natural language feedback (NLF) for the invalid responses. The second contribution is a two-step procedure: training a model to predict natural language feedback (NLF) for invalid responses, and then training a response generation model conditioned on the predicted NLF, the invalid response, and the dialogue. SYNDICOM is scalable and does not require reinforcement learning. Empirical results on three tasks are evaluated using a broad range of metrics. SYNDICOM achieves a relative improvement of 53% over ChatGPT on ROUGE1, and human evaluators prefer SYNDICOM over ChatGPT 57% of the time. We will publicly release the code and the full dataset.
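The two-step procedure described in the abstract can be sketched as a data-preparation routine. This is a minimal illustration, not the authors' implementation: the `Example` fields, the `[INVALID]`, `[FEEDBACK]`, and `[RESPONSE]` markers, and the `predict_feedback` callable are all hypothetical names chosen for this sketch, and the real prompt format may differ.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    """One dataset item as the abstract describes it (field names are assumed)."""
    dialogue: List[str]       # dialogue context turns
    invalid_response: str     # response that violates commonsense
    feedback: str             # natural language feedback (NLF) on the invalid response
    valid_response: str       # commonsense-consistent response


def feedback_input(dialogue: List[str], invalid_response: str) -> str:
    """Step 1 input: dialogue plus invalid response; the target is the NLF."""
    return "\n".join(dialogue) + f"\n[INVALID] {invalid_response}\n[FEEDBACK]"


def response_input(dialogue: List[str], invalid_response: str, feedback: str) -> str:
    """Step 2 input: dialogue, invalid response, and NLF; the target is a valid response."""
    return (
        "\n".join(dialogue)
        + f"\n[INVALID] {invalid_response}\n[FEEDBACK] {feedback}\n[RESPONSE]"
    )


def build_training_pairs(
    examples: List[Example],
    predict_feedback: Callable[[List[str], str], str],
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Return (input, target) pairs for the feedback model and the response model.

    The response model is conditioned on *predicted* feedback, as in the abstract,
    so `predict_feedback` stands in for the trained step-1 model.
    """
    step1 = [
        (feedback_input(e.dialogue, e.invalid_response), e.feedback)
        for e in examples
    ]
    step2 = [
        (
            response_input(
                e.dialogue,
                e.invalid_response,
                predict_feedback(e.dialogue, e.invalid_response),
            ),
            e.valid_response,
        )
        for e in examples
    ]
    return step1, step2
```

Both steps reduce to ordinary supervised sequence-to-sequence training on these pairs, which is consistent with the abstract's statement that the method does not require reinforcement learning.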

Related papers

REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation [51.97224538045096]
We introduce REALTALK, a 21-day corpus of authentic messaging app dialogues. We compare emotional intelligence (EI) attributes and persona consistency to understand the challenges posed by real-world dialogues. Our findings reveal that models struggle to simulate a user solely from dialogue history, while fine-tuning on specific user chats improves persona emulation.
arXiv Detail & Related papers (2025-02-18T20:29:01Z)
CoPrUS: Consistency Preserving Utterance Synthesis towards more realistic benchmark dialogues [0.27309692684728604]
We investigate the creation of synthetic communication errors in an automatic pipeline. We focus on three types of miscommunications that could happen in real-world dialogues but are underrepresented in the benchmark dataset. Our two-step approach uses a state-of-the-art Large Language Model (LLM) to first create the error and secondly the repairing utterance.
arXiv Detail & Related papers (2024-12-10T13:51:55Z)
Self-Directed Turing Test for Large Language Models [56.64615470513102]
The Turing test examines whether AIs can exhibit human-like behaviour in natural language conversations. Traditional Turing tests adopt a rigid dialogue format where each participant sends only one message each time. This paper proposes the Self-Directed Turing Test, which extends the original test with a burst dialogue format.
arXiv Detail & Related papers (2024-08-19T09:57:28Z)
Learning From Free-Text Human Feedback -- Collect New Datasets Or Extend Existing Ones? [57.16050211534735]
We investigate the types and frequency of free-text human feedback in commonly used dialog datasets. Our findings provide new insights into the composition of the datasets examined, including error types, user response types, and the relations between them.
arXiv Detail & Related papers (2023-10-24T12:01:11Z)
Training Language Models with Language Feedback at Scale [50.70091340506957]
We introduce Imitation learning from Language Feedback (ILF), a new approach that utilizes more informative language feedback. ILF consists of three steps that are applied iteratively; the first conditions the language model on the input, an initial LM output, and feedback to generate refinements. We show theoretically that ILF can be viewed as Bayesian Inference, similar to Reinforcement Learning from human feedback.
arXiv Detail & Related papers (2023-03-28T17:04:15Z)
Evaluating Human-Language Model Interaction [79.33022878034627]
We develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems. We design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation. We find that better non-interactive performance does not always translate to better human-LM interaction.
arXiv Detail & Related papers (2022-12-19T18:59:45Z)
OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue [40.62090743056549]
This paper presents an ontology-aware pretrained language model (OPAL) for end-to-end task-oriented dialogue (TOD). Unlike chit-chat dialogue models, task-oriented dialogue models comprise at least two task-specific modules: a dialogue state tracker (DST) and a response generator (RG).
arXiv Detail & Related papers (2022-09-10T04:38:27Z)
DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation [80.45816053153722]
DialogVED introduces continuous latent variables into the enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses. We conduct experiments on PersonaChat, DailyDialog, and DSTC7-AVSD benchmarks for response generation.
arXiv Detail & Related papers (2022-04-27T16:18:15Z)
Commonsense-Focused Dialogues for Response Generation: An Empirical Study [39.49727190159279]
We present an empirical study of commonsense in dialogue response generation. We first auto-extract commonsensical dialogues from existing dialogue datasets by leveraging ConceptNet. We then collect a new dialogue dataset with 25K dialogues aimed at exhibiting social commonsense in an interactive setting.
arXiv Detail & Related papers (2021-09-14T04:32:09Z)
A Taxonomy of Empathetic Response Intents in Human Social Conversations [1.52292571922932]
Open-domain conversational agents are becoming increasingly popular in the natural language processing community. One of the challenges is enabling them to converse in an empathetic manner. Current neural response generation methods rely solely on end-to-end learning from large scale conversation data to generate dialogues. Recent work has shown the promise of combining dialogue act/intent modelling and neural response generation.
arXiv Detail & Related papers (2020-12-07T21:56:45Z)
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling. To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.