Related papers: AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies

AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies

URL: http://arxiv.org/abs/2211.12615v1
Date: Tue, 22 Nov 2022 22:31:34 GMT
Title: AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies
Authors: Weiyan Shi, Emily Dinan, Adi Renduchintala, Daniel Fried, Athul Paul Jacob, Zhou Yu, Mike Lewis
Abstract summary: We show that dialogue models can detect errors in their own messages introspectively, by calculating the likelihood of replies that are indicative of poor messages. We first show that hand-crafted replies can be effective for the task of detecting nonsense in applications as complex as Diplomacy. We find that AutoReply-generated replies outperform handcrafted replies and perform on par with carefully fine-tuned large supervised models.
Score: 71.62832112141913
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing approaches built separate classifiers to detect nonsense in dialogues. In this paper, we show that without external classifiers, dialogue models can detect errors in their own messages introspectively, by calculating the likelihood of replies that are indicative of poor messages. For example, if an agent believes its partner is likely to respond "I don't understand" to a candidate message, that message may not make sense, so an alternative message should be chosen. We evaluate our approach on a dataset from the game Diplomacy, which contains long dialogues richly grounded in the game state, on which existing models make many errors. We first show that hand-crafted replies can be effective for the task of detecting nonsense in applications as complex as Diplomacy. We then design AutoReply, an algorithm to search for such discriminative replies automatically, given a small number of annotated dialogue examples. We find that AutoReply-generated replies outperform handcrafted replies and perform on par with carefully fine-tuned large supervised models. Results also show that one single reply without much computation overheads can also detect dialogue nonsense reasonably well.

Related papers

Prompting for a conversation: How to control a dialog model? [9.268682116424518]
Dialog models are trained on a large amount of text, yet their responses need to be limited to a desired scope and style of a dialog agent. Because the datasets used to achieve the former contain language that is not compatible with the latter, pre-trained dialog models are fine-tuned on smaller curated datasets. In this paper we investigate if prompting can mitigate the above trade-off.
arXiv Detail & Related papers (2022-09-22T14:59:55Z)
What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation [73.03318027164605]
We propose to use information that can be automatically extracted from the next user utterance as a proxy to measure the quality of the previous system response. Our model generalizes across both spoken and written open-domain dialog corpora collected from real and paid users.
arXiv Detail & Related papers (2022-03-25T22:09:52Z)
WeaSuL: Weakly Supervised Dialogue Policy Learning: Reward Estimation for Multi-turn Dialogue [17.663449579168297]
We simulate a dialogue between an agent and a user (modelled similar to an agent with supervised learning objective) to interact with each other. The agent uses dynamic blocking to generate ranked diverse responses and exploration-exploitation to select among the Top-K responses. Empirical studies with two benchmarks indicate that our model can significantly out-perform the response quality and lead to a successful conversation.
arXiv Detail & Related papers (2021-08-01T08:00:45Z)
Generating Dialogue Responses from a Semantic Latent Space [75.18449428414736]
We propose an alternative to the end-to-end classification on vocabulary. We learn the pair relationship between the prompts and responses as a regression task on a latent space. Human evaluation showed that learning the task on a continuous space can generate responses that are both relevant and informative.
arXiv Detail & Related papers (2020-10-04T19:06:16Z)
Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination. We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner. Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
Controlling Dialogue Generation with Semantic Exemplars [55.460082747572734]
We present an Exemplar-based Dialogue Generation model, EDGE, that uses the semantic frames present in exemplar responses to guide generation. We show that controlling dialogue generation based on the semantic frames of exemplars, rather than words in the exemplar itself, improves the coherence of generated responses.
arXiv Detail & Related papers (2020-08-20T17:02:37Z)
Speaker Sensitive Response Evaluation Model [17.381658875470638]
We propose an automatic evaluation model based on the similarity of the generated response with the conversational context. We learn the model parameters from an unlabeled conversation corpus. We show that our model can be applied to movie dialogues without any additional training.
arXiv Detail & Related papers (2020-06-12T08:59:10Z)
Learning an Unreferenced Metric for Online Dialogue Evaluation [53.38078951628143]
We propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances. We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference.
arXiv Detail & Related papers (2020-05-01T20:01:39Z)
"Wait, I'm Still Talking!" Predicting the Dialogue Interaction Behavior Using Imagine-Then-Arbitrate Model [24.560203199376478]
In real human-human conversations, human often sequentially sends several short messages for readability instead of a long message in one turn. We propose a novel Imagine-then-Arbitrate (ITA) neural dialogue model to help the agent decide whether to wait or to make a response directly.
arXiv Detail & Related papers (2020-02-22T04:05:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.