HERALD: An Annotation Efficient Method to Detect User Disengagement in
Social Conversations
- URL: http://arxiv.org/abs/2106.00162v2
- Date: Wed, 2 Jun 2021 06:15:17 GMT
- Title: HERALD: An Annotation Efficient Method to Detect User Disengagement in
Social Conversations
- Authors: Weixin Liang, Kai-Hui Liang, Zhou Yu
- Abstract summary: Existing work on detecting user disengagement typically requires hand-labeling many dialog samples.
We propose HERALD, an efficient annotation framework that reframes the training data annotation process as a denoising problem.
Our experiments show that HERALD improves annotation efficiency significantly and achieves 86% user disengagement detection accuracy in two dialog corpora.
- Score: 38.95985439093335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-domain dialog systems have a user-centric goal: to provide humans with
an engaging conversation experience. User engagement is one of the most
important metrics for evaluating open-domain dialog systems, and could also be
used as real-time feedback to benefit dialog policy learning. Existing work on
detecting user disengagement typically requires hand-labeling many dialog
samples. We propose HERALD, an efficient annotation framework that reframes the
training data annotation process as a denoising problem. Specifically, instead
of manually labeling training samples, we first use a set of labeling
heuristics to label training samples automatically. We then denoise the weakly
labeled data using the Shapley algorithm. Finally, we use the denoised data to
train a user engagement detector. Our experiments show that HERALD improves
annotation efficiency significantly and achieves 86% user disengagement
detection accuracy in two dialog corpora.
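The abstract outlines a three-step pipeline: auto-label training turns with heuristics, denoise the weak labels with the Shapley algorithm, and train a detector on the denoised data. The Python sketch below illustrates that shape with a toy keyword heuristic, a naive Monte Carlo approximation of data Shapley values over a k-NN classifier, and a logistic-regression detector; the cue list, model choices, and thresholds are illustrative assumptions, not the paper's actual implementation.
```python
# A minimal sketch of a HERALD-style pipeline (heuristic weak labeling,
# Shapley-value denoising, then training a detector). The keyword heuristic,
# the Monte Carlo Shapley approximation, and all hyperparameters are
# illustrative assumptions, not the authors' implementation.
from typing import List, Tuple

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

DISENGAGEMENT_CUES = ("stop", "bye", "boring", "i don't care", "whatever")


def heuristic_label(user_turn: str) -> int:
    """Weakly label a user turn: 1 = disengaged, 0 = engaged (toy heuristic)."""
    turn = user_turn.lower()
    return int(any(cue in turn for cue in DISENGAGEMENT_CUES) or len(turn.split()) <= 1)


def monte_carlo_shapley(X, y, X_val, y_val, n_perm: int = 50) -> np.ndarray:
    """Approximate each training point's data-Shapley value as its average
    marginal contribution to validation accuracy over random permutations."""
    n = len(y)
    values = np.zeros(n)
    for _ in range(n_perm):
        perm = np.random.permutation(n)
        prev_acc = 0.0
        for i in range(1, n + 1):
            idx = perm[:i]
            if len(set(y[idx])) < 2:  # a k-NN classifier needs both classes to fit
                acc = prev_acc
            else:
                knn = KNeighborsClassifier(n_neighbors=min(5, i)).fit(X[idx], y[idx])
                acc = knn.score(X_val, y_val)
            values[perm[i - 1]] += acc - prev_acc
            prev_acc = acc
    return values / n_perm


def herald_pipeline(train_turns: List[str], val: List[Tuple[str, int]]):
    # 1) Auto-label the unlabeled training turns with heuristics.
    weak_labels = np.array([heuristic_label(t) for t in train_turns])
    vec = TfidfVectorizer().fit(train_turns + [t for t, _ in val])
    X_train = vec.transform(train_turns).toarray()
    X_val = vec.transform([t for t, _ in val]).toarray()
    y_val = np.array([label for _, label in val])
    # 2) Denoise: drop weakly labeled points with non-positive Shapley value.
    shapley = monte_carlo_shapley(X_train, weak_labels, X_val, y_val)
    keep = shapley > 0
    # 3) Train the final disengagement detector on the denoised data.
    detector = LogisticRegression(max_iter=1000).fit(X_train[keep], weak_labels[keep])
    return detector, vec
```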
Related papers
- Multi-Action Dialog Policy Learning from Logged User Feedback [28.4271696269512]
Multi-action dialog policy generates multiple atomic dialog actions per turn.
Due to data limitations, existing policy models generalize poorly to unseen dialog flows.
We propose BanditMatch to improve multi-action dialog policy learning with explicit and implicit turn-level user feedback.
arXiv Detail & Related papers (2023-02-27T04:01:28Z)
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
- What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation [73.03318027164605]
We propose to use information that can be automatically extracted from the next user utterance as a proxy to measure the quality of the previous system response (a minimal sketch of this proxy appears after this list).
Our model generalizes across both spoken and written open-domain dialog corpora collected from real and paid users.
arXiv Detail & Related papers (2022-03-25T22:09:52Z)
- A new data augmentation method for intent classification enhancement and its application on spoken conversation datasets [23.495743195811375]
We present the Nearest Neighbors Scores Improvement (NNSI) algorithm for automatic data selection and labeling.
NNSI reduces the need for manual labeling by automatically selecting highly ambiguous samples and labeling them with high accuracy.
We demonstrate the use of NNSI on two large-scale, real-life voice conversation systems.
arXiv Detail & Related papers (2022-02-21T11:36:19Z)
- User Response and Sentiment Prediction for Automatic Dialogue Evaluation [69.11124655437902]
We propose to use the sentiment of the next user utterance for turn or dialog level evaluation.
Experiments show our model outperforming existing automatic evaluation metrics on both written and spoken open-domain dialogue datasets.
arXiv Detail & Related papers (2021-11-16T22:19:17Z)
- Data-Efficient Methods for Dialogue Systems [4.061135251278187]
Conversational User Interfaces (CUIs) have become ubiquitous in everyday life through consumer-focused products such as Siri and Alexa.
Deep learning underlies many recent breakthroughs in dialogue systems but requires very large amounts of training data, often annotated by experts.
In this thesis, we introduce a series of methods for training robust dialogue systems from minimal data.
arXiv Detail & Related papers (2020-12-05T02:51:09Z)
- Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation [69.03658685761538]
Open-domain dialog system evaluation is one of the most important challenges in dialog research.
We propose an automatic evaluation model CMADE that automatically cleans self-reported user ratings as it trains on them.
Our experiments show that CMADE achieves 89.2% accuracy in the dialog comparison task.
arXiv Detail & Related papers (2020-05-21T15:14:49Z)
- Learning an Unreferenced Metric for Online Dialogue Evaluation [53.38078951628143]
We propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances.
We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference.
arXiv Detail & Related papers (2020-05-01T20:01:39Z)
- Learning Dialog Policies from Weak Demonstrations [32.149932955715705]
Building upon Deep Q-learning from Demonstrations (DQfD), we leverage dialog data to guide the agent to successfully respond to a user's requests.
We make progressively fewer assumptions about the data needed, using labeled, reduced-labeled, and even unlabeled data.
Experiments in a challenging multi-domain dialog system framework validate our approaches, which achieve high success rates even when trained on out-of-domain data.
arXiv Detail & Related papers (2020-04-23T10:22:16Z)
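Two of the related papers above score a system response by what the user says next ("What is wrong with you?..." and "User Response and Sentiment Prediction..."). The sketch below illustrates that next-utterance-sentiment proxy with an off-the-shelf classifier; the model choice and the mapping to a quality score are assumptions made for illustration, not those papers' methods.
```python
# A hedged sketch of the "next user utterance as a quality proxy" idea from the
# related papers above: the sentiment of the user's follow-up is mapped to a
# quality score for the preceding system response. The default sentiment model
# and the scoring rule are illustrative assumptions, not the papers' methods.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # any English sentiment classifier works


def response_quality_proxy(next_user_utterance: str) -> float:
    """Return a [0, 1] proxy score for the previous system response:
    a positive follow-up scores high, a negative one scores low."""
    result = sentiment(next_user_utterance)[0]  # {"label": ..., "score": ...}
    return result["score"] if result["label"] == "POSITIVE" else 1.0 - result["score"]


# A frustrated follow-up suggests the previous system turn was poor.
print(response_quality_proxy("That makes no sense, you are not listening to me."))
print(response_quality_proxy("Oh nice, that is exactly what I wanted to know!"))
```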