Semantic-Enhanced Explainable Finetuning for Open-Domain Dialogues
- URL: http://arxiv.org/abs/2106.03065v1
- Date: Sun, 6 Jun 2021 09:03:41 GMT
- Title: Semantic-Enhanced Explainable Finetuning for Open-Domain Dialogues
- Authors: Chen Henry Wu, Yinhe Zheng, Yida Wang, Zhenyu Yang, Minlie Huang
- Abstract summary: We propose to combine pretrained language models with the modular dialogue paradigm for open-domain dialogue modeling.
Our method, semantic-enhanced finetuning, instantiates conversation understanding, planning, and response generation as a language model finetuning task.
- Score: 33.50099424582726
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose to combine pretrained language models with the
modular dialogue paradigm for open-domain dialogue modeling. Our method,
semantic-enhanced finetuning, instantiates conversation understanding,
planning, and response generation as a language model finetuning task. At
inference, we disentangle semantic and token variations by specifying sampling
methods and constraints for each module separately. For training and
evaluation, we present X-Weibo, a Chinese multi-turn open-domain dialogue
dataset with automatic annotation for emotions, dialogue acts (DAs), and topical words.
Experiments show that semantic-enhanced finetuning outperforms strong baselines
on non-semantic and semantic metrics, improves the human-evaluated relevance,
coherence, and informativeness, and exhibits considerable controllability over
semantic variables.
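The abstract only sketches the method at a high level. Below is a minimal, hypothetical Python illustration of the general idea: serializing the understanding, planning, and response-generation modules into a single finetuning sequence, and assigning each module its own decoding settings at inference. The special tokens, field names, and sampling values are assumptions for illustration, not the paper's actual format.

```python
# Hypothetical serialization of the modular pipeline into one LM training
# sequence. Special tokens and field layout are illustrative assumptions.

def serialize_example(context, understanding, plan, response):
    """Concatenate all modules into a single target sequence for finetuning."""
    return (
        f"[CONTEXT] {context} "
        f"[UNDERSTAND] emotion={understanding['emotion']} da={understanding['da']} "
        f"[PLAN] topic_words={' '.join(plan['topic_words'])} "
        f"[RESPONSE] {response}"
    )

# At inference, each module segment can be decoded with its own sampling
# method and constraints (e.g., near-greedy decoding for discrete semantic
# variables, nucleus sampling for the free-form response). Values are illustrative.
DECODING_CONFIG = {
    "[UNDERSTAND]": {"do_sample": False},
    "[PLAN]":       {"do_sample": True, "top_p": 0.5, "max_new_tokens": 16},
    "[RESPONSE]":   {"do_sample": True, "top_p": 0.9, "temperature": 0.8},
}

if __name__ == "__main__":
    example = serialize_example(
        context="A: 今天好累啊 B: 怎么了？",
        understanding={"emotion": "sadness", "da": "inform"},
        plan={"topic_words": ["加班", "休息"]},
        response="加班太辛苦了，周末好好休息一下吧。",
    )
    print(example)
```

Under this kind of serialization, disentangling semantic and token variation amounts to fixing or resampling the [UNDERSTAND] and [PLAN] segments independently of the [RESPONSE] segment.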
Related papers
- Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue [71.15186328127409]
The Paralinguistics-enhanced Generative Pretrained Transformer (ParalinGPT) model takes the conversational context of text, speech embeddings, and paralinguistic attributes as input prompts within a serialized multitasking framework.
We utilize the Switchboard-1 corpus, including its sentiment labels as the paralinguistic attribute, as our spoken dialogue dataset.
arXiv Detail & Related papers (2023-12-23T18:14:56Z)
- 'What are you referring to?' Evaluating the Ability of Multi-Modal Dialogue Models to Process Clarificational Exchanges [65.03196674816772]
Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee.
Addressees usually detect such ambiguities immediately and work with the speaker to repair them using meta-communicative Clarification Exchanges (CEs): a Clarification Request (CR) and a response.
Here, we argue that the ability to generate and respond to CRs imposes specific constraints on the architecture and objective functions of multi-modal, visually grounded dialogue models.
arXiv Detail & Related papers (2023-07-28T13:44:33Z)
- Evaluating Open-Domain Dialogues in Latent Space with Next Sentence Prediction and Mutual Information [18.859159491548006]
We propose a novel learning-based automatic evaluation metric (CMN) for open-domain dialogues.
We employ Conditional Variational Autoencoders (CVAEs) with a Next Sentence Prediction (NSP) objective and use Mutual Information (MI) to model the semantic similarity of text in the latent space.
Experimental results on two open-domain dialogue datasets demonstrate the superiority of our method compared with a wide range of baselines.
arXiv Detail & Related papers (2023-05-26T14:21:54Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Controllable Mixed-Initiative Dialogue Generation through Prompting [50.03458333265885]
Mixed-initiative dialogue tasks involve repeated exchanges of information and conversational control.
Agents gain control by generating responses that follow particular dialogue intents or strategies, prescribed by a policy planner.
The standard approach has been to fine-tune pre-trained language models to perform generation conditioned on these intents.
We instead prompt large language models as a drop-in replacement for fine-tuning on conditional generation.
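As a rough sketch of this prompting-as-a-drop-in idea, the snippet below builds a prompt that conditions generation on a dialogue intent prescribed by a policy planner. The template wording and the intent label are assumptions for illustration, not the paper's actual prompts.

```python
# Illustrative only: condition generation on a planner-prescribed dialogue
# intent via prompting, instead of finetuning on (intent, response) pairs.
PROMPT_TEMPLATE = (
    "You are the system in a mixed-initiative dialogue.\n"
    "Conversation so far:\n{history}\n"
    "Next dialogue intent (from the policy planner): {intent}\n"
    "Write the next system response that realizes this intent:"
)

def build_prompt(history: list[str], intent: str) -> str:
    return PROMPT_TEMPLATE.format(history="\n".join(history), intent=intent)

prompt = build_prompt(
    ["User: I've been feeling overwhelmed at work lately."],
    intent="ask an open-ended question to explore the problem",
)
# The resulting prompt can be sent to any instruction-following LLM.
print(prompt)
```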
arXiv Detail & Related papers (2023-05-06T23:11:25Z)
- Towards Generalized Models for Task-oriented Dialogue Modeling on Spoken Conversations [22.894541507068933]
This paper presents our approach to building generalized models for the Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations Challenge of DSTC-10.
We employ extensive data augmentation strategies on written data, including artificial error injection and round-trip text-speech transformation.
Our approach ranks third on the objective evaluation and second on the final official human evaluation.
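As a rough illustration of the artificial error injection mentioned above, the following sketch perturbs written dialogue text so it looks more like a noisy ASR transcript. The specific noise operations and rates are assumptions, not the authors' recipe.

```python
# Toy "artificial error injection": make written text resemble ASR output.
import random

FILLERS = ["uh", "um", "you know"]

def inject_asr_like_noise(text: str, p_drop=0.05, p_filler=0.05, seed=0) -> str:
    rng = random.Random(seed)
    noisy = []
    for token in text.lower().split():      # ASR output is typically lowercase
        if rng.random() < p_drop:            # randomly drop a word
            continue
        if rng.random() < p_filler:          # randomly insert a filler word
            noisy.append(rng.choice(FILLERS))
        noisy.append(token.strip(",.?!"))    # strip punctuation, as ASR omits it
    return " ".join(noisy)

print(inject_asr_like_noise("Could you book a table for two at an Italian restaurant?"))
```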
arXiv Detail & Related papers (2022-03-08T12:26:57Z)
- Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction [17.000283696243564]
We investigate an interactive semantic parsing framework that explains the predicted logical form step by step in natural language.
We focus on question answering over knowledge bases (KBQA) as an instantiation of our framework.
Our experiments show that the interactive framework with human feedback has the potential to greatly improve overall parse accuracy.
arXiv Detail & Related papers (2021-10-15T20:11:22Z)
- Improving Multi-Party Dialogue Discourse Parsing via Domain Integration [25.805553277418813]
Multi-party conversations are implicitly organized by semantic-level correlations across the interactive turns.
Dialogue discourse analysis can be applied to predict the dependency structure and relations between the elementary discourse units.
Existing corpora with dialogue discourse annotation are collected from specific domains with limited sample sizes.
arXiv Detail & Related papers (2021-10-09T09:36:22Z)
- I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling [104.09033240889106]
We introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues.
We then compare a structured utterance-based approach that uses pre-trained Transformer models for contradiction detection with the typical unstructured approach.
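A minimal sketch of the structured, utterance-based idea is shown below: pair the new utterance with each of the speaker's earlier utterances, score each pair with an NLI-style classifier, and aggregate. The scoring function is left as a placeholder, since the exact model is not specified here.

```python
# Sketch of a structured, utterance-based contradiction check.
# `nli_contradiction_prob` is a stand-in for any NLI-style classifier
# (e.g., a pretrained Transformer finetuned for contradiction detection);
# it is not a real library API.
from typing import Callable, List

def is_self_contradictory(
    previous_utterances: List[str],
    new_utterance: str,
    nli_contradiction_prob: Callable[[str, str], float],
    threshold: float = 0.5,
) -> bool:
    # Score the new utterance against every earlier utterance and take the max.
    scores = [nli_contradiction_prob(prev, new_utterance) for prev in previous_utterances]
    return bool(scores) and max(scores) > threshold

# Toy usage with a dummy scorer:
dummy = lambda premise, hypothesis: 0.9 if "fish" in premise and "hate fish" in hypothesis else 0.1
print(is_self_contradictory(["I like fish, especially dolphins."], "I hate fish.", dummy))
```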
arXiv Detail & Related papers (2020-12-24T18:47:49Z)
- An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation [23.343006562849126]
We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation.
The training paradigm of pre-training and fine-tuning is employed.
Experiments are conducted on the typical single-turn and multi-turn dialogue corpora such as Weibo, Douban, Reddit, DailyDialog, and Persona-Chat.
arXiv Detail & Related papers (2020-03-09T15:20:21Z)
- Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation [59.174903564894954]
In this work, we extend generative data augmentation to the task of dialog state tracking for goal-oriented dialogs.
We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented dialogs.
Experiments on various dialog datasets show that our model improves the downstream dialog trackers' robustness via generative data augmentation.
arXiv Detail & Related papers (2020-01-23T15:34:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.