A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss
- URL: http://arxiv.org/abs/2106.10619v1
- Date: Sun, 20 Jun 2021 04:39:29 GMT
- Title: A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss
- Authors: Prasanna Parthasarathi, Mohamed Abdelsalam, Joelle Pineau, Sarath Chandar
- Abstract summary: We study the effects of minimizing an alternate training objective that fosters a model to generate an alternate response and score it on semantic similarity.
We explore this idea on two differently sized data sets on the task of next utterance generation in goal-oriented dialogues.
- Score: 37.8626106992769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural models trained for next utterance generation in dialogue tasks learn
to mimic the n-gram sequences in the training set with training objectives like
negative log-likelihood (NLL) or cross-entropy. Such commonly used training
objectives do not foster generating alternate responses to a context. However, the
effects of minimizing an alternate training objective that fosters a model to
generate an alternate response and score it on semantic similarity have not been
well studied. We hypothesize that a language generation model can improve its
diversity by learning to generate alternate text during training and minimizing a
semantic loss as an auxiliary objective. We explore this idea on two differently
sized data sets on the task of next utterance generation in goal-oriented dialogues.
We make two observations: (1) minimizing a semantic objective improved diversity of
responses on the smaller data set (Frames), but was only as good as minimizing the
NLL on the larger data set (MultiWoZ); (2) large language model embeddings can be
more useful as a semantic loss objective than as initialization for token embeddings.
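As an illustration of the auxiliary objective described above, the sketch below shows one way a semantic similarity term could be combined with the usual NLL loss: an alternate response sampled from the model is embedded, compared against the embedding of the reference response with cosine similarity, and the dissimilarity is added to the NLL with a weighting factor. This is a minimal sketch under assumed names; the function names, tensors, and weight `lambda_sem` are illustrative and not taken from the paper, which should be consulted for the exact formulation.

```python
# Hypothetical sketch, not the authors' code: NLL on the reference tokens plus an
# auxiliary semantic loss on sentence embeddings of a sampled alternate response.
import torch
import torch.nn.functional as F


def semantic_loss(generated_emb: torch.Tensor, reference_emb: torch.Tensor) -> torch.Tensor:
    """Penalize low cosine similarity between the embedding of the sampled
    alternate response and the embedding of the reference response."""
    cos = F.cosine_similarity(generated_emb, reference_emb, dim=-1)
    return (1.0 - cos).mean()


def combined_loss(logits: torch.Tensor, target_ids: torch.Tensor,
                  generated_emb: torch.Tensor, reference_emb: torch.Tensor,
                  lambda_sem: float = 0.5) -> torch.Tensor:
    """Cross-entropy (NLL) on the reference tokens plus a weighted semantic term."""
    nll = F.cross_entropy(logits.view(-1, logits.size(-1)), target_ids.view(-1))
    return nll + lambda_sem * semantic_loss(generated_emb, reference_emb)


if __name__ == "__main__":
    batch, seq_len, vocab, dim = 2, 12, 100, 64
    logits = torch.randn(batch, seq_len, vocab, requires_grad=True)  # decoder outputs
    targets = torch.randint(0, vocab, (batch, seq_len))              # reference tokens
    gen_emb = torch.randn(batch, dim, requires_grad=True)  # embedding of sampled alternate response
    ref_emb = torch.randn(batch, dim)                      # embedding of reference response
    loss = combined_loss(logits, targets, gen_emb, ref_emb)
    loss.backward()
    print(float(loss))
```

In practice the sentence embeddings would come from a pretrained language model (for example, mean-pooled hidden states), in line with the abstract's observation that such embeddings are more useful as a semantic loss target than as token-embedding initialization.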
Related papers
- Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z)
- Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model [57.78191634042409]
We propose Pseudo-Word HuBERT (PW-HuBERT), a framework that integrates pseudo word-level targets into the training process.
Our experimental results on four spoken language understanding (SLU) benchmarks suggest the superiority of our model in capturing semantic information.
arXiv Detail & Related papers (2024-02-08T16:55:21Z)
- Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning [97.28779163988833]
Multiple pre-training objectives compensate for the limited understanding capability of single-objective language modeling.
We propose MOMETAS, a novel adaptive sampler based on meta-learning, which learns the latent sampling pattern on arbitrary pre-training objectives.
arXiv Detail & Related papers (2022-10-19T04:38:26Z)
- A Simple Contrastive Learning Objective for Alleviating Neural Text Degeneration [56.64703901898937]
We propose a new contrastive token learning objective that inherits the advantages of cross-entropy and unlikelihood training.
Comprehensive experiments on language modeling and open-domain dialogue generation tasks show that the proposed contrastive token objective yields less repetitive texts.
arXiv Detail & Related papers (2022-05-05T08:50:50Z)
- SDCUP: Schema Dependency-Enhanced Curriculum Pre-Training for Table Semantic Parsing [19.779493883522072]
This paper designs two novel pre-training objectives to impose the desired inductive bias into the learned representations for table pre-training.
We propose a schema-aware curriculum learning approach to mitigate the impact of noise and learn effectively from the pre-training data in an easy-to-hard manner.
arXiv Detail & Related papers (2021-11-18T02:51:04Z)
- Enhancing Dialogue Generation via Multi-Level Contrastive Learning [57.005432249952406]
We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query.
A Rank-aware Calibration (RC) network is designed to construct the multi-level contrastive optimization objectives.
We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
arXiv Detail & Related papers (2020-09-19T02:41:04Z)
- Group-wise Contrastive Learning for Neural Dialogue Generation [29.749195182401344]
We introduce contrastive learning into dialogue generation, where the model explicitly perceives the difference between the well-chosen positive and negative utterances.
To manage the multi-mapping relations prevalent in human conversation, we augment contrastive dialogue learning with group-wise dual sampling.
arXiv Detail & Related papers (2020-09-16T08:28:30Z)