Diversifying Neural Dialogue Generation via Negative Distillation
- URL: http://arxiv.org/abs/2205.02795v1
- Date: Thu, 5 May 2022 17:14:56 GMT
- Title: Diversifying Neural Dialogue Generation via Negative Distillation
- Authors: Yiwei Li, Shaoxiong Feng, Bin Sun, Kan Li
- Abstract summary: We propose a novel negative training paradigm, called negative distillation, to keep the model away from the undesirable generic responses.
Empirical results show that our method outperforms previous negative training methods significantly.
- Score: 11.124375734351826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative dialogue models suffer badly from the generic response problem,
limiting their applications to a few toy scenarios. Recently, an interesting
approach, namely negative training, has been proposed to alleviate this problem
by reminding the model not to generate high-frequency responses during
training. However, its performance is hindered by two issues: it ignores
low-frequency but generic responses, and it introduces low-frequency but
meaningless responses. In this paper, we propose a novel negative training paradigm, called
negative distillation, to keep the model away from the undesirable generic
responses while avoiding the above problems. First, we introduce a negative
teacher model that can produce query-wise generic responses, and then the
student model is required to maximize its distance from those responses using multi-level negative
knowledge. Empirical results show that our method outperforms previous negative
training methods significantly.
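The "maximize the distance" idea from the abstract can be sketched as a sign-flipped distillation term. The function below is an illustrative toy, not the paper's multi-level objective: a softmax over a small logit vector stands in for full sequence-level training, and the function name is hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def negative_distillation_loss(student_logits, teacher_logits):
    """Toy sign-flipped distillation term (an assumption, not the
    paper's exact objective): standard distillation minimizes
    KL(teacher || student); flipping the sign makes the student move
    AWAY from the negative teacher's generic-response distribution."""
    s = softmax(student_logits)
    t = softmax(teacher_logits)
    kl = sum(ti * math.log(ti / si) for ti, si in zip(t, s) if ti > 0)
    return -kl  # minimizing this maximizes the KL distance
```

Minimizing the returned value pushes the student's distribution away from the teacher's; when the two distributions coincide the loss is zero, its maximum.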
Related papers
- Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models [2.0962367975513496]
Machine unlearning aims to efficiently eliminate the influence of specific training data, known as the forget set, from the model.
Existing unlearning methods rely solely on negative feedback to suppress responses related to the forget set.
We propose a novel approach called Alternate Preference Optimization (AltPO), which combines negative feedback with in-domain positive feedback on the forget set.
arXiv Detail & Related papers (2024-09-20T13:05:07Z)
- Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses [5.936682548344234]
This paper improves the quality of generated responses by learning the implicit pattern information between contexts and responses in the training samples.
We also design a response-aware mechanism for mining the implicit pattern information between contexts and responses so that the generated replies are more diverse and approximate to human replies.
arXiv Detail & Related papers (2023-09-06T08:11:39Z)
- Pneg: Prompt-based Negative Response Generation for Dialogue Response Selection Task [27.513992470527427]
In retrieval-based dialogue systems, a response selection model acts as a ranker to select the most appropriate response among several candidates.
Recent studies have shown that leveraging adversarial responses as negative training samples is useful for improving the discriminating power of the selection model.
This paper proposes a simple but efficient method for generating adversarial negative responses leveraging a large-scale language model.
arXiv Detail & Related papers (2022-10-31T11:49:49Z)
- Generating Negative Samples for Sequential Recommendation [83.60655196391855]
We propose to Generate Negative Samples (items) for Sequential Recommendation (SR)
A negative item is sampled at each time step based on the current SR model's learned user preferences toward items.
Experiments on four public datasets verify the importance of providing high-quality negative samples for SR.
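The model-aware sampling step described above can be sketched as follows. This is an illustration of sampling a negative proportionally to the model's current preference scores, not the paper's exact algorithm; the function name and signature are assumptions.

```python
import random

def sample_negative(scores, interacted, rng):
    """Toy sketch of model-aware negative sampling (illustrative, not
    the paper's procedure): draw a non-interacted item with probability
    proportional to the model's current preference score, so harder
    negatives (items the model ranks highly) are sampled more often."""
    candidates = [(i, s) for i, s in enumerate(scores) if i not in interacted]
    total = sum(s for _, s in candidates)
    r = rng.random() * total
    acc = 0.0
    for i, s in candidates:
        acc += s
        if acc >= r:
            return i
    return candidates[-1][0]  # guard against floating-point drift
```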
arXiv Detail & Related papers (2022-08-07T05:44:13Z)
- Self-Normalized Importance Sampling for Neural Language Modeling [97.96857871187052]
In this work, we propose self-normalized importance sampling. Compared to our previous work, the criteria considered here are self-normalized, so no further correction step is needed.
We show that our proposed self-normalized importance sampling is competitive in both research-oriented and production-oriented automatic speech recognition tasks.
arXiv Detail & Related papers (2021-11-11T16:57:53Z) - Challenging Instances are Worth Learning: Generating Valuable Negative
Samples for Response Selection Training [16.34984384383166]
A response selection module is usually trained on the annotated positive response and sampled negative responses.
We employ pre-trained language models to construct more challenging negative instances to enhance the model robustness.
Our method brings significant and stable improvements on the dialogue response selection capacity.
arXiv Detail & Related papers (2021-09-14T09:16:24Z) - Social NCE: Contrastive Learning of Socially-aware Motion
Representations [87.82126838588279]
Experimental results show that the proposed method dramatically reduces the collision rates of recent trajectory forecasting, behavioral cloning and reinforcement learning algorithms.
Our method makes few assumptions about neural architecture designs, and hence can be used as a generic way to promote the robustness of neural motion models.
arXiv Detail & Related papers (2020-12-21T22:25:06Z) - Positive-Congruent Training: Towards Regression-Free Model Updates [87.25247195148187]
In image classification, sample-wise inconsistencies appear as "negative flips": a new model incorrectly predicts the output for a test sample that was correctly classified by the old (reference) model.
We propose a simple approach for PC training, Focal Distillation, which enforces congruence with the reference model.
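The negative-flip metric described above has a direct formulation. The helper name below is illustrative, not taken from the paper.

```python
def negative_flip_rate(old_preds, new_preds, labels):
    """Fraction of test samples that the old (reference) model
    classified correctly but the new model gets wrong, matching the
    'negative flip' definition above. (Helper name is illustrative.)"""
    flips = sum(1 for o, n, y in zip(old_preds, new_preds, labels)
                if o == y and n != y)
    return flips / len(labels)
```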
arXiv Detail & Related papers (2020-11-18T09:00:44Z) - Group-wise Contrastive Learning for Neural Dialogue Generation [29.749195182401344]
We introduce contrastive learning into dialogue generation, where the model explicitly perceives the difference between the well-chosen positive and negative utterances.
To manage the multi-mapping relations prevalent in human conversation, we augment contrastive dialogue learning with group-wise dual sampling.
arXiv Detail & Related papers (2020-09-16T08:28:30Z) - Counterfactual Off-Policy Training for Neural Response Generation [94.76649147381232]
We propose to explore potential responses by counterfactual reasoning.
Training on the counterfactual responses under the adversarial learning framework helps to explore the high-reward area of the potential response space.
An empirical study on the DailyDialog dataset shows that our approach significantly outperforms the HRED model.
arXiv Detail & Related papers (2020-04-29T22:46:28Z) - Adaptive Offline Quintuplet Loss for Image-Text Matching [102.50814151323965]
Existing image-text matching approaches typically leverage triplet loss with online hard negatives to train the model.
We propose solutions by sampling negatives offline from the whole training set.
We evaluate the proposed training approach on three state-of-the-art image-text models on the MS-COCO and Flickr30K datasets.
arXiv Detail & Related papers (2020-03-07T22:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.