Enhancing Dialogue Generation via Multi-Level Contrastive Learning
- URL: http://arxiv.org/abs/2009.09147v2
- Date: Tue, 22 Jun 2021 13:22:06 GMT
- Title: Enhancing Dialogue Generation via Multi-Level Contrastive Learning
- Authors: Xin Li, Piji Li, Yan Wang, Xiaojiang Liu and Wai Lam
- Abstract summary: We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query.
A Rank-aware Calibration (RC) network is designed to construct the multi-level contrastive optimization objectives.
We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
- Score: 57.005432249952406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most of the existing works for dialogue generation are data-driven models
trained directly on corpora crawled from websites. They mainly focus on
improving the model architecture to produce better responses but pay little
attention to the varying quality of the training data. In
this paper, we propose a multi-level contrastive learning paradigm to model the
fine-grained quality of the responses with respect to the query. A Rank-aware
Calibration (RC) network is designed to construct the multi-level contrastive
optimization objectives. Since these objectives are computed at the sentence
level, they may erroneously encourage the generation of uninformative words or
suppress informative ones. To tackle this side effect, on the one hand, we
design a fine-grained token-level strategy for estimating the instance loss
more accurately. On the other hand, we build a Knowledge Inference (KI)
component to capture the keyword knowledge from the reference during training
and exploit such information to encourage the generation of informative words.
We evaluate the proposed model on a carefully annotated dialogue dataset, and
the results suggest that our model generates more relevant and diverse
responses than the baseline models.
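The abstract does not spell out the training objective, so below is a minimal PyTorch-style sketch of one plausible reading of the multi-level contrastive idea: candidate responses for a query carry graded quality levels, and higher-level responses must outscore lower-level ones by a margin that grows with the level gap. The function name, scoring convention, and margin formulation are illustrative assumptions, not the authors' RC network.

```python
# Minimal sketch of a multi-level (rank-aware) contrastive objective.
# Assumptions: `scores[i]` is the model's score (e.g., length-normalized
# log-likelihood) of candidate response i given the query, and `levels[i]`
# is its annotated quality level (higher = better). Illustration only; this
# is not the paper's Rank-aware Calibration network.
import torch

def multi_level_contrastive_loss(scores: torch.Tensor,
                                 levels: torch.Tensor,
                                 margin: float = 0.5) -> torch.Tensor:
    # Pairwise gaps: entry [i, j] compares response i against response j.
    level_gap = (levels.unsqueeze(1) - levels.unsqueeze(0)).float()
    score_gap = scores.unsqueeze(1) - scores.unsqueeze(0)
    mask = (level_gap > 0).float()  # pairs where i should outrank j
    # Hinge: require score_gap >= margin * level_gap for every ranked pair,
    # so the enforced separation grows with the quality-level gap.
    loss = torch.relu(margin * level_gap - score_gap) * mask
    return loss.sum() / mask.sum().clamp(min=1.0)

# Three candidate responses for one query, quality levels 2 > 1 > 0.
scores = torch.tensor([-1.2, -1.4, -3.9], requires_grad=True)
levels = torch.tensor([2, 1, 0])
print(multi_level_contrastive_loss(scores, levels))  # ~0.1: top pair too close
```

The level-scaled margin is what distinguishes this multi-level setup from the usual binary positive/negative contrastive loss.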
Related papers
- Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment [0.23020018305241333]
This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts.
The study focuses on enhancing model performance through innovative training techniques and data augmentation strategies.
arXiv Detail & Related papers (2024-07-01T20:25:20Z)
- Topic Modeling as Multi-Objective Contrastive Optimization [46.24876966674759]
Recent representation learning approaches enhance neural topic models by optimizing the weighted linear combination of the evidence lower bound (ELBO) of the log-likelihood and the contrastive learning objective that contrasts pairs of input documents.
We introduce a novel contrastive learning method oriented towards sets of topic vectors to capture useful semantics that are shared among a set of input documents.
Our framework consistently produces higher-performing neural topic models in terms of topic coherence, topic diversity, and downstream performance.
arXiv Detail & Related papers (2024-02-12T11:18:32Z)
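For reference, the weighted linear combination that the entry above says earlier work optimizes can be written as follows; the trade-off weight lambda is our notation, not necessarily the paper's. The paper's contribution is to replace this fixed blend with a multi-objective treatment and a contrastive term defined over sets of topic vectors rather than pairs of documents.

```latex
% Baseline objective the entry refers to: a fixed linear blend of the ELBO
% and a document-pair contrastive term (\lambda is our notation).
\mathcal{L} = \mathcal{L}_{\mathrm{ELBO}} + \lambda \, \mathcal{L}_{\mathrm{contrastive}}
```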
- Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses [5.936682548344234]
This paper improves the quality of generated responses by learning the implicit pattern information between contexts and responses in the training samples.
We also design a response-aware mechanism for mining the implicit pattern information between contexts and responses so that the generated replies are more diverse and closer to human replies.
arXiv Detail & Related papers (2023-09-06T08:11:39Z)
- RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z)
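The Fusion-in-Context Learning mentioned above builds on encoder-decoder models that encode retrieved passages separately (Fusion-in-Decoder style), which leaves room to pack different in-context examples into different encoder inputs. The sketch below illustrates that packing; the function and prompt format are our assumptions, not RAVEN's actual API.

```python
# Illustrative sketch: with Fusion-in-Decoder-style encoding, each retrieved
# passage is encoded independently and the decoder attends over all of them,
# so distinct in-context examples can ride along with distinct passages.
def build_encoder_inputs(passages, examples, query):
    return [f"{example}\n\n{passage}\n\n{query}"
            for passage, example in zip(passages, examples)]

inputs = build_encoder_inputs(
    passages=["Paris is the capital of France.", "Tokyo is in Japan."],
    examples=["Q: capital of Italy? A: Rome", "Q: capital of Spain? A: Madrid"],
    query="Q: capital of Japan? A:")
for s in inputs:
    print(s, end="\n---\n")  # each string would be encoded separately
```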
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
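As a toy illustration of the knowledge-sampling idea in the entry above, the snippet below samples triples from a knowledge graph and verbalizes them into synthetic training sentences; the triples, template, and function name are invented for the example.

```python
import random

# Toy knowledge graph as (head, relation, tail) triples; invented examples.
TRIPLES = [
    ("rain", "causes", "wet ground"),
    ("a knife", "is used for", "cutting"),
    ("ice", "is made of", "frozen water"),
]

def sample_synthetic_statements(triples, k, seed=0):
    """Sample k triples and verbalize them into plain-text statements
    that can be added to a language model's adaptation corpus."""
    rng = random.Random(seed)
    return [f"{h} {r} {t}." for h, r, t in rng.sample(triples, k)]

print(sample_synthetic_statements(TRIPLES, k=2))
```

Varying `k` (the sample size) and the sampling strategy is exactly the axis the study investigates.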
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvements for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
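A minimal sketch of the multi-task training the entry above describes: the main response-selection loss is summed with the four auxiliary self-supervised losses under fixed weights. The numeric values and the uniform 0.3 weighting are placeholders, not the paper's settings.

```python
from typing import Dict

def joint_loss(task_losses: Dict[str, float],
               weights: Dict[str, float]) -> float:
    """Weighted sum of the main and auxiliary losses (multi-task training)."""
    return sum(weights[name] * loss for name, loss in task_losses.items())

# Placeholder per-batch loss values; in practice each is computed by the
# PLM-based model on the current batch.
batch_losses = {
    "response_selection": 0.92,          # main task
    "next_session_prediction": 0.41,     # auxiliary
    "utterance_restoration": 0.77,       # auxiliary
    "incoherence_detection": 0.35,       # auxiliary
    "consistency_discrimination": 0.58,  # auxiliary
}
weights = {name: (1.0 if name == "response_selection" else 0.3)
           for name in batch_losses}
print(joint_loss(batch_losses, weights))
```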
- Low-Resource Knowledge-Grounded Dialogue Generation [74.09352261943913]
We consider knowledge-grounded dialogue generation under the natural assumption that only limited training examples are available.
We devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model.
With only 1/8 of the training data, our model achieves state-of-the-art performance and generalizes well to out-of-domain knowledge.
arXiv Detail & Related papers (2020-02-24T16:20:32Z)
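One way to read the "disentangled response decoder" above is as a decoder whose parameters are split into general language/context parts and a small knowledge-dependent part, so only the latter needs the scarce knowledge-grounded examples. The module layout below is our guess for illustration, not the paper's architecture.

```python
import torch.nn as nn

class DisentangledDecoder(nn.Module):
    """Sketch: decoder parameters split so the knowledge-dependent part can
    be trained separately on scarce knowledge-grounded dialogues."""
    def __init__(self, d_model: int = 256):
        super().__init__()
        self.language_model = nn.GRU(d_model, d_model, batch_first=True)
        self.context_processor = nn.Linear(d_model, d_model)
        self.knowledge_processor = nn.Linear(d_model, d_model)

    def low_resource_parameters(self):
        # Only these parameters depend on knowledge-grounded data; the rest
        # can be estimated from plain dialogues and unlabeled text.
        return self.knowledge_processor.parameters()
```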
- Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation [59.174903564894954]
In this work, we extend generative data augmentation to the task of dialog state tracking for goal-oriented dialogs.
We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented dialogs.
Experiments on various dialog datasets show that our model improves the downstream dialog trackers' robustness via generative data augmentation.
arXiv Detail & Related papers (2020-01-23T15:34:56Z)
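The generative data augmentation mentioned above boils down to: fit a generative dialog model on the real annotated dialogs, sample synthetic dialogs from it, and train the tracker on the union. A schematic loop, with a stub generator standing in for a trained VHDA:

```python
import random

def augment_training_set(real_dialogs, generate, n_synthetic, seed=0):
    """Mix synthetic dialogs sampled from a trained generative model (the
    VHDA in the paper) into the dialog state tracker's training set."""
    rng = random.Random(seed)
    synthetic = [generate(rng) for _ in range(n_synthetic)]
    return real_dialogs + synthetic

# Stub generator standing in for sampling from a trained VHDA.
def stub_vhda_sample(rng):
    goal = rng.choice(["book_restaurant", "find_hotel"])
    return {"goal": goal, "turns": [("user", f"I want to {goal}.")]}

print(len(augment_training_set([{"goal": "demo", "turns": []}],
                               stub_vhda_sample, n_synthetic=3)))  # 4
```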