Related papers: CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models

CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models

URL: http://arxiv.org/abs/2205.10558v3
Date: Sat, 20 May 2023 13:50:54 GMT
Title: CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models
Authors: Bishal Santra, Ravi Ghadia, Manish Gupta and Pawan Goyal
Abstract summary: CORAL is a novel loss function based on a reinforcement learning view of the dialog generation task. It estimates human preference for generated responses while considering both the context and the response. To overcome challenges such as high sample complexity of RL training and a large action space, we propose a mix-policy training algorithm.
Score: 12.654742638172307
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: In the field of Natural Language Processing, there are many tasks that can be tackled effectively using the cross-entropy (CE) loss function. However, the task of dialog generation poses unique challenges for CE loss. This is because CE loss assumes that, for any given input, the only possible output is the one available as the ground truth in the training dataset. But, in dialog generation, there can be multiple valid responses (for a given context) that not only have different surface forms but can also be semantically different. Furthermore, CE loss computation for the dialog generation task does not take the input context into consideration and, hence, it grades the response irrespective of the context. To grade the generated response for qualities like relevance, engagingness, etc., the loss function should depend on both the context and the generated response. To address these limitations, this paper proposes CORAL, a novel loss function based on a reinforcement learning (RL) view of the dialog generation task with a reward function that estimates human preference for generated responses while considering both the context and the response. Furthermore, to overcome challenges such as high sample complexity of RL training and a large action space, we propose a mix-policy training algorithm. Notably, using CORAL we can train dialog generation models without assuming the ground-truth as the only correct response. Extensive comparisons on benchmark datasets demonstrate that CORAL based models outperform strong state-of-the-art baseline models of different sizes.

Related papers

Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation. The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales. We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
arXiv Detail & Related papers (2024-02-27T05:37:10Z)
Making Retrieval-Augmented Language Models Robust to Irrelevant Context [55.564789967211844]
An important desideratum of RALMs, is that retrieved information helps model performance when it is relevant. Recent work has shown that retrieval augmentation can sometimes have a negative effect on performance.
arXiv Detail & Related papers (2023-10-02T18:52:35Z)
A Systematic Evaluation of Response Selection for Open Domain Dialogue [36.88551817451512]
We curated a dataset where responses from multiple response generators produced for the same dialog context are manually annotated as appropriate (positive) and inappropriate (negative) We conduct a systematic evaluation of state-of-the-art methods for response selection, and demonstrate that both strategies of using multiple positive candidates and using manually verified hard negative candidates can bring in significant performance improvement in comparison to using the adversarial training data, e.g., increase of 3% and 13% in Recall@1 score, respectively.
arXiv Detail & Related papers (2022-08-08T19:33:30Z)
CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning [85.3987745097806]
offline reinforcement learning can be used to train dialogue agents entirely using static datasets collected from human speakers. Experiments show that recently developed offline RL methods can be combined with language models to yield realistic dialogue agents.
arXiv Detail & Related papers (2022-04-18T17:43:21Z)
Automatically Generating Counterfactuals for Relation Exaction [18.740447044960796]
relation extraction (RE) is a fundamental task in natural language processing. Current deep neural models have achieved high accuracy but are easily affected by spurious correlations. We develop a novel approach to derive contextual counterfactuals for entities.
arXiv Detail & Related papers (2022-02-22T04:46:10Z)
Generating Dialogue Responses from a Semantic Latent Space [75.18449428414736]
We propose an alternative to the end-to-end classification on vocabulary. We learn the pair relationship between the prompts and responses as a regression task on a latent space. Human evaluation showed that learning the task on a continuous space can generate responses that are both relevant and informative.
arXiv Detail & Related papers (2020-10-04T19:06:16Z)
Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination. We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner. Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection [55.390442067381755]
We show that grayscale data can be automatically constructed without human effort. Our method employs off-the-shelf response retrieval models and response generation models as automatic grayscale data generators. Experiments on three benchmark datasets and four state-of-the-art matching models show that the proposed approach brings significant and consistent performance improvements.
arXiv Detail & Related papers (2020-04-06T06:34:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.