CORAL: Contextual Response Retrievability Loss Function for Training
Dialog Generation Models
- URL: http://arxiv.org/abs/2205.10558v3
- Date: Sat, 20 May 2023 13:50:54 GMT
- Authors: Bishal Santra, Ravi Ghadia, Manish Gupta and Pawan Goyal
- Abstract summary: CORAL is a novel loss function based on a reinforcement learning view of the dialog generation task.
It estimates human preference for generated responses while considering both the context and the response.
To overcome challenges such as high sample complexity of RL training and a large action space, we propose a mix-policy training algorithm.
- Score: 12.654742638172307
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the field of Natural Language Processing, there are many tasks that can be
tackled effectively using the cross-entropy (CE) loss function. However, the
task of dialog generation poses unique challenges for CE loss. This is because
CE loss assumes that, for any given input, the only possible output is the one
available as the ground truth in the training dataset. But, in dialog
generation, there can be multiple valid responses (for a given context) that
not only have different surface forms but can also be semantically different.
Furthermore, CE loss computation for the dialog generation task does not take
the input context into consideration and, hence, it grades the response
irrespective of the context. To grade the generated response for qualities like
relevance, engagingness, etc., the loss function should depend on both the
context and the generated response. To address these limitations, this paper
proposes CORAL, a novel loss function based on a reinforcement learning (RL)
view of the dialog generation task with a reward function that estimates human
preference for generated responses while considering both the context and the
response. Furthermore, to overcome challenges such as high sample complexity of
RL training and a large action space, we propose a mix-policy training
algorithm. Notably, using CORAL we can train dialog generation models without
assuming the ground-truth as the only correct response. Extensive comparisons
on benchmark datasets demonstrate that CORAL-based models outperform strong
state-of-the-art baseline models of different sizes.
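As a rough illustration of the contrast the abstract draws, the Python sketch below compares a CE loss, which scores only the ground-truth response, with a REINFORCE-style reward-weighted loss whose reward depends on both the context and the response. It is not the paper's implementation. The function `toy_reward` is a hypothetical word-overlap stand-in for a learned human-preference estimator, the token probabilities are invented, and `mixed_policy_pick` is only one assumed reading of what "mix-policy training" could mean (alternating between dataset responses and model samples).

```python
import math
import random


def sequence_log_prob(token_probs):
    """Sum of the log-probabilities the model assigns to each response token."""
    return sum(math.log(p) for p in token_probs)


def cross_entropy_loss(gt_token_probs):
    """CE loss: negative log-likelihood of the single ground-truth response."""
    return -sequence_log_prob(gt_token_probs)


def reward_weighted_loss(token_probs, reward, baseline=0.0):
    """REINFORCE-style surrogate: -(R(context, response) - b) * log p(response | context)."""
    return -(reward - baseline) * sequence_log_prob(token_probs)


def toy_reward(context, response):
    """Hypothetical reward (assumption): Jaccard word overlap of context and response, in [0, 1]."""
    c = set(context.lower().split())
    r = set(response.lower().split())
    union = c | r
    return len(c & r) / len(union) if union else 0.0


def mixed_policy_pick(gt_response, sampled_response, p_on_policy, rng):
    """Assumed mixing rule: use the model sample with probability p_on_policy, else the dataset response."""
    return sampled_response if rng.random() < p_on_policy else gt_response


if __name__ == "__main__":
    rng = random.Random(0)
    context = "do you like jazz music"
    gt_response = "yes i love jazz"
    sampled_response = "jazz music is great"
    # Invented per-token probabilities under the current model (assumption).
    probs = {gt_response: [0.5, 0.4, 0.6, 0.7], sampled_response: [0.6, 0.5, 0.8, 0.3]}

    print("CE loss on ground truth:", cross_entropy_loss(probs[gt_response]))
    chosen = mixed_policy_pick(gt_response, sampled_response, 0.5, rng)
    reward = toy_reward(context, chosen)
    print("chosen response:", chosen)
    print("reward-weighted loss:", reward_weighted_loss(probs[chosen], reward))
```

In this sketch the CE loss never looks at the context, while the reward-weighted loss scales the log-likelihood of whichever response is chosen by a score computed from the context-response pair, so a sampled response that differs from the ground truth can still receive credit.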
Related papers
- Reasoning in Conversation: Solving Subjective Tasks through Dialogue
Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation.
The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales.
We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
arXiv Detail & Related papers (2024-02-27T05:37:10Z)
- Making Retrieval-Augmented Language Models Robust to Irrelevant Context [55.564789967211844]
An important desideratum of RALMs is that retrieved information helps model performance when it is relevant.
Recent work has shown that retrieval augmentation can sometimes have a negative effect on performance.
arXiv Detail & Related papers (2023-10-02T18:52:35Z)
- A Systematic Evaluation of Response Selection for Open Domain Dialogue [36.88551817451512]
We curated a dataset where responses from multiple response generators produced for the same dialog context are manually annotated as appropriate (positive) and inappropriate (negative).
We conduct a systematic evaluation of state-of-the-art methods for response selection and demonstrate that two strategies, using multiple positive candidates and using manually verified hard negative candidates, both bring significant performance improvements over using the adversarial training data, e.g., increases of 3% and 13% in Recall@1 score, respectively.
arXiv Detail & Related papers (2022-08-08T19:33:30Z)
- CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement
Learning [85.3987745097806]
Offline reinforcement learning can be used to train dialogue agents entirely using static datasets collected from human speakers.
Experiments show that recently developed offline RL methods can be combined with language models to yield realistic dialogue agents.
arXiv Detail & Related papers (2022-04-18T17:43:21Z)
- Automatically Generating Counterfactuals for Relation Exaction [18.740447044960796]
Relation extraction (RE) is a fundamental task in natural language processing.
Current deep neural models have achieved high accuracy but are easily affected by spurious correlations.
We develop a novel approach to derive contextual counterfactuals for entities.
arXiv Detail & Related papers (2022-02-22T04:46:10Z)
- Generating Dialogue Responses from a Semantic Latent Space [75.18449428414736]
We propose an alternative to the end-to-end classification on vocabulary.
We learn the pair relationship between the prompts and responses as a regression task on a latent space.
Human evaluation showed that learning the task on a continuous space can generate responses that are both relevant and informative.
arXiv Detail & Related papers (2020-10-04T19:06:16Z)
- Learning an Effective Context-Response Matching Model with
Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
- The World is Not Binary: Learning to Rank with Grayscale Data for
Dialogue Response Selection [55.390442067381755]
We show that grayscale data can be automatically constructed without human effort.
Our method employs off-the-shelf response retrieval models and response generation models as automatic grayscale data generators.
Experiments on three benchmark datasets and four state-of-the-art matching models show that the proposed approach brings significant and consistent performance improvements.
arXiv Detail & Related papers (2020-04-06T06:34:54Z)