Improving Factual Consistency Between a Response and Persona Facts
- URL: http://arxiv.org/abs/2005.00036v2
- Date: Mon, 15 Feb 2021 04:21:13 GMT
- Title: Improving Factual Consistency Between a Response and Persona Facts
- Authors: Mohsen Mesgar, Edwin Simpson, Iryna Gurevych
- Abstract summary: Neural models for response generation produce responses that are semantically plausible but not necessarily factually consistent with facts describing the speaker's persona.
We propose to fine-tune these models by reinforcement learning and an efficient reward function that explicitly captures the consistency between a response and persona facts as well as semantic plausibility.
- Score: 64.30785349238619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural models for response generation produce responses that are semantically
plausible but not necessarily factually consistent with facts describing the
speaker's persona. These models are trained with fully supervised learning
where the objective function barely captures factual consistency. We propose to
fine-tune these models by reinforcement learning and an efficient reward
function that explicitly captures the consistency between a response and
persona facts as well as semantic plausibility. Our automatic and human
evaluations on the PersonaChat corpus confirm that our approach increases the
rate of responses that are factually consistent with persona facts over its
supervised counterpart while retaining the language quality of responses.
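To make the idea concrete, here is a minimal sketch of how such a mixed reward could drive policy-gradient fine-tuning. This is an illustration under my own assumptions, not the authors' implementation: `consistency_score` and `plausibility_score` are toy stand-ins for the paper's reward components, and `alpha` and all other names are hypothetical.

```python
# Minimal sketch (not the authors' code): RL fine-tuning with a reward that
# mixes persona-fact consistency and semantic plausibility.
import torch

def consistency_score(response: str, persona_facts: list[str]) -> float:
    """Toy proxy: token overlap between the response and the persona facts.
    The paper uses an explicit, more refined consistency measure."""
    resp_tokens = set(response.lower().split())
    fact_tokens = {t for fact in persona_facts for t in fact.lower().split()}
    return len(resp_tokens & fact_tokens) / max(len(fact_tokens), 1)

def plausibility_score(response: str) -> float:
    """Toy placeholder for an LM-based fluency score, e.g. a normalized
    inverse perplexity under a pretrained language model."""
    return min(len(response.split()) / 20.0, 1.0)  # placeholder only

def reward(response: str, persona_facts: list[str], alpha: float = 0.5) -> float:
    """Mix factual consistency with semantic plausibility (alpha is assumed)."""
    return alpha * consistency_score(response, persona_facts) \
        + (1.0 - alpha) * plausibility_score(response)

def reinforce_loss(log_probs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """REINFORCE with a mean baseline: log_probs are the summed token
    log-likelihoods of sampled responses; rewards are their scalar rewards."""
    baseline = rewards.mean().detach()
    return -((rewards - baseline) * log_probs).mean()
```

In a training loop, one would sample responses from the generator, score each with `reward`, and backpropagate `reinforce_loss` through the responses' summed log-probabilities.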
Related papers
- When Large Language Models contradict humans? Large Language Models' Sycophantic Behaviour [0.8133739801185272]
We show that Large Language Models (LLMs) show sycophantic tendencies when responding to queries involving subjective opinions and statements.
LLMs at various scales seem not to follow the users' hints, instead demonstrating confidence in delivering the correct answers.
arXiv Detail & Related papers (2023-11-15T22:18:33Z)
- Using Large Language Models to Provide Explanatory Feedback to Human Tutors [3.2507682694499582]
We present two approaches for supplying tutors with real-time feedback within an online lesson on how to give students effective praise.
This work-in-progress demonstrates considerable accuracy in binary classification of corrective feedback as effective (effort-based) or not.
More notably, we introduce progress towards an enhanced approach to providing explanatory feedback using large language model-facilitated named entity recognition.
arXiv Detail & Related papers (2023-06-27T14:19:12Z)
- FRSUM: Towards Faithful Abstractive Summarization via Enhancing Factual Robustness [56.263482420177915]
We study the faithfulness of existing systems from a new perspective of factual robustness.
We propose a novel training strategy, namely FRSUM, which teaches the model to defend against both explicit adversarial samples and implicit factual adversarial perturbations.
arXiv Detail & Related papers (2022-11-01T06:09:00Z)
- Analyzing and Evaluating Faithfulness in Dialogue Summarization [67.07947198421421]
We first perform a fine-grained human analysis of the faithfulness of dialogue summaries and observe that over 35% of generated summaries are factually inconsistent with respect to the source dialogues.
We present a new model-level faithfulness evaluation method. It examines generation models with multi-choice questions created by rule-based transformations.
arXiv Detail & Related papers (2022-10-21T07:22:43Z)
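As a rough illustration of the model-level evaluation idea above: rule-based transformations turn a reference summary into distractor choices, and the generation model passes a probe if it scores the faithful variant highest. The transformation rules below are simplified assumptions of my own, not the paper's actual rule set, and `score_fn` stands in for any summed log-likelihood under the model being evaluated.

```python
# Simplified sketch of multi-choice faithfulness probing (not the paper's rules).
from typing import Callable, List

def make_distractors(summary: str) -> List[str]:
    """Toy transformation rules: negate an auxiliary verb or swap a pronoun."""
    distractors = []
    if " is " in summary:
        distractors.append(summary.replace(" is ", " is not ", 1))
    if " will " in summary:
        distractors.append(summary.replace(" will ", " will not ", 1))
    if " he " in summary:
        distractors.append(summary.replace(" he ", " she ", 1))
    return distractors

def prefers_faithful(summary: str, score_fn: Callable[[str], float]) -> bool:
    """The model 'passes' a probe if it assigns the reference summary a
    higher score than every rule-generated distractor."""
    distractors = make_distractors(summary)
    if not distractors:
        return True  # no rule applies; nothing to probe
    ref_score = score_fn(summary)
    return all(ref_score > score_fn(d) for d in distractors)
```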
- Rome was built in 1776: A Case Study on Factual Correctness in Knowledge-Grounded Response Generation [18.63673852470077]
We present a human annotation setup to identify three different response types.
We automatically create a new corpus called Conv-FEVER that is adapted from the Wizard of Wikipedia dataset.
arXiv Detail & Related papers (2021-10-11T17:48:11Z)
- I Beg to Differ: A study of constructive disagreement in online conversations [15.581515781839656]
We construct a corpus of 7,425 Wikipedia Talk page conversations that contain content disputes.
We define the task of predicting whether disagreements will be escalated to mediation by a moderator.
We develop a variety of neural models and show that taking into account the structure of the conversation improves predictive accuracy.
arXiv Detail & Related papers (2021-01-26T16:36:43Z)
- Dialogue Response Ranking Training with Large-Scale Human Feedback Data [52.12342165926226]
We leverage social media feedback data to build a large-scale training dataset for feedback prediction.
We trained DialogRPT, a set of GPT-2-based models, on 133M pairs of human feedback data.
Our ranker outperforms the conventional dialog perplexity baseline by a large margin on predicting Reddit feedback.
arXiv Detail & Related papers (2020-09-15T10:50:05Z)
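A rough sketch of the training signal behind a feedback-based ranker like DialogRPT: given response pairs where one response received more human feedback, a scorer is trained with a pairwise ranking loss. This is a generic contrastive-ranking illustration under my own assumptions, not DialogRPT's released training code; the dummy scores stand in for a GPT-2-style model mapping (context, response) to a scalar.

```python
# Generic pairwise ranking sketch for feedback prediction (not DialogRPT's code).
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(score_pos: torch.Tensor,
                          score_neg: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy on the score margin, i.e. it maximizes
    P(pos ranked above neg) = sigmoid(score_pos - score_neg)."""
    return F.binary_cross_entropy_with_logits(
        score_pos - score_neg, torch.ones_like(score_pos))

# Dummy scores for a batch of eight feedback pairs:
pos = torch.randn(8, requires_grad=True)  # responses that drew more feedback
neg = torch.randn(8, requires_grad=True)  # paired responses with less feedback
loss = pairwise_ranking_loss(pos, neg)
loss.backward()  # in practice, gradients flow into the scorer's parameters
```

At inference time, candidate responses are simply sorted by the scalar score.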
- A Controllable Model of Grounded Response Generation [122.7121624884747]
Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process.
We propose a framework that we call controllable grounded response generation (CGRG).
We show that, using this framework, a transformer-based model with a novel inductive attention mechanism, trained on a conversation-like Reddit dataset, outperforms strong generation baselines.
arXiv Detail & Related papers (2020-05-01T21:22:08Z)
- You Impress Me: Dialogue Generation via Mutual Persona Perception [62.89449096369027]
Research in cognitive science suggests that understanding is an essential signal for a high-quality chit-chat conversation.
Motivated by this, we propose P2 Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding.
arXiv Detail & Related papers (2020-04-11T12:51:07Z)
- Posterior-GAN: Towards Informative and Coherent Response Generation with Posterior Generative Adversarial Network [38.576579498740244]
We propose a novel encoder-decoder based generative adversarial learning framework, Posterior Generative Adversarial Network (Posterior-GAN).
Experimental results demonstrate that our method effectively boosts the informativeness and coherence of generated responses in both automatic and human evaluations.
arXiv Detail & Related papers (2020-03-04T11:57:53Z)