Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in
Dialogue Generation
- URL: http://arxiv.org/abs/2109.05487v1
- Date: Sun, 12 Sep 2021 11:13:19 GMT
- Title: Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in
Dialogue Generation
- Authors: Leyang Cui, Yu Wu, Shujie Liu, Yue Zhang
- Abstract summary: We introduce two auxiliary training objectives: 1) Interpret Masked Word, which conjectures the meaning of the masked entity given the context; 2) Hypernym Generation, which predicts the hypernym of the entity based on the context.
Experimental results on two dialogue corpora verify the effectiveness of our methods in both knowledge-available and knowledge-unavailable settings.
- Score: 33.806361531386685
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although pre-training models have achieved great success in dialogue
generation, their performance drops dramatically when the input contains an
entity that does not appear in the pre-training and fine-tuning datasets (an unseen
entity). To address this issue, existing methods leverage an external knowledge
base to generate appropriate responses. In real-world scenarios, however, the
entity may not be covered by the knowledge base, or knowledge retrieval may be
imprecise. To deal with this problem, instead of introducing a knowledge base as
input, we force the model to learn a better semantic representation by predicting
the information in the knowledge base based only on the input context.
Specifically, with the help of a knowledge base, we introduce two
auxiliary training objectives: 1) Interpret Masked Word, which conjectures the
meaning of the masked entity given the context; 2) Hypernym Generation, which
predicts the hypernym of the entity based on the context. Experimental results on
two dialogue corpora verify the effectiveness of our methods in both
knowledge-available and knowledge-unavailable settings.
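A minimal sketch of how the two auxiliary objectives described above could be combined with the standard response-generation loss during fine-tuning, assuming a BART-style encoder-decoder from Hugging Face transformers; the example context, knowledge-base definition, hypernym, and loss weights are illustrative placeholders rather than values from the paper:
```python
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Hypothetical training instance: the entity is masked in the dialogue context,
# and the definition/hypernym are looked up from a knowledge base at training time.
context = "Have you seen Inception? <mask> directed it."
definition = "Christopher Nolan is a British-American film director."  # KB gloss
hypernym = "film director"                                             # KB hypernym
response = "Yes, and I loved the ending."                              # gold response

def seq2seq_loss(src: str, tgt: str) -> torch.Tensor:
    """Cross-entropy loss for generating `tgt` conditioned on `src`."""
    enc = tokenizer(src, return_tensors="pt")
    labels = tokenizer(tgt, return_tensors="pt").input_ids
    return model(**enc, labels=labels).loss

# 1) Main objective: generate the dialogue response from the context.
loss_resp = seq2seq_loss(context, response)
# 2) Interpret Masked Word: generate the KB description of the masked entity.
loss_imw = seq2seq_loss(context, definition)
# 3) Hypernym Generation: generate the entity's hypernym from the context alone.
loss_hg = seq2seq_loss(context, hypernym)

# Placeholder weighting of the auxiliary losses (not specified here).
loss = loss_resp + 0.5 * (loss_imw + loss_hg)
loss.backward()
```
Note that only the dialogue context is fed to the model; the knowledge base supplies training targets, so no retrieval is needed at inference time.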
Related papers
- Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding [9.2433070542025]
Large language models (LLMs) tend to inadequately integrate input context during text generation.
We introduce a novel approach integrating contrastive decoding with adversarial irrelevant passages as negative samples.
arXiv Detail & Related papers (2024-05-04T20:38:41Z) - Blending Reward Functions via Few Expert Demonstrations for Faithful and
Accurate Knowledge-Grounded Dialogue Generation [22.38338205905379]
We leverage reinforcement learning algorithms to overcome the above challenges by introducing a novel reward function.
Our reward function combines an accuracy metric and a faithfulness metric to provide a balanced quality judgment of generated responses.
arXiv Detail & Related papers (2023-11-02T02:42:41Z) - DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain
Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z) - KPT: Keyword-guided Pre-training for Grounded Dialog Generation [82.68787152707455]
We propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation.
Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords.
We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages.
arXiv Detail & Related papers (2022-12-04T04:05:01Z) - Textual Explanations and Critiques in Recommendation Systems [8.406549970145846]
This dissertation focuses on two fundamental challenges in addressing this need.
The first involves explanation generation in a scalable and data-driven manner.
The second challenge is making explanations actionable, which we refer to as critiquing.
arXiv Detail & Related papers (2022-05-15T11:59:23Z) - Towards a Flexible Embedding Learning Framework [15.604564543883122]
We propose an embedding learning framework that is flexible in terms of the relationships that can be embedded into the learned representations.
A sampling mechanism is carefully designed to establish a direct connection between the input and the information captured by the output embeddings.
Our empirical results demonstrate that the proposed framework, in conjunction with a set of relevant entity-relation-matrices, outperforms the existing state-of-the-art approaches in various data mining tasks.
arXiv Detail & Related papers (2020-09-23T08:00:56Z) - Improving Machine Reading Comprehension with Contextualized Commonsense
Knowledge [62.46091695615262]
We aim to extract commonsense knowledge to improve machine reading comprehension.
We propose to represent relations implicitly by situating structured knowledge in a context.
We employ a teacher-student paradigm to inject multiple types of contextualized knowledge into a student machine reader.
arXiv Detail & Related papers (2020-09-12T17:20:01Z) - Exploiting Structured Knowledge in Text via Graph-Guided Representation
Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z) - Low-Resource Knowledge-Grounded Dialogue Generation [74.09352261943913]
We consider knowledge-grounded dialogue generation under a natural assumption that only limited training examples are available.
We devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model.
With only 1/8 of the training data, our model achieves state-of-the-art performance and generalizes well to out-of-domain knowledge.
arXiv Detail & Related papers (2020-02-24T16:20:32Z) - Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue [51.513276162736844]
We propose a sequential latent variable model as the first approach to this problem.
The model named sequential knowledge transformer (SKT) can keep track of the prior and posterior distribution over knowledge.
arXiv Detail & Related papers (2020-02-18T11:59:59Z)