Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in
Dialogue Generation
- URL: http://arxiv.org/abs/2109.05487v1
- Date: Sun, 12 Sep 2021 11:13:19 GMT
- Title: Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in
Dialogue Generation
- Authors: Leyang Cui, Yu Wu, Shujie Liu, Yue Zhang
- Abstract summary: We introduce two auxiliary training objectives: 1) Interpret Masked Word, which conjectures the meaning of the masked entity given the context; 2) Hypernym Generation, which predicts the hypernym of the entity based on the context.
Experimental results on two dialogue corpora verify the effectiveness of our methods under both knowledge-available and knowledge-unavailable settings.
- Score: 33.806361531386685
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although pre-trained models have achieved great success in dialogue
generation, their performance drops dramatically when the input contains an
entity that does not appear in the pre-training and fine-tuning datasets (an unseen
entity). To address this issue, existing methods leverage an external knowledge
base to generate appropriate responses. In real-world scenarios, however, the entity may
not be covered by the knowledge base, or the response may suffer from imprecise knowledge
retrieval. To deal with this problem, instead of introducing the knowledge base as
the input, we force the model to learn a better semantic representation by
predicting the information in the knowledge base based only on the input
context. Specifically, with the help of a knowledge base, we introduce two
auxiliary training objectives: 1) Interpret Masked Word, which conjectures the
meaning of the masked entity given the context; 2) Hypernym Generation, which
predicts the hypernym of the entity based on the context. Experimental results on
two dialogue corpora verify the effectiveness of our methods under both
knowledge-available and knowledge-unavailable settings.
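The abstract describes the training setup only at a high level, so the following is a minimal PyTorch-style sketch of how the two auxiliary objectives could be combined with the standard generation loss during fine-tuning. The function names, the loss weights (`lambda_imw`, `lambda_hg`), and the assumption that one shared decoder produces all three sets of logits are illustrative, not details taken from the paper.

```python
import torch.nn.functional as F

def sequence_ce(logits, targets, pad_id=0):
    """Token-level cross-entropy over a generated sequence, ignoring padding."""
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=pad_id,
    )

def knowledge_enhanced_loss(gen_logits, gen_targets,
                            imw_logits, imw_targets,
                            hg_logits, hg_targets,
                            lambda_imw=1.0, lambda_hg=1.0):
    """Combine the dialogue-generation loss with the two auxiliary objectives.

    - gen_*: logits/targets for the gold response (standard fine-tuning loss).
    - imw_*: Interpret Masked Word -- generate the knowledge-base description
      of the masked entity from the dialogue context alone.
    - hg_*:  Hypernym Generation -- generate the entity's hypernym from the
      context alone.
    The lambda weights are hypothetical placeholders.
    """
    loss_gen = sequence_ce(gen_logits, gen_targets)
    loss_imw = sequence_ce(imw_logits, imw_targets)
    loss_hg = sequence_ce(hg_logits, hg_targets)
    return loss_gen + lambda_imw * loss_imw + lambda_hg * loss_hg
```

Under this reading, the knowledge base is consulted only to build the auxiliary targets at training time, so no knowledge retrieval is required at inference.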
Related papers
- Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance (a minimal likelihood-scoring sketch appears after this list).
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
- Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding [9.2433070542025]
Large language models (LLMs) tend to inadequately integrate input context during text generation.
We introduce a novel approach integrating contrastive decoding with adversarial irrelevant passages as negative samples (a decoding sketch appears after this list).
arXiv Detail & Related papers (2024-05-04T20:38:41Z)
- Blending Reward Functions via Few Expert Demonstrations for Faithful and Accurate Knowledge-Grounded Dialogue Generation [22.38338205905379]
We leverage reinforcement learning algorithms to overcome the above challenges by introducing a novel reward function.
Our reward function combines an accuracy metric and a faithfulness metric to provide a balanced quality judgment of generated responses.
arXiv Detail & Related papers (2023-11-02T02:42:41Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Textual Explanations and Critiques in Recommendation Systems [8.406549970145846]
This dissertation focuses on two fundamental challenges in addressing this need.
The first involves explanation generation in a scalable and data-driven manner.
The second challenge consists in making explanations actionable, and we refer to it as critiquing.
arXiv Detail & Related papers (2022-05-15T11:59:23Z)
- Towards a Flexible Embedding Learning Framework [15.604564543883122]
We propose an embedding learning framework that is flexible in terms of the relationships that can be embedded into the learned representations.
A sampling mechanism is carefully designed to establish a direct connection between the input and the information captured by the output embeddings.
Our empirical results demonstrate that the proposed framework, in conjunction with a set of relevant entity-relation-matrices, outperforms the existing state-of-the-art approaches in various data mining tasks.
arXiv Detail & Related papers (2020-09-23T08:00:56Z)
- Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge [62.46091695615262]
We aim to extract commonsense knowledge to improve machine reading comprehension.
We propose to represent relations implicitly by situating structured knowledge in a context.
We employ a teacher-student paradigm to inject multiple types of contextualized knowledge into a student machine reader.
arXiv Detail & Related papers (2020-09-12T17:20:01Z)
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
- Low-Resource Knowledge-Grounded Dialogue Generation [74.09352261943913]
We consider knowledge-grounded dialogue generation under a natural assumption that only limited training examples are available.
We devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model.
With only 1/8 of the training data, our model achieves state-of-the-art performance and generalizes well to out-of-domain knowledge.
arXiv Detail & Related papers (2020-02-24T16:20:32Z)
- Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue [51.513276162736844]
We propose a sequential latent variable model as the first approach to this matter.
The model named sequential knowledge transformer (SKT) can keep track of the prior and posterior distribution over knowledge.
arXiv Detail & Related papers (2020-02-18T11:59:59Z)
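For the "Likelihood as a Performance Gauge for Retrieval-Augmented Generation" entry above, here is a minimal sketch of the underlying idea: score each candidate prompt by the likelihood the language model assigns to the question under that prompt, and keep the highest-scoring one. The averaging choice, the helper names, and the Hugging Face usage shown in the comments are assumptions for illustration, not the paper's exact procedure.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def question_log_likelihood(model, tokenizer, prompt, question):
    """Average log-likelihood of the question tokens conditioned on the prompt.
    Assumes the prompt's tokenization is a prefix of the full tokenization."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + question, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # position i predicts token i+1
    targets = full_ids[0, 1:]
    token_ll = log_probs[torch.arange(targets.size(0)), targets]
    q_start = prompt_ids.size(1)                            # index of the first question token
    return token_ll[q_start - 1:].mean().item()

def select_prompt(model, tokenizer, candidate_prompts, question):
    """Pick the candidate prompt under which the question is most likely."""
    return max(candidate_prompts,
               key=lambda p: question_log_likelihood(model, tokenizer, p, question))

# Example usage (model name is illustrative):
# tok = AutoTokenizer.from_pretrained("gpt2")
# lm = AutoModelForCausalLM.from_pretrained("gpt2")
# best = select_prompt(lm, tok, ["Passage A: ...", "Passage B: ..."], "Who wrote Dune?")
```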
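Similarly, for the contrastive-decoding entry, the sketch below shows one generic way such a scheme could contrast next-token logits computed with the relevant context against logits computed with an adversarial irrelevant passage. The logit-difference form and the `alpha` weight are assumptions; the entry does not spell out the exact formulation.

```python
import torch

def contrastive_next_token_logits(logits_with_context, logits_with_irrelevant, alpha=1.0):
    """Boost tokens supported by the relevant context and penalize tokens that are
    equally likely under an irrelevant passage (a generic contrastive-decoding form;
    the cited paper's exact formulation may differ)."""
    log_p_ctx = torch.log_softmax(logits_with_context, dim=-1)
    log_p_irr = torch.log_softmax(logits_with_irrelevant, dim=-1)
    return log_p_ctx - alpha * log_p_irr

# Greedy decoding step (illustrative):
# next_id = contrastive_next_token_logits(lm(ctx_ids).logits[0, -1],
#                                          lm(irr_ids).logits[0, -1]).argmax()
```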