Related papers: Improving Factuality for Dialogue Response Generation via Graph-Based Knowledge Augmentation

URL: http://arxiv.org/abs/2506.12496v2
Date: Thu, 07 Aug 2025 16:23:33 GMT
Title: Improving Factuality for Dialogue Response Generation via Graph-Based Knowledge Augmentation
Authors: Xiangyan Chen, Yujian Gan, Yimeng Gu, Matthew Purver
Abstract summary: Large Language Models (LLMs) generate plausible but inconsistent or factually incorrect text. We propose two novel graph knowledge-augmented frameworks, Dialogue Response Generation via Textualised Graphs (TG-DRG) and Graph-Aware Dialogue Response Generation (GA-DRG). TG-DRG combines reasoning-guided dialogue reformulation, dialogue sense knowledge selection, and graph-enhanced response generation to improve the factuality of dialogue responses.
Score: 8.423723358002539
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large Language Models (LLMs) succeed in many natural language processing tasks. However, their tendency to hallucinate - generate plausible but inconsistent or factually incorrect text - can cause significant problems in certain tasks, including response generation in dialogue. To mitigate this issue, we propose two novel graph knowledge-augmented frameworks, Dialogue Response Generation via Textualised Graphs (TG-DRG) and Graph-Aware Dialogue Response Generation (GA-DRG), which combine reasoning-guided dialogue reformulation, dialogue sense knowledge selection, and graph-enhanced response generation to improve the factuality of dialogue responses. To evaluate the factuality of generated responses, we propose a dialogue fact score that addresses the limitations of existing fact-score methods in dialogue settings, providing a more reliable assessment of factual consistency. We evaluate our methods using different baselines on the OpendialKG and HybriDialogue datasets. Our methods noticeably improve factuality compared to other graph knowledge-augmentation baselines, including the state-of-the-art G-retriever, achieving improvements of 3.47% on OpendialKG and 3.12% on HybriDialogue in terms of dialogue fact score. The code will be released on GitHub.

Related papers

FineDialFact: A benchmark for Fine-grained Dialogue Fact Verification [45.2458418225596]
Large Language Models (LLMs) are known to produce hallucinations - factually incorrect or fabricated information. Current approaches to hallucination detection in dialogue systems primarily focus on verifying the factual consistency of generated responses. We introduce a benchmark, FineDialFact, for fine-grained dialogue fact verification.
arXiv Detail & Related papers (2025-08-07T18:51:03Z)
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities. We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework. We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z)
Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation [58.65698688443091]
We propose SUbgraph Retrieval-augmented GEneration (SURGE), a framework for generating context-relevant and knowledge-grounded dialogues with Knowledge Graphs (KGs). Our framework first retrieves the relevant subgraph from the KG, and then enforces consistency across facts by perturbing their word embeddings conditioned by the retrieved subgraph. We validate our SURGE framework on OpendialKG and KOMODIS datasets, showing that it generates high-quality dialogues that faithfully reflect the knowledge from KG.
arXiv Detail & Related papers (2023-05-30T08:36:45Z)
PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model [79.64376762489164]
PK-Chat is a Pointer network guided generative dialogue model, incorporating a unified pretrained language model and a pointer network over knowledge graphs. The words generated by PK-Chat in the dialogue are derived from the prediction of word lists and the direct prediction of the external knowledge graph knowledge. Based on the PK-Chat, a dialogue system is built for academic scenarios in the case of geosciences.
arXiv Detail & Related papers (2023-04-02T18:23:13Z)
Graph Based Network with Contextualized Representations of Turns in Dialogue [0.0]
Dialogue-based relation extraction (RE) aims to extract relation(s) between two arguments that appear in a dialogue. We propose the TUrn COntext awaRE Graph Convolutional Network (TUCORE-GCN) modeled by paying attention to the way people understand dialogues.
arXiv Detail & Related papers (2021-09-09T03:09:08Z)
Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension [49.92173751203827]
In multi-turn dialog, utterances do not always take the full form of sentences. We propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question.
arXiv Detail & Related papers (2020-12-14T10:58:01Z)
DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances [18.199473005335093]
This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models. To efficiently capture the discourse-level coherence among utterances, we propose two training objectives, including masked utterance regression. Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines.
arXiv Detail & Related papers (2020-12-03T09:06:23Z)
GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems [133.13117064357425]
We propose a new evaluation metric GRADE, which stands for Graph-enhanced Representations for Automatic Dialogue Evaluation. Specifically, GRADE incorporates both coarse-grained utterance-level contextualized representations and fine-grained topic-level graph representations to evaluate dialogue coherence. Experimental results show that our GRADE significantly outperforms other state-of-the-art metrics on measuring diverse dialogue models.
arXiv Detail & Related papers (2020-10-08T14:07:32Z)
GraphDialog: Integrating Graph Knowledge into End-to-End Task-Oriented Dialogue Systems [9.560436630775762]
End-to-end task-oriented dialogue systems aim to generate system responses directly from plain text inputs. Such systems face two challenges: one is how to effectively incorporate external knowledge bases (KBs) into the learning framework; the other is how to accurately capture the semantics of dialogue history. We address these two challenges by exploiting the graph structural information in the knowledge base and in the dependency parsing tree of the dialogue.
arXiv Detail & Related papers (2020-10-04T00:04:40Z)
Ranking Enhanced Dialogue Generation [77.8321855074999]
How to effectively utilize the dialogue history is a crucial problem in multi-turn dialogue generation. Previous works usually employ various neural network architectures to model the history. This paper proposes a Ranking Enhanced Dialogue generation framework.
arXiv Detail & Related papers (2020-08-13T01:49:56Z)
Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually with reasoning over dialogue turns with the help of the back-end data. Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.