Bottom-Up Synthesis of Knowledge-Grounded Task-Oriented Dialogues with Iteratively Self-Refined Prompts
- URL: http://arxiv.org/abs/2504.14375v1
- Date: Sat, 19 Apr 2025 18:25:53 GMT
- Title: Bottom-Up Synthesis of Knowledge-Grounded Task-Oriented Dialogues with Iteratively Self-Refined Prompts
- Authors: Kun Qian, Maximillian Chen, Siyan Li, Arpit Sharma, Zhou Yu
- Abstract summary: We introduce a bottom-up conversation synthesis approach, where QA pairs are generated first and then combined into a coherent dialogue. This structure allows the use of non-local models in stages that do not involve proprietary knowledge. Both human and automated evaluations demonstrate that our approach produces more realistic and higher-quality dialogues.
- Score: 19.73376945990922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training conversational question-answering (QA) systems requires a substantial amount of in-domain data, which is often scarce in practice. A common solution to this challenge is to generate synthetic data. Traditional methods typically follow a top-down approach, where a large language model (LLM) generates multi-turn dialogues from a broad prompt. Although this method produces coherent conversations, it offers limited fine-grained control over the content and is susceptible to hallucinations. We introduce a bottom-up conversation synthesis approach, where QA pairs are generated first and then combined into a coherent dialogue. This method offers greater control and precision by dividing the process into two distinct steps, allowing refined instructions and validations to be handled separately. Additionally, this structure allows the use of non-local models in stages that do not involve proprietary knowledge, enhancing the overall quality of the generated data. Both human and automated evaluations demonstrate that our approach produces more realistic and higher-quality dialogues compared to top-down methods.
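The two-stage pipeline described in the abstract can be pictured with a short sketch. Everything below is illustrative rather than the paper's actual implementation: `local_llm` and `remote_llm` are hypothetical callables standing in for a locally hosted model (used on proprietary documents) and a stronger external model (used only on the already-extracted, non-proprietary QA pairs), and the prompts are simple placeholders for the paper's iteratively self-refined ones.

```python
# Minimal sketch of bottom-up dialogue synthesis: generate QA pairs, validate
# them against the source document, then weave them into one dialogue.
from typing import Callable, List, Tuple

QAPair = Tuple[str, str]  # (question, answer)


def generate_qa_pairs(document: str, local_llm: Callable[[str], str], n: int = 5) -> List[QAPair]:
    """Step 1: extract grounded QA pairs from an in-domain document (local model only)."""
    pairs = []
    for i in range(n):
        prompt = (
            f"Document:\n{document}\n\n"
            f"Write question {i + 1} that this document answers, then the answer, "
            "separated by a line containing only '---'."
        )
        raw = local_llm(prompt)
        if "---" in raw:
            q, a = raw.split("---", 1)
            pairs.append((q.strip(), a.strip()))
    return pairs


def validate_pair(pair: QAPair, document: str, local_llm: Callable[[str], str]) -> bool:
    """Check each pair against the source document to filter out hallucinated answers."""
    q, a = pair
    verdict = local_llm(
        f"Document:\n{document}\n\nQuestion: {q}\nAnswer: {a}\n"
        "Is the answer fully supported by the document? Reply yes or no."
    )
    return verdict.strip().lower().startswith("yes")


def weave_dialogue(pairs: List[QAPair], remote_llm: Callable[[str], str]) -> str:
    """Step 2: combine validated QA pairs into a coherent multi-turn dialogue.

    No proprietary document is passed here, so a non-local model can be used.
    """
    listed = "\n".join(f"{i + 1}. Q: {q} A: {a}" for i, (q, a) in enumerate(pairs))
    return remote_llm(
        "Rewrite the following QA pairs as a natural multi-turn conversation between "
        "a user and an assistant, preserving the content of every answer:\n" + listed
    )


def synthesize(document: str, local_llm, remote_llm) -> str:
    pairs = [p for p in generate_qa_pairs(document, local_llm) if validate_pair(p, document, local_llm)]
    return weave_dialogue(pairs, remote_llm)
```

Splitting generation and composition this way is what gives the fine-grained control the abstract claims: each QA pair can be prompted and validated in isolation before any dialogue-level coherence is imposed.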
Related papers
- Generative Prompt Internalization [48.91617280112579]
We propose Generative Prompt Internalization (GenPI), a lightweight method that employs a joint training approach. GenPI not only replicates the behavior of models with prompt inputs but also generates the content of the prompt. We demonstrate that our approach effectively internalizes complex prompts across various agent-based application scenarios.
arXiv Detail & Related papers (2024-11-24T17:32:20Z)
- Unsupervised Extraction of Dialogue Policies from Conversations [3.102576158218633]
We show how Large Language Models can be instrumental in extracting dialogue policies from datasets.
We then propose a novel method for generating dialogue policies utilizing a controllable and interpretable graph-based methodology.
arXiv Detail & Related papers (2024-06-21T14:57:25Z)
- PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities.
We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework.
We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z)
- Attribute Controlled Dialogue Prompting [31.09791656949115]
We present a novel, instance-specific prompt-tuning algorithm for dialogue generation.
Our method is superior to prompting baselines and comparable to fine-tuning with only 5%-6% of total parameters.
arXiv Detail & Related papers (2023-07-11T12:48:55Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- GRASP: Guiding model with RelAtional Semantics using Prompt [3.1275060062551208]
We propose a Guiding model with RelAtional Semantics using Prompt (GRASP).
We adopt a prompt-based fine-tuning approach and capture relational semantic clues of a given dialogue with an argument-aware prompt marker strategy.
In experiments, GRASP achieves state-of-the-art performance in terms of both F1 and F1c scores on the DialogRE dataset.
arXiv Detail & Related papers (2022-08-26T08:19:28Z)
- Achieving Conversational Goals with Unsupervised Post-hoc Knowledge Injection [37.15893335147598]
A limitation of current neural dialog models is that they tend to suffer from a lack of specificity and informativeness in generated responses.
We propose a post-hoc knowledge-injection technique where we first retrieve a diverse set of relevant knowledge snippets conditioned on both the dialog history and an initial response from an existing dialog model.
We construct multiple candidate responses, individually injecting each retrieved snippet into the initial response using a gradient-based decoding method, and then select the final response with an unsupervised ranking step.
arXiv Detail & Related papers (2022-03-22T00:42:27Z)
- Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task.
Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts.
Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
arXiv Detail & Related papers (2021-11-04T05:40:13Z)
- Smoothing Dialogue States for Open Conversational Machine Reading [70.83783364292438]
We propose an effective gating strategy by smoothing the two dialogue states in only one decoder and bridge decision making and question generation.
Experiments on the OR-ShARC dataset show the effectiveness of our method, which achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-08-28T08:04:28Z)
- Improving Response Quality with Backward Reasoning in Open-domain Dialogue Systems [53.160025961101354]
We propose to train the generation model in a bidirectional manner by adding a backward reasoning step to the vanilla encoder-decoder training.
The proposed backward reasoning step pushes the model to produce more informative and coherent content.
Our method can improve response quality without introducing side information.
arXiv Detail & Related papers (2021-04-30T20:38:27Z)
- Plug-and-Play Conversational Models [62.77150879036442]
We introduce an approach that requires no further computation at decoding time and no fine-tuning of a large language model.
We demonstrate, through extensive automatic and human evaluation, a high degree of control over the generated conversational responses with regard to multiple desired attributes.
arXiv Detail & Related papers (2020-10-09T03:17:51Z)
- Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data [61.71319905364992]
We propose a novel data augmentation method for training open-domain dialogue models by utilizing unpaired data.
A data-level distillation process is first proposed to construct augmented dialogues where both post and response are retrieved from the unpaired data.
A ranking module is employed to filter out low-quality dialogues.
A model-level distillation process is employed to distill a teacher model trained on high-quality paired data to augmented dialogue pairs.
arXiv Detail & Related papers (2020-09-20T13:06:38Z)