RefGPT: Dialogue Generation of GPT, by GPT, and for GPT
- URL: http://arxiv.org/abs/2305.14994v3
- Date: Thu, 19 Oct 2023 00:45:51 GMT
- Title: RefGPT: Dialogue Generation of GPT, by GPT, and for GPT
- Authors: Dongjie Yang, Ruifeng Yuan, Yuantao Fan, Yifei Yang, Zili Wang, Shusen
Wang, Hai Zhao
- Abstract summary: Large Language Models (LLMs) have attained the impressive capability to resolve a wide range of NLP tasks by fine-tuning on high-quality instruction data.
However, collecting high-quality human-written data, especially multi-turn dialogues, is expensive and unattainable for most people.
We propose a method called RefGPT to generate enormous numbers of truthful and customized dialogues without worrying about factual errors caused by model hallucination.
- Score: 61.451780081612974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have attained the impressive capability to
resolve a wide range of NLP tasks by fine-tuning on high-quality instruction data.
However, collecting high-quality human-written data, especially multi-turn
dialogues, is expensive and unattainable for most people. Though previous
studies have used powerful LLMs to generate dialogues automatically, they
all suffer from generating untruthful dialogues because of model
hallucination. Therefore, we propose a method called RefGPT to generate
enormous numbers of truthful and customized dialogues without worrying about
factual errors caused by model hallucination. RefGPT solves model hallucination
in dialogue generation by restricting the LLMs to leveraging a given reference
instead of reciting their own knowledge to generate dialogues. Additionally,
RefGPT adds detailed controls on every utterance to enable high customization
capability, which previous studies have ignored. On the basis of RefGPT, we
also propose two high-quality dialogue datasets generated by GPT-4, namely
RefGPT-Fact and RefGPT-Code. RefGPT-Fact is a dataset with 100k multi-turn
dialogues based on factual knowledge, and RefGPT-Code has 76k multi-turn
dialogues covering a wide range of coding scenarios. Our code and datasets are
released at https://github.com/mutonix/RefGPT.
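The core idea above, restricting the model to a supplied reference instead of its parametric knowledge, plus per-utterance controls, can be illustrated with a minimal sketch. The prompt wording, the `controls` fields, and the `call_llm` helper below are hypothetical illustrations, not RefGPT's actual prompts or API:

```python
def build_prompt(reference: str, controls: dict, history: list) -> str:
    """Assemble a prompt that restricts the model to the given reference.

    The instruction wording and control fields are hypothetical, meant only
    to illustrate reference-grounded, per-utterance-controlled generation.
    """
    lines = [
        "Answer ONLY using facts from the reference below.",
        "If the reference does not contain the answer, say so; do not guess.",
        f"Reference:\n{reference}",
        f"Style controls: length={controls.get('length', 'medium')}, "
        f"tone={controls.get('tone', 'neutral')}",
    ]
    for speaker, utterance in history:
        lines.append(f"{speaker}: {utterance}")
    lines.append("Assistant:")
    return "\n".join(lines)


def generate_dialogue(reference, controls, questions, call_llm):
    """Build a multi-turn dialogue one grounded utterance at a time.

    `call_llm` is any callable mapping a prompt string to a completion;
    each assistant turn sees the reference plus the dialogue so far.
    """
    history = []
    for question in questions:
        history.append(("User", question))
        reply = call_llm(build_prompt(reference, controls, history))
        history.append(("Assistant", reply))
    return history
```

Because the reference is re-injected into every turn's prompt, the model's answers stay anchored to it across the whole dialogue rather than drifting toward memorized knowledge.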
Related papers
- FineDialFact: A benchmark for Fine-grained Dialogue Fact Verification [45.2458418225596]
Large Language Models (LLMs) are known to produce hallucinations - factually incorrect or fabricated information. Current approaches to hallucination detection in dialogue systems primarily focus on verifying the factual consistency of generated responses. We introduce a benchmark, FineDialFact, for fine-grained dialogue fact verification.
arXiv Detail & Related papers (2025-08-07T18:51:03Z) - Enhancing the Preference Extractor in Multi-turn Dialogues: From Annotating Disasters to Accurate Preference Extraction [11.102491100383254]
We propose a novel dialogue data generation framework named IterChat. First, we construct a new data format that categorizes the dialogue data into attributed historical preferences and one-turn dialogues. This reduces the probability of annotation errors and improves annotation efficiency.
arXiv Detail & Related papers (2025-08-03T12:44:03Z) - Bottom-Up Synthesis of Knowledge-Grounded Task-Oriented Dialogues with Iteratively Self-Refined Prompts [19.73376945990922]
We introduce a bottom-up conversation synthesis approach, where QA pairs are generated first and then combined into a coherent dialogue.
This structure allows the use of non-local models in stages that do not involve proprietary knowledge.
Both human and automated evaluations demonstrate that our approach produces more realistic and higher-quality dialogues.
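The bottom-up synthesis idea above, generating QA pairs first and then stitching them into a coherent dialogue, can be sketched naively. The chaining heuristic and `connective` parameter below are illustrative assumptions, not the paper's actual combination method:

```python
def qa_pairs_to_dialogue(qa_pairs, connective="Also, "):
    """Chain independent QA pairs into one multi-turn dialogue.

    Follow-up questions are prefixed with a discourse connective so the
    turns read as a continuous conversation. This is a deliberately naive
    illustration of bottom-up synthesis; a real system would rewrite the
    questions for coherence (e.g. with an LLM pass).
    """
    dialogue = []
    for i, (question, answer) in enumerate(qa_pairs):
        if i > 0:
            # Lowercase the first letter so the connective reads naturally.
            question = connective + question[0].lower() + question[1:]
        dialogue.append(("user", question))
        dialogue.append(("assistant", answer))
    return dialogue
```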
arXiv Detail & Related papers (2025-04-19T18:25:53Z) - CoPrUS: Consistency Preserving Utterance Synthesis towards more realistic benchmark dialogues [0.27309692684728604]
We investigate the creation of synthetic communication errors in an automatic pipeline.
We focus on three types of miscommunications that could happen in real-world dialogues but are underrepresented in the benchmark dataset.
Our two-step approach uses a state-of-the-art Large Language Model (LLM) to first create the error and secondly the repairing utterance.
arXiv Detail & Related papers (2024-12-10T13:51:55Z) - Synthetic Dialogue Dataset Generation using LLM Agents [7.933485970511388]
We develop two agents that "talk" to each other, one acting as the conversational agent, and the other acting as the user.
Using a set of text descriptions of linear programming problems from NL4Opt available to the user only, the agent and the user engage in conversation until the agent has retrieved all key information from the original problem description.
We conduct human and automatic evaluations, including an evaluation approach that uses GPT-4 to mimic the human evaluation metrics.
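The two-agent setup described above can be sketched as a simple alternating loop. The turn functions and the stopping signal (a user turn returning `None` once all key information has been conveyed) are hypothetical, not the paper's actual protocol:

```python
def simulate_dialogue(user_turn, agent_turn, max_turns=10):
    """Alternate between a simulated user and an agent.

    `user_turn` and `agent_turn` are callables that receive the transcript
    so far and return the next utterance; the user returns None once it has
    nothing left to convey (an assumed stopping convention).
    """
    transcript = []
    for _ in range(max_turns):
        user_msg = user_turn(transcript)
        if user_msg is None:  # user has conveyed all key information
            break
        transcript.append(("user", user_msg))
        transcript.append(("agent", agent_turn(transcript)))
    return transcript
```

In practice each callable would wrap an LLM call with its own role prompt; here they are left abstract so the control flow stands on its own.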
arXiv Detail & Related papers (2024-01-30T21:49:30Z) - Are LLMs Robust for Spoken Dialogues? [10.855403629160921]
Large Pre-Trained Language Models have demonstrated state-of-the-art performance in different downstream tasks.
Most of the publicly available datasets and benchmarks on task-oriented dialogues focus on written conversations.
We have evaluated the performance of LLMs for spoken task-oriented dialogues on the DSTC11 test sets.
arXiv Detail & Related papers (2024-01-04T14:36:38Z) - BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues [72.65163468440434]
This report provides a preliminary evaluation of existing large language models for human-style multi-turn chatting.
We prompt large language models (LLMs) to generate a full multi-turn dialogue based on the ChatSEED, utterance by utterance.
We find that GPT-4 can generate human-style multi-turn dialogues of impressive quality, significantly outperforming its counterparts.
arXiv Detail & Related papers (2023-10-20T16:53:51Z) - DialogStudio: Towards Richest and Most Diverse Unified Dataset
Collection for Conversational AI [92.29874802394167]
DialogStudio is the largest and most diverse collection of dialogue datasets.
Our collection encompasses data from open-domain dialogues, task-oriented dialogues, natural language understanding, conversational recommendation, dialogue summarization, and knowledge-grounded dialogues.
arXiv Detail & Related papers (2023-07-19T17:57:53Z) - SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation [55.82577086422923]
We provide a feasible definition of dialogue segmentation points with the help of document-grounded dialogues.
We release a large-scale supervised dataset called SuperDialseg, containing 9,478 dialogues.
We also provide a benchmark including 18 models across five categories for the dialogue segmentation task.
arXiv Detail & Related papers (2023-05-15T06:08:01Z) - Check Your Facts and Try Again: Improving Large Language Models with
External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes a LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z) - Large Language Models Meet Harry Potter: A Bilingual Dataset for
Aligning Dialogue Agents with Characters [70.84938803753062]
We introduce the Harry Potter Dialogue dataset, designed to advance the study of dialogue agents and character alignment.
The dataset encompasses all dialogue sessions (in both English and Chinese) from the Harry Potter series.
It is annotated with vital background information, including dialogue scenes, speakers, character relationships, and attributes.
arXiv Detail & Related papers (2022-11-13T10:16:39Z) - What Did You Say? Task-Oriented Dialog Datasets Are Not Conversational!? [4.022057598291766]
We outline a taxonomy of conversational and contextual effects, which we use to examine MultiWOZ, SGD and SMCalFlow.
We find that less than 4% of MultiWOZ's turns and 10% of SGD's turns are conversational, while SMCalFlow is not conversational at all in its current release.
arXiv Detail & Related papers (2022-03-07T14:26:23Z) - Language Model as an Annotator: Exploring DialoGPT for Dialogue
Summarization [29.887562761942114]
We show how DialoGPT, a pre-trained model for conversational response generation, can be developed as an unsupervised dialogue annotator.
We apply DialoGPT to label three types of features on two dialogue summarization datasets, SAMSum and AMI, and employ pre-trained and non-pre-trained models as our summarizers.
arXiv Detail & Related papers (2021-05-26T13:50:13Z) - Paraphrase Augmented Task-Oriented Dialog Generation [68.1790912977053]
We propose a paraphrase augmented response generation (PARG) framework that jointly trains a paraphrase model and a response generation model.
We also design a method to automatically construct paraphrase training data set based on dialog state and dialog act labels.
arXiv Detail & Related papers (2020-04-16T05:12:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.