Learning From Free-Text Human Feedback -- Collect New Datasets Or Extend
Existing Ones?
- URL: http://arxiv.org/abs/2310.15758v1
- Date: Tue, 24 Oct 2023 12:01:11 GMT
- Authors: Dominic Petrak, Nafise Sadat Moosavi, Ye Tian, Nikolai Rozanov, Iryna
Gurevych
- Abstract summary: We investigate the types and frequency of free-text human feedback in commonly used dialog datasets.
Our findings provide new insights into the composition of the datasets examined, including error types, user response types, and the relations between them.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Learning from free-text human feedback is essential for dialog systems, but
annotated data is scarce and usually covers only a small fraction of error
types known in conversational AI. Instead of collecting and annotating new
datasets from scratch, recent advances in synthetic dialog generation could be
used to augment existing dialog datasets with the necessary annotations.
However, to assess the feasibility of such an effort, it is important to know
the types and frequency of free-text human feedback included in these datasets.
In this work, we investigate this question for a variety of commonly used
dialog datasets, including MultiWOZ, SGD, bAbI, PersonaChat,
Wizard-of-Wikipedia, and the human-bot split of the Self-Feeding Chatbot.
Using our observations, we derive new taxonomies for the annotation of
free-text human feedback in dialogs and investigate the impact of including
such data in response generation for three state-of-the-art language
generation models: GPT-2, LLaMA, and Flan-T5. Our findings provide new insights into the
composition of the datasets examined, including error types, user response
types, and the relations between them.
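The abstract describes fine-tuning causal language models such as GPT-2 on dialogs that contain free-text feedback turns. A minimal sketch of how such a dialog could be serialized into a single training string is shown below; the role tags (`<user>`, `<system>`, `<feedback>`) are illustrative assumptions, not the paper's actual annotation scheme.

```python
# Hypothetical sketch: flattening a dialog with a free-text feedback
# turn into one training string for a causal LM. The tag names are
# assumptions for illustration only.

def serialize_dialog(turns):
    """Flatten (role, text) turns into a single training string.

    Feedback turns are wrapped in an explicit tag so the model can
    learn to condition on them during fine-tuning.
    """
    tag = {"user": "<user>", "system": "<system>", "feedback": "<feedback>"}
    return " ".join(f"{tag[role]} {text}" for role, text in turns)

dialog = [
    ("user", "Book a table for two at 7pm."),
    ("system", "Done, I booked a table for 8pm."),
    ("feedback", "No, I said 7pm, not 8pm."),
    ("system", "Sorry, the booking is now for 7pm."),
]

example = serialize_dialog(dialog)
```

In practice the serialized strings would be tokenized and used as ordinary fine-tuning examples; the choice of tags and whether feedback turns contribute to the loss are design decisions left open here.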
Related papers
- SYNDICOM: Improving Conversational Commonsense with Error-Injection and
Natural Language Feedback [3.642278451851518]
We introduce SYNDICOM - a method for improving commonsense in dialogue response generation.
The first component is a dataset composed of commonsense dialogues created from a knowledge graph and synthesized into natural language.
The second contribution is a two-step procedure: training a model to predict natural language feedback (NLF) for invalid responses, and then training a response generation model conditioned on the predicted NLF.
arXiv Detail & Related papers (2023-09-18T15:08:48Z)
- Does Collaborative Human-LM Dialogue Generation Help Information Extraction
from Human Dialogues? [55.28340832822234]
Problem-solving human dialogues in real applications can be much more complex than existing Wizard-of-Oz collections.
We introduce a human-in-the-loop dialogue generation framework capable of synthesizing realistic dialogues.
arXiv Detail & Related papers (2023-07-13T20:02:50Z)
- q2d: Turning Questions into Dialogs to Teach Models How to Search [11.421839177607147]
We propose q2d: an automatic data generation pipeline that generates information-seeking dialogs from questions.
Unlike previous approaches, which relied on human-written dialogs with search queries, our method automatically generates query-based grounded dialogs with better control and scale.
arXiv Detail & Related papers (2023-04-27T16:39:15Z)
- PRESTO: A Multilingual Dataset for Parsing Realistic Task-Oriented
Dialogs [39.58414649004708]
PRESTO is a dataset of over 550K contextual multilingual conversations between humans and virtual assistants.
It contains challenges that occur in real-world NLU tasks such as disfluencies, code-switching, and revisions.
Our mT5-based baselines demonstrate that the conversational phenomena present in PRESTO are challenging to model.
arXiv Detail & Related papers (2023-03-15T21:51:13Z)
- Controllable Dialogue Simulation with In-Context Learning [39.04491297557292]
Dialogic is a dialogue simulation method based on in-context learning with large language models.
Our method can rapidly expand a small set of dialogue data with minimum or zero human involvement.
Our simulated dialogues have near-human fluency and annotation accuracy.
arXiv Detail & Related papers (2022-10-09T06:32:58Z)
- Unsupervised Neural Stylistic Text Generation using Transfer Learning
and Adapters [66.17039929803933]
We propose a novel transfer learning framework which updates only 0.3% of model parameters to learn style-specific attributes for response generation.
We learn style-specific attributes from the PERSONALITY-CAPTIONS dataset.
arXiv Detail & Related papers (2022-10-07T00:09:22Z) - DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for
Dialog Response Generation [80.45816053153722]
DialogVED introduces continuous latent variables into the enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses.
We conduct experiments on PersonaChat, DailyDialog, and DSTC7-AVSD benchmarks for response generation.
arXiv Detail & Related papers (2022-04-27T16:18:15Z) - ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive
Summarization with Argument Mining [61.82562838486632]
We crowdsource four new datasets covering diverse online conversation forms: news comments, discussion forums, community question answering forums, and email threads.
We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data.
arXiv Detail & Related papers (2021-06-01T22:17:13Z) - Language Model as an Annotator: Exploring DialoGPT for Dialogue
Summarization [29.887562761942114]
We show how DialoGPT, a pre-trained model for conversational response generation, can be developed as an unsupervised dialogue annotator.
We apply DialoGPT to label three types of features on two dialogue summarization datasets, SAMSum and AMI, and employ pre-trained and non-pre-trained models as our summarizers.
arXiv Detail & Related papers (2021-05-26T13:50:13Z) - Reasoning in Dialog: Improving Response Generation by Context Reading
Comprehension [49.92173751203827]
In multi-turn dialog, utterances do not always take the full form of sentences.
We propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question.
arXiv Detail & Related papers (2020-12-14T10:58:01Z)
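Several of the related approaches above share a pattern, stated most explicitly for SYNDICOM: first predict natural language feedback (NLF) for a candidate response, then regenerate conditioned on that feedback. A minimal sketch of this two-step flow follows; both "models" are toy rule-based stand-ins, and all function names are hypothetical.

```python
# Toy sketch of the two-step feedback-conditioned generation pattern:
# (1) predict natural language feedback for a draft response,
# (2) regenerate conditioned on the predicted feedback.
# Both functions are rule-based stand-ins for trained models.

def predict_feedback(dialog_context, response):
    """Step 1: toy NLF predictor that flags a time contradiction."""
    if "8pm" in response and "7pm" in dialog_context:
        return "The time is wrong; the user asked for 7pm."
    return ""

def generate_response(dialog_context, feedback=""):
    """Step 2: toy generator that conditions on the predicted NLF."""
    if "7pm" in feedback or "7pm" in dialog_context:
        return "Your table is booked for 7pm."
    return "Your table is booked."

context = "Book a table for two at 7pm."
draft = "I booked a table for 8pm."
nlf = predict_feedback(context, draft)
final = generate_response(context, feedback=nlf)
```

In the actual systems both steps are learned models trained on (dialog, feedback) pairs; the sketch only illustrates the data flow between them.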
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.