Crafting a Good Prompt or Providing Exemplary Dialogues? A Study of
In-Context Learning for Persona-based Dialogue Generation
- URL: http://arxiv.org/abs/2402.09954v2
- Date: Sat, 17 Feb 2024 06:11:58 GMT
- Title: Crafting a Good Prompt or Providing Exemplary Dialogues? A Study of
In-Context Learning for Persona-based Dialogue Generation
- Authors: Jiashu Pu, Yajing Wan, Yuru Zhang, Jing Chen, Ling Cheng, Qian Shao,
Yongzhu Chang, Tangjie Lv, Rongsheng Zhang
- Abstract summary: We systematically investigate the ICL capabilities of large language models (LLMs) in persona-based dialogue generation.
From experimental results, we draw three conclusions: 1) adjusting prompt instructions is the most direct, effective, and economical way to improve generation quality; 2) randomly retrieving demonstrations (demos) achieves the best results; and 3) even when we destroy the multi-turn associations and single-turn semantics in the demos, increasing the number of demos still improves dialogue performance.
- Score: 15.143135611057309
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous in-context learning (ICL) research has focused on tasks such as
classification, machine translation, text2table, etc., while studies on whether
ICL can improve human-like dialogue generation are scarce. Our work fills this
gap by systematically investigating the ICL capabilities of large language
models (LLMs) in persona-based dialogue generation, conducting extensive
experiments on high-quality real human Chinese dialogue datasets. From
experimental results, we draw three conclusions: 1) adjusting prompt
instructions is the most direct, effective, and economical way to improve
generation quality; 2) randomly retrieving demonstrations (demos) achieves the
best results, possibly due to the greater diversity and the amount of effective
information; counter-intuitively, retrieving demos with a context identical to
the query performs the worst; 3) even when we destroy the multi-turn
associations and single-turn semantics in the demos, increasing the number of
demos still improves dialogue performance, proving that LLMs can learn from
corrupted dialogue demos. Previous explanations of the ICL mechanism, such as
$n$-gram induction head, cannot fully account for this phenomenon.
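The demo-handling steps the abstract mentions (random retrieval, breaking multi-turn associations, breaking single-turn semantics) can be sketched as below. This is a minimal illustration, not the authors' implementation: every function name, the prompt layout, and the whitespace tokenization are assumptions made for the example. For Chinese text one would likely shuffle characters or segmented words instead.

```python
import random
from typing import List

Dialogue = List[str]  # one demo = a list of utterances (turns)


def sample_demos(pool: List[Dialogue], k: int, rng: random.Random) -> List[Dialogue]:
    """Pick k demos uniformly at random, ignoring similarity to the query."""
    return rng.sample(pool, k)


def shuffle_turns(demo: Dialogue, rng: random.Random) -> Dialogue:
    """Break multi-turn associations by permuting the order of the turns."""
    turns = list(demo)
    rng.shuffle(turns)
    return turns


def shuffle_tokens(demo: Dialogue, rng: random.Random) -> Dialogue:
    """Break single-turn semantics by permuting the tokens inside each turn."""
    corrupted = []
    for turn in demo:
        tokens = turn.split()  # assumption: whitespace-separated tokens
        rng.shuffle(tokens)
        corrupted.append(" ".join(tokens))
    return corrupted


def build_prompt(instruction: str, persona: str,
                 demos: List[Dialogue], context: Dialogue) -> str:
    """Assemble instruction, persona, demos, and the current context (hypothetical layout)."""
    parts = [instruction, f"Persona: {persona}"]
    for i, demo in enumerate(demos, start=1):
        parts.append(f"Example dialogue {i}:")
        parts.extend(demo)
    parts.append("Current dialogue:")
    parts.extend(context)
    return "\n".join(parts)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    pool = [
        ["A: hi there", "B: hello , how are you"],
        ["A: any plans today", "B: just reading a book"],
        ["A: do you like tea", "B: yes , green tea mostly"],
    ]
    demos = [shuffle_tokens(shuffle_turns(d, rng), rng)
             for d in sample_demos(pool, 2, rng)]
    print(build_prompt("Reply in character.", "a quiet librarian",
                       demos, ["A: what are you reading"]))
```

Raising `k` while keeping the corruption functions in place mirrors the kind of comparison described in the third conclusion, under the assumptions stated above.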
Related papers
- Exploring Knowledge Tracing in Tutor-Student Dialogues [53.52699766206808]
We present a first attempt at performing knowledge tracing (KT) in tutor-student dialogues.
We propose methods to identify the knowledge components/skills involved in each dialogue turn.
We then apply a range of KT methods on the resulting labeled data to track student knowledge levels over an entire dialogue.
arXiv Detail & Related papers (2024-09-24T22:31:39Z)
- Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation.
The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales.
We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
arXiv Detail & Related papers (2024-02-27T05:37:10Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
- Frugal Prompting for Dialog Models [17.048111072193933]
This study examines different approaches for building dialog systems using large language models (LLMs).
As part of prompt tuning, we experiment with various ways of providing instructions, exemplars, current query and additional context.
The research also analyzes the representations of dialog history that have the optimal usable-information density.
arXiv Detail & Related papers (2023-05-24T09:06:49Z)
- PK-ICR: Persona-Knowledge Interactive Context Retrieval for Grounded Dialogue [21.266410719325208]
Persona and Knowledge Dual Context Identification is a task to identify persona and knowledge jointly for a given dialogue.
We develop a novel grounding retrieval method that utilizes all contexts of dialogue simultaneously.
arXiv Detail & Related papers (2023-02-13T20:27:26Z)
- A Mixture-of-Expert Approach to RL-based Dialogue Management [56.08449336469477]
We use reinforcement learning to develop a dialogue agent that avoids being short-sighted (outputting generic utterances) and maximizes overall user satisfaction.
Most existing RL approaches to DM train the agent at the word level, and thus have to deal with a combinatorially complex action space even for a medium-size vocabulary.
We develop an RL-based DM using a novel mixture of expert language model (MoE-LM) that consists of (i) an LM capable of learning diverse semantics for conversation histories, (ii) a number of specialized LMs (or experts) capable of generating utterances corresponding to a
arXiv Detail & Related papers (2022-05-31T19:00:41Z)
- Learning Dialogue Representations from Consecutive Utterances [29.150589618130695]
We introduce Dialogue Sentence Embedding (DSE), a self-supervised contrastive learning method.
DSE learns from dialogues by taking consecutive utterances of the same dialogue as positive pairs for contrastive learning.
We evaluate DSE on five downstream dialogue tasks that examine dialogue representation at different semantic granularities.
arXiv Detail & Related papers (2022-05-26T18:15:13Z)
- Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)
- Ranking Enhanced Dialogue Generation [77.8321855074999]
How to effectively utilize the dialogue history is a crucial problem in multi-turn dialogue generation.
Previous works usually employ various neural network architectures to model the history.
This paper proposes a Ranking Enhanced Dialogue generation framework.
arXiv Detail & Related papers (2020-08-13T01:49:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.