Frugal Prompting for Dialog Models
- URL: http://arxiv.org/abs/2305.14919v2
- Date: Sun, 5 Nov 2023 06:05:19 GMT
- Title: Frugal Prompting for Dialog Models
- Authors: Bishal Santra, Sakya Basak, Abhinandan De, Manish Gupta, Pawan Goyal
- Abstract summary: This study examines different approaches for building dialog systems using large language models (LLMs).
As part of prompt tuning, we experiment with various ways of providing instructions, exemplars, the current query, and additional context.
The research also analyzes the representations of dialog history that have the optimal usable-information density.
- Score: 17.048111072193933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of large language models (LLMs) in natural language processing (NLP)
tasks is rapidly increasing, leading to changes in how researchers approach
problems in the field. To fully utilize these models' abilities, a better
understanding of their behavior for different input protocols is required. With
LLMs, users can directly interact with the models through a text-based
interface to define and solve various tasks. Hence, understanding the
conversational abilities of these LLMs, which may not have been specifically
trained for dialog modeling, is also important. This study examines different
approaches for building dialog systems using LLMs by considering various
aspects of the prompt. As part of prompt tuning, we experiment with various
ways of providing instructions, exemplars, the current query, and additional
context. The research also analyzes the representations of dialog history that
have the optimal usable-information density. Based on the findings, the paper
suggests more compact ways of providing dialog history information while
ensuring good performance and reducing the model's inference-API costs. The
research contributes to a better understanding of how LLMs can be effectively
used for building interactive systems.
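A minimal sketch (not the paper's exact protocol) of the idea in the abstract: compose a "frugal" prompt from an instruction, exemplars, a compressed dialog history, and the current query. The history compressor below is a naive recency-based truncation standing in for the more compact history representations the paper studies; all prompt wording and helper names are illustrative assumptions.

```python
# Sketch of frugal prompt assembly: fewer history tokens -> lower inference-API cost.

def compress_history(history, max_turns=3):
    """Keep only the most recent turns as a cheap stand-in for a learned
    or abstractive history summary (an assumption for illustration)."""
    recent = history[-max_turns:]
    return "\n".join(f"{speaker}: {utterance}" for speaker, utterance in recent)

def build_prompt(instruction, exemplars, history, query):
    exemplar_block = "\n\n".join(
        f"Dialog:\n{ex['dialog']}\nResponse: {ex['response']}" for ex in exemplars
    )
    return (
        f"{instruction}\n\n"
        f"{exemplar_block}\n\n"
        f"Dialog:\n{compress_history(history)}\n"
        f"User: {query}\n"
        f"Response:"
    )

prompt = build_prompt(
    instruction="Continue the conversation helpfully and concisely.",
    exemplars=[{"dialog": "User: Hi\nBot: Hello! How can I help?", "response": "Happy to help."}],
    history=[("User", "I need a hotel in Cambridge."), ("Bot", "For which dates?"), ("User", "Friday to Sunday.")],
    query="Something cheap near the station, please.",
)
print(prompt)
```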
Related papers
- DivTOD: Unleashing the Power of LLMs for Diversifying Task-Oriented Dialogue Representations [21.814490079113323]
Language models pre-trained on general text have achieved impressive results in diverse fields.
Yet, the distinct linguistic characteristics of task-oriented dialogues (TOD) compared to general text limit the practical utility of existing language models.
We propose a novel dialogue pre-training model called DivTOD, which collaborates with LLMs to learn diverse task-oriented dialogue representations.
arXiv Detail & Related papers (2024-03-31T04:36:57Z)
- Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation.
The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales.
We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
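A rough sketch of the "simulate a dialogue, then answer" idea described above. The `llm` function is a placeholder for any chat-completion call, and the prompt wording is illustrative rather than RiC's actual templates.

```python
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM API call here")

def reason_in_conversation(question: str) -> str:
    # Step 1: mine contextual information by simulating a short dialogue
    # about the question instead of requesting chain-of-thought rationales.
    simulated = llm(
        "Write a short two-person dialogue in which the speakers discuss "
        f"the following question from different perspectives:\n{question}"
    )
    # Step 2: answer the (subjective) question conditioned on that dialogue.
    return llm(
        f"Question: {question}\n"
        f"Background dialogue:\n{simulated}\n"
        "Using the dialogue above as context, give your final answer:"
    )
```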
arXiv Detail & Related papers (2024-02-27T05:37:10Z)
- DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever [83.33209603041013]
We propose a parameter-efficient prompt-tuning method named DialCLIP for multi-modal dialog retrieval.
Our approach introduces a multi-modal context generator to learn context features which are distilled into prompts within the pre-trained vision-language model CLIP.
To facilitate various types of retrieval, we also design multiple experts to learn mappings from CLIP outputs to multi-modal representation space.
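A simplified sketch (not the authors' code) of the architecture described above: a context generator maps fused multi-modal dialog-context features to soft prompt vectors for a frozen retriever, and a small "expert" head maps encoder outputs into a retrieval space. Feature dimensions and module names are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ContextPromptGenerator(nn.Module):
    """Maps pooled multi-modal context features to a few soft prompt vectors
    that would be prepended to the frozen vision-language model's text input."""
    def __init__(self, ctx_dim=512, prompt_len=8, embed_dim=512):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(ctx_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, prompt_len * embed_dim),
        )
        self.prompt_len, self.embed_dim = prompt_len, embed_dim

    def forward(self, ctx_feats):                     # (batch, ctx_dim)
        prompts = self.proj(ctx_feats)                # (batch, prompt_len * embed_dim)
        return prompts.view(-1, self.prompt_len, self.embed_dim)

# Hypothetical expert head; one such head could be learned per retrieval type.
expert_head = nn.Linear(512, 256)

ctx = torch.randn(4, 512)                             # stand-in for fused dialog-context features
soft_prompts = ContextPromptGenerator()(ctx)
print(soft_prompts.shape)                             # torch.Size([4, 8, 512])
```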
arXiv Detail & Related papers (2024-01-02T07:40:12Z)
- Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning [36.14688633670085]
We propose MTPrompt, a multi-dimensional task prompt learning method based on task-related object, summary, and task description information.
By automatically building and searching for appropriate prompts, our proposed MTPrompt achieves the best results on few-shot samples setting and five different datasets.
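A simplified sketch of composing a prompt from the three kinds of information named above (task-related object, summary, and task description). The template and field names are assumptions; MTPrompt additionally builds and searches over such prompts automatically rather than fixing one template by hand.

```python
def build_mtprompt(task_description: str, task_summary: str, task_objects: list[str], query: str) -> str:
    # Combine the three prompt dimensions with the current input.
    return (
        f"Task: {task_description}\n"
        f"Summary: {task_summary}\n"
        f"Relevant objects: {', '.join(task_objects)}\n"
        f"Input: {query}\n"
        f"Label:"
    )

print(build_mtprompt(
    task_description="Classify the sentiment of a movie review.",
    task_summary="Decide whether the reviewer liked the film.",
    task_objects=["review text", "sentiment label"],
    query="A slow start, but the ending completely won me over.",
))
```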
arXiv Detail & Related papers (2023-12-13T10:00:44Z)
- Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
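A minimal sketch of a task-agnostic "Self-Explanation" style prompt, assuming a simple template: the model is first asked to explain each utterance, then to perform the actual dialogue task. The wording is illustrative, not the paper's exact prompt.

```python
def self_explanation_prompt(dialogue_turns: list[str], task_instruction: str) -> str:
    dialogue = "\n".join(dialogue_turns)
    return (
        f"Dialogue:\n{dialogue}\n\n"
        "First, explain what each utterance above means and what the speaker "
        "is trying to achieve, one line per utterance.\n"
        f"Then, {task_instruction}"
    )

print(self_explanation_prompt(
    ["User: I'd like to move my appointment.",
     "Agent: Sure, to which day?",
     "User: Thursday afternoon if possible."],
    task_instruction="summarise the user's goal in one sentence.",
))
```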
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few-Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
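An illustrative sketch of the saliency idea above: score each dialogue turn, keep only the highest-scoring turns within a length budget, and spend the saved context on extra exemplars. The scorer here is a trivial word-overlap heuristic standing in for a learned saliency model; names and the budget are assumptions.

```python
def keep_salient_turns(turns, query, budget_words=20):
    # Toy saliency score: word overlap with the query.
    def score(turn):
        return len(set(turn.lower().split()) & set(query.lower().split()))
    ranked = sorted(range(len(turns)), key=lambda i: score(turns[i]), reverse=True)
    kept, used = set(), 0
    for i in ranked:
        n = len(turns[i].split())
        if used + n <= budget_words:
            kept.add(i)
            used += n
    # Preserve the original turn order after pruning.
    return [turns[i] for i in sorted(kept)]

history = [
    "User: I want to book a restaurant in the centre.",
    "Agent: Any cuisine preference?",
    "User: Italian, and also I need a taxi later.",
    "Agent: Noted. What time for the taxi?",
]
print(keep_salient_turns(history, query="book an italian restaurant in the centre"))
```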
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
- Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation [20.18656308749408]
Large language models (LLMs) have been used for generation and can now output human-like text.
This paper investigates how the number of examples in the prompt and the type of example selection used affect the model's performance.
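A small sketch of the two prompt knobs studied above: how many in-context examples to include (k) and how they are selected (random vs. most similar to the dialog being scored). The similarity measure is a simple word-overlap stand-in, and the scoring scale is an assumption for illustration.

```python
import random

def select_examples(pool, target_dialog, k, strategy="similar"):
    if strategy == "random":
        return random.sample(pool, k)
    def overlap(ex):
        return len(set(ex["dialog"].lower().split()) & set(target_dialog.lower().split()))
    return sorted(pool, key=overlap, reverse=True)[:k]

def build_eval_prompt(pool, target_dialog, k=2, strategy="similar"):
    shots = select_examples(pool, target_dialog, k, strategy)
    shot_block = "\n\n".join(
        f"Dialog:\n{ex['dialog']}\nQuality (1-5): {ex['score']}" for ex in shots
    )
    return f"{shot_block}\n\nDialog:\n{target_dialog}\nQuality (1-5):"
```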
arXiv Detail & Related papers (2023-01-27T22:02:27Z)
- In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST).
A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates.
This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
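A bare-bones sketch of the in-context DST setup described above: the prompt concatenates a few annotated (dialogue, state) examples with the test dialogue, and a frozen LM decodes the state as text with no parameter updates. The slot-value serialization below is an assumption for illustration, not the paper's exact format.

```python
def format_state(state: dict) -> str:
    return "; ".join(f"{slot}={value}" for slot, value in state.items())

def build_dst_prompt(examples, test_dialogue: str) -> str:
    shots = "\n\n".join(
        f"Dialogue:\n{ex['dialogue']}\nDialogue state: {format_state(ex['state'])}"
        for ex in examples
    )
    return f"{shots}\n\nDialogue:\n{test_dialogue}\nDialogue state:"

prompt = build_dst_prompt(
    examples=[{
        "dialogue": "User: I need a cheap hotel in the north.",
        "state": {"hotel-pricerange": "cheap", "hotel-area": "north"},
    }],
    test_dialogue="User: Find me an expensive restaurant in the centre.",
)
print(prompt)  # send to a frozen LM; parse the decoded string back into slot-value pairs
```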
arXiv Detail & Related papers (2022-03-16T11:58:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.