Frugal Prompting for Dialog Models
- URL: http://arxiv.org/abs/2305.14919v2
- Date: Sun, 5 Nov 2023 06:05:19 GMT
- Title: Frugal Prompting for Dialog Models
- Authors: Bishal Santra, Sakya Basak, Abhinandan De, Manish Gupta, Pawan Goyal
- Abstract summary: This study examines different approaches for building dialog systems using large language models (LLMs).
As part of prompt tuning, we experiment with various ways of providing instructions, exemplars, the current query, and additional context.
The research also analyzes the representations of dialog history that have the optimal usable-information density.
- Score: 17.048111072193933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of large language models (LLMs) in natural language processing (NLP)
tasks is rapidly increasing, leading to changes in how researchers approach
problems in the field. To fully utilize these models' abilities, a better
understanding of their behavior for different input protocols is required. With
LLMs, users can directly interact with the models through a text-based
interface to define and solve various tasks. Hence, understanding the
conversational abilities of these LLMs, which may not have been specifically
trained for dialog modeling, is also important. This study examines different
approaches for building dialog systems using LLMs by considering various
aspects of the prompt. As part of prompt tuning, we experiment with various
ways of providing instructions, exemplars, the current query, and additional
context. The research also analyzes the representations of dialog history that
have the optimal usable-information density. Based on the findings, the paper
suggests more compact ways of providing dialog history information while
ensuring good performance and reducing the model's inference-API costs. The
research contributes to a better understanding of how LLMs can be effectively
used for building interactive systems.
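A minimal sketch (not the paper's exact protocol) of the idea in the abstract: compose a "frugal" prompt from an instruction, exemplars, a compressed dialog history, and the current query. The history compressor below is a naive recency-based truncation standing in for the more compact history representations the paper studies; all prompt wording and helper names are illustrative assumptions.

```python
# Sketch of frugal prompt assembly: fewer history tokens -> lower inference-API cost.

def compress_history(history, max_turns=3):
    """Keep only the most recent turns as a cheap stand-in for a learned
    or abstractive history summary (an assumption for illustration)."""
    recent = history[-max_turns:]
    return "\n".join(f"{speaker}: {utterance}" for speaker, utterance in recent)

def build_prompt(instruction, exemplars, history, query):
    exemplar_block = "\n\n".join(
        f"Dialog:\n{ex['dialog']}\nResponse: {ex['response']}" for ex in exemplars
    )
    return (
        f"{instruction}\n\n"
        f"{exemplar_block}\n\n"
        f"Dialog:\n{compress_history(history)}\n"
        f"User: {query}\n"
        f"Response:"
    )

prompt = build_prompt(
    instruction="Continue the conversation helpfully and concisely.",
    exemplars=[{"dialog": "User: Hi\nBot: Hello! How can I help?", "response": "Happy to help."}],
    history=[("User", "I need a hotel in Cambridge."), ("Bot", "For which dates?"), ("User", "Friday to Sunday.")],
    query="Something cheap near the station, please.",
)
print(prompt)
```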
Related papers
- DivTOD: Unleashing the Power of LLMs for Diversifying Task-Oriented Dialogue Representations [21.814490079113323]
Language models pre-trained on general text have achieved impressive results in diverse fields.
Yet, the distinct linguistic characteristics of task-oriented dialogues (TOD) compared to general text limit the practical utility of existing language models.
We propose a novel dialogue pre-training model called DivTOD, which collaborates with LLMs to learn diverse task-oriented dialogue representations.
arXiv Detail & Related papers (2024-03-31T04:36:57Z)
- Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation.
The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales.
We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
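A rough sketch of the "simulate a dialogue, then answer" idea described above. The `llm` function is a placeholder for any chat-completion call, and the prompt wording is illustrative rather than RiC's actual templates.

```python
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM API call here")

def reason_in_conversation(question: str) -> str:
    # Step 1: mine contextual information by simulating a short dialogue
    # about the question instead of requesting chain-of-thought rationales.
    simulated = llm(
        "Write a short two-person dialogue in which the speakers discuss "
        f"the following question from different perspectives:\n{question}"
    )
    # Step 2: answer the (subjective) question conditioned on that dialogue.
    return llm(
        f"Question: {question}\n"
        f"Background dialogue:\n{simulated}\n"
        "Using the dialogue above as context, give your final answer:"
    )
```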
arXiv Detail & Related papers (2024-02-27T05:37:10Z)
- DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever [83.33209603041013]
We propose a parameter-efficient prompt-tuning method named DialCLIP for multi-modal dialog retrieval.
Our approach introduces a multi-modal context generator to learn context features which are distilled into prompts within the pre-trained vision-language model CLIP.
To facilitate various types of retrieval, we also design multiple experts to learn mappings from CLIP outputs to multi-modal representation space.
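A simplified sketch (not the authors' code) of the architecture described above: a context generator maps fused multi-modal dialog-context features to soft prompt vectors for a frozen retriever, and a small "expert" head maps encoder outputs into a retrieval space. Feature dimensions and module names are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ContextPromptGenerator(nn.Module):
    """Maps pooled multi-modal context features to a few soft prompt vectors
    that would be prepended to the frozen vision-language model's text input."""
    def __init__(self, ctx_dim=512, prompt_len=8, embed_dim=512):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(ctx_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, prompt_len * embed_dim),
        )
        self.prompt_len, self.embed_dim = prompt_len, embed_dim

    def forward(self, ctx_feats):                     # (batch, ctx_dim)
        prompts = self.proj(ctx_feats)                # (batch, prompt_len * embed_dim)
        return prompts.view(-1, self.prompt_len, self.embed_dim)

# Hypothetical expert head; one such head could be learned per retrieval type.
expert_head = nn.Linear(512, 256)

ctx = torch.randn(4, 512)                             # stand-in for fused dialog-context features
soft_prompts = ContextPromptGenerator()(ctx)
print(soft_prompts.shape)                             # torch.Size([4, 8, 512])
```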
arXiv Detail & Related papers (2024-01-02T07:40:12Z)
- Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning [36.14688633670085]
We propose MTPrompt, a multi-dimensional task prompt learning method based on task-related object, summary, and task description information.
By automatically building and searching for appropriate prompts, our proposed MTPrompt achieves the best results on few-shot samples setting and five different datasets.
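A simplified sketch of composing a prompt from the three kinds of information named above (task-related object, summary, and task description). The template and field names are assumptions; MTPrompt additionally builds and searches over such prompts automatically rather than fixing one template by hand.

```python
def build_mtprompt(task_description: str, task_summary: str, task_objects: list[str], query: str) -> str:
    # Combine the three prompt dimensions with the current input.
    return (
        f"Task: {task_description}\n"
        f"Summary: {task_summary}\n"
        f"Relevant objects: {', '.join(task_objects)}\n"
        f"Input: {query}\n"
        f"Label:"
    )

print(build_mtprompt(
    task_description="Classify the sentiment of a movie review.",
    task_summary="Decide whether the reviewer liked the film.",
    task_objects=["review text", "sentiment label"],
    query="A slow start, but the ending completely won me over.",
))
```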
arXiv Detail & Related papers (2023-12-13T10:00:44Z)
- Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
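A minimal sketch of a task-agnostic "Self-Explanation" style prompt, assuming a simple template: the model is first asked to explain each utterance, then to perform the actual dialogue task. The wording is illustrative, not the paper's exact prompt.

```python
def self_explanation_prompt(dialogue_turns: list[str], task_instruction: str) -> str:
    dialogue = "\n".join(dialogue_turns)
    return (
        f"Dialogue:\n{dialogue}\n\n"
        "First, explain what each utterance above means and what the speaker "
        "is trying to achieve, one line per utterance.\n"
        f"Then, {task_instruction}"
    )

print(self_explanation_prompt(
    ["User: I'd like to move my appointment.",
     "Agent: Sure, to which day?",
     "User: Thursday afternoon if possible."],
    task_instruction="summarise the user's goal in one sentence.",
))
```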
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few-Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
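An illustrative sketch of the saliency idea above: score each dialogue turn, keep only the highest-scoring turns within a length budget, and spend the saved context on extra exemplars. The scorer here is a trivial word-overlap heuristic standing in for a learned saliency model; names and the budget are assumptions.

```python
def keep_salient_turns(turns, query, budget_words=20):
    # Toy saliency score: word overlap with the query.
    def score(turn):
        return len(set(turn.lower().split()) & set(query.lower().split()))
    ranked = sorted(range(len(turns)), key=lambda i: score(turns[i]), reverse=True)
    kept, used = set(), 0
    for i in ranked:
        n = len(turns[i].split())
        if used + n <= budget_words:
            kept.add(i)
            used += n
    # Preserve the original turn order after pruning.
    return [turns[i] for i in sorted(kept)]

history = [
    "User: I want to book a restaurant in the centre.",
    "Agent: Any cuisine preference?",
    "User: Italian, and also I need a taxi later.",
    "Agent: Noted. What time for the taxi?",
]
print(keep_salient_turns(history, query="book an italian restaurant in the centre"))
```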
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
- Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation [20.18656308749408]
Large language models (LLMs) have been used for generation and can now output human-like text.
This paper investigates how the number of examples in the prompt and the type of example selection used affect the model's performance.
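A small sketch of the two prompt knobs studied above: how many in-context examples to include (k) and how they are selected (random vs. most similar to the dialog being scored). The similarity measure is a simple word-overlap stand-in, and the scoring scale is an assumption for illustration.

```python
import random

def select_examples(pool, target_dialog, k, strategy="similar"):
    if strategy == "random":
        return random.sample(pool, k)
    def overlap(ex):
        return len(set(ex["dialog"].lower().split()) & set(target_dialog.lower().split()))
    return sorted(pool, key=overlap, reverse=True)[:k]

def build_eval_prompt(pool, target_dialog, k=2, strategy="similar"):
    shots = select_examples(pool, target_dialog, k, strategy)
    shot_block = "\n\n".join(
        f"Dialog:\n{ex['dialog']}\nQuality (1-5): {ex['score']}" for ex in shots
    )
    return f"{shot_block}\n\nDialog:\n{target_dialog}\nQuality (1-5):"
```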
arXiv Detail & Related papers (2023-01-27T22:02:27Z)
- In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST).
A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates.
This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
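A bare-bones sketch of the in-context DST setup described above: the prompt concatenates a few annotated (dialogue, state) examples with the test dialogue, and a frozen LM decodes the state as text with no parameter updates. The slot-value serialization below is an assumption for illustration, not the paper's exact format.

```python
def format_state(state: dict) -> str:
    return "; ".join(f"{slot}={value}" for slot, value in state.items())

def build_dst_prompt(examples, test_dialogue: str) -> str:
    shots = "\n\n".join(
        f"Dialogue:\n{ex['dialogue']}\nDialogue state: {format_state(ex['state'])}"
        for ex in examples
    )
    return f"{shots}\n\nDialogue:\n{test_dialogue}\nDialogue state:"

prompt = build_dst_prompt(
    examples=[{
        "dialogue": "User: I need a cheap hotel in the north.",
        "state": {"hotel-pricerange": "cheap", "hotel-area": "north"},
    }],
    test_dialogue="User: Find me an expensive restaurant in the centre.",
)
print(prompt)  # send to a frozen LM; parse the decoded string back into slot-value pairs
```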
arXiv Detail & Related papers (2022-03-16T11:58:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.