OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking
- URL: http://arxiv.org/abs/2311.09758v2
- Date: Wed, 28 Feb 2024 20:41:25 GMT
- Title: OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking
- Authors: Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf
- Abstract summary: Large language models (LLMs) have revolutionized the landscape of Natural Language Processing systems, but are computationally expensive.
Previous studies have explored various approaches to harness the potential of Small Language Models (SLMs) as cost-effective alternatives to their larger counterparts.
This work presents a novel SLM/LLM routing framework designed to improve computational efficiency and enhance task performance.
- Score: 18.047158624848546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have revolutionized the landscape of Natural
Language Processing systems, but are computationally expensive. To reduce the
cost without sacrificing performance, previous studies have explored various
approaches to harness the potential of Small Language Models (SLMs) as
cost-effective alternatives to their larger counterparts. Driven by findings
that SLMs and LLMs exhibit complementary strengths in a structured knowledge
extraction task, this work presents a novel SLM/LLM routing framework designed
to improve computational efficiency and enhance task performance. First,
exemplar pools are created to represent the types of contexts where each LM
provides a more reliable answer, leveraging a sentence embedding fine-tuned so
that context similarity is close to dialogue state similarity. Then, during
inference, the k-nearest exemplars to the testing instance are retrieved, and
the instance is routed according to majority vote. In dialogue state tracking
tasks, the proposed routing framework enhances performance substantially
compared to relying solely on LLMs, while reducing the computational costs by
over 50%.
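To make the routing step concrete, here is a minimal sketch of k-nearest-exemplar routing with majority vote, assuming the exemplar contexts have already been embedded with the fine-tuned sentence encoder and labeled by which model (SLM or LLM) handles them more reliably; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def route_instance(context_embedding: np.ndarray,
                   exemplar_embeddings: np.ndarray,
                   exemplar_labels: np.ndarray,
                   k: int = 5) -> str:
    """Route a dialogue context to the SLM or the LLM by majority vote
    over its k nearest exemplars (cosine similarity).

    exemplar_labels[i] is "slm" if the small model handled exemplar i
    reliably, and "llm" otherwise (illustrative labeling scheme).
    """
    # Normalize so the dot product equals cosine similarity.
    q = context_embedding / np.linalg.norm(context_embedding)
    E = exemplar_embeddings / np.linalg.norm(exemplar_embeddings, axis=1, keepdims=True)
    sims = E @ q

    # Retrieve the k most similar exemplars and take a majority vote;
    # ties default to the cheaper SLM (an assumption, not from the paper).
    top_k = np.argsort(-sims)[:k]
    votes = exemplar_labels[top_k]
    slm_votes = int((votes == "slm").sum())
    return "slm" if slm_votes * 2 >= k else "llm"
```

At test time, each dialogue context would be embedded with the same fine-tuned encoder and passed to this router; only contexts routed to "llm" incur the cost of the large model, which is how the framework reduces computational cost by over 50% while improving over an LLM-only setup.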
Related papers
- Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
arXiv Detail & Related papers (2024-07-04T15:14:17Z)
- DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning [19.93800175353809]
DeTriever is a novel demonstration retrieval framework that learns a weighted combination of hidden states.
Our method significantly outperforms the state-of-the-art baselines on one-shot NL2SQL tasks.
arXiv Detail & Related papers (2024-06-12T06:33:54Z)
- Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL).
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves comparable performance with SOTA as well as being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z)
- Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning [79.38140606606126]
We propose an algorithmic framework that fine-tunes vision-language models (VLMs) with reinforcement learning (RL).
Our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning.
We demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks.
arXiv Detail & Related papers (2024-05-16T17:50:19Z)
- ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
- TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision [32.24857534147114]
Large language model (LLM) agents have been built for different tasks like web navigation and online shopping.
In this paper, we propose a novel framework (TRAD) to address these issues.
TRAD conducts Thought Retrieval, achieving step-level demonstration selection via thought matching.
Then, TRAD introduces Aligned Decision, complementing retrieved demonstration steps with their previous or subsequent steps.
arXiv Detail & Related papers (2024-03-10T13:58:38Z)
- Learning to Reduce: Optimal Representations of Structured Data in Prompting Large Language Models [42.16047343029512]
Large Language Models (LLMs) have been widely used as general-purpose AI agents.
We propose a framework, Learning to Reduce, that fine-tunes a language model to generate a reduced version of an input context.
We show that our model achieves comparable accuracies in selecting the relevant evidence from an input context.
arXiv Detail & Related papers (2024-02-22T00:41:23Z)
- LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) are equipped to deal with larger context lengths.
LLMs can consistently outperform the SotA when the target text is large.
Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z)
- OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs.
Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
- In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST).
A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates.
This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
arXiv Detail & Related papers (2022-03-16T11:58:24Z)
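The last entry above describes the in-context DST formulation that SLM/LLM routing builds on: a pre-trained LM is given a few annotated examples followed by the test instance and decodes the dialogue state directly, with no parameter updates. A minimal sketch of such prompt assembly is shown below; the field names (`dialogue`, `state`) and the serialization format are illustrative assumptions, not taken from the paper.

```python
def build_icl_dst_prompt(exemplars, test_dialogue):
    """Assemble a few-shot dialogue state tracking prompt.

    Each exemplar is a dict with illustrative keys:
      "dialogue": the dialogue context as plain text
      "state":    its annotated state, e.g. "hotel-area=centre; hotel-stars=4"
    """
    parts = []
    for ex in exemplars:
        parts.append(f"Dialogue:\n{ex['dialogue']}\nDialogue state: {ex['state']}\n")
    # The test instance comes last; the LM decodes its dialogue state directly,
    # without any parameter updates.
    parts.append(f"Dialogue:\n{test_dialogue}\nDialogue state:")
    return "\n".join(parts)
```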
This list is automatically generated from the titles and abstracts of the papers on this site.