A Survey on Effective Invocation Methods of Massive LLM Services
- URL: http://arxiv.org/abs/2402.03408v2
- Date: Fri, 1 Mar 2024 03:33:21 GMT
- Title: A Survey on Effective Invocation Methods of Massive LLM Services
- Authors: Can Wang, Bolin Zhang, Dianbo Sui, Zhiying Tu, Xiaoyu Liu and Jiabao Kang
- Abstract summary: Language models as a service (LMaaS) enable users to accomplish tasks without requiring specialized knowledge, simply by paying a service provider.
Various providers offer massive large language model (LLM) services with variations in latency, performance, and pricing.
This paper provides a comprehensive overview of the LLM services invocation methods.
- Score: 9.21599372326452
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models as a service (LMaaS) enable users to accomplish tasks without
requiring specialized knowledge, simply by paying a service provider. However,
numerous providers offer massive large language model (LLM) services with
variations in latency, performance, and pricing. Consequently, constructing the
cost-saving LLM services invocation strategy with low-latency and
high-performance responses that meet specific task demands becomes a pressing
challenge. This paper provides a comprehensive overview of the LLM services
invocation methods. Technically, we give a formal definition of the problem of
constructing effective invocation strategy in LMaaS and present the LLM
services invocation framework. The framework classifies existing methods into
four different components, including input abstract, semantic cache, solution
design, and output enhancement, which can be freely combined with each other.
Finally, we emphasize the open challenges that have not yet been well addressed
in this task and shed light on future research.
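The four components named in the abstract can be pictured as stages of one invocation pipeline. The sketch below is only one possible reading, not the survey's method: the `Service` type, the word-overlap similarity standing in for embedding similarity, the cheapest-first cascade, and all thresholds are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Service:
    """Hypothetical LLM service: a name, a per-call cost, and a callable."""
    name: str
    cost: float
    call: Callable[[str], Tuple[str, float]]  # returns (answer, confidence in [0, 1])


def abstract_input(query: str, max_words: int = 50) -> str:
    """Input abstraction (assumed form): normalize whitespace and truncate."""
    return " ".join(query.split()[:max_words])


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a simple stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


@dataclass
class SemanticCache:
    """Semantic cache: reuse an answer when a stored query is similar enough."""
    threshold: float = 0.8
    entries: List[Tuple[str, str]] = field(default_factory=list)

    def get(self, query: str) -> Optional[str]:
        best = max(self.entries, key=lambda e: jaccard(query, e[0]), default=None)
        if best is not None and jaccard(query, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, query: str, answer: str) -> None:
        self.entries.append((query, answer))


def cascade(query: str, services: List[Service],
            min_confidence: float = 0.7) -> Tuple[str, float]:
    """Solution design (assumed form): try services cheapest first and stop
    at the first answer whose confidence reaches the threshold."""
    answer, spent = "", 0.0
    for svc in sorted(services, key=lambda s: s.cost):
        answer, confidence = svc.call(query)
        spent += svc.cost
        if confidence >= min_confidence:
            break
    return answer, spent


def enhance_output(answer: str) -> str:
    """Output enhancement (assumed form): light post-processing of the answer."""
    return answer.strip()


def invoke(query: str, services: List[Service],
           cache: SemanticCache) -> Tuple[str, float]:
    """Run the four stages in order; return (answer, cost spent on this call)."""
    q = abstract_input(query)
    cached = cache.get(q)
    if cached is not None:
        return cached, 0.0  # cache hit: no service is called
    answer, spent = cascade(q, services)
    answer = enhance_output(answer)
    cache.put(q, answer)
    return answer, spent
```

Because each stage is a separate function, any of them can be swapped or omitted, which mirrors the abstract's remark that the components can be freely combined. A real deployment would replace the stubs with actual service clients and an embedding-based cache.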
Related papers
- Plug-and-Play Performance Estimation for LLM Services without Relying on Labeled Data [8.360964737763657]
Large Language Model (LLM) services exhibit impressive capability on unlearned tasks, leveraging only a few examples through in-context learning (ICL).
This paper introduces a novel method to estimate the performance of LLM services across different tasks and contexts.
arXiv Detail & Related papers (2024-10-10T09:15:14Z)
- Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making [85.24399869971236]
We aim to evaluate Large Language Models (LLMs) for embodied decision making.
Existing evaluations tend to rely solely on a final success rate.
We propose a generalized interface (Embodied Agent Interface) that supports the formalization of various types of tasks.
arXiv Detail & Related papers (2024-10-09T17:59:00Z)
- Sketch: A Toolkit for Streamlining LLM Operations [51.33202045501429]
Large language models (LLMs) have achieved remarkable success.
The flexibility of their output format poses challenges in controlling and harnessing the model's outputs.
We present Sketch, an innovative toolkit designed to streamline LLM operations across diverse fields.
arXiv Detail & Related papers (2024-09-05T08:45:44Z)
- UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models [0.42832989850721054]
Multimodal Entity Linking (MEL) is a crucial task that aims at linking ambiguous mentions within multimodal contexts to referent entities in a multimodal knowledge base, such as Wikipedia.
Existing methods overcomplicate the MEL task and overlook the visual semantic information, which makes them costly and hard to scale.
We propose UniMEL, a unified framework which establishes a new paradigm to process multimodal entity linking tasks using Large Language Models.
arXiv Detail & Related papers (2024-07-23T03:58:08Z)
- Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently.
Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting.
In this study, we propose an LLM-based Generative IoT (GIoT) system deployed in a local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z)
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvements of Large Language Models.
It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop.
Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
- Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs.
We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer.
This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z)
- OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs.
Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.