EcoAssistant: Using LLM Assistant More Affordably and Accurately
- URL: http://arxiv.org/abs/2310.03046v1
- Date: Tue, 3 Oct 2023 22:16:13 GMT
- Title: EcoAssistant: Using LLM Assistant More Affordably and Accurately
- Authors: Jieyu Zhang, Ranjay Krishna, Ahmed H. Awadallah, Chi Wang
- Abstract summary: We contribute a framework, EcoAssistant, that enables large language models (LLMs) to answer code-driven queries more affordably and accurately.
First, it allows the LLM assistants to converse with an automatic code executor to iteratively refine code or to produce answers based on the execution results.
Second, we use a hierarchy of LLM assistants, which attempts to answer the query with weaker, cheaper LLMs before backing off to stronger, more expensive ones.
- Score: 36.29735258966917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Today, users ask large language models (LLMs) as assistants to answer queries
that require external knowledge; they ask about the weather in a specific city,
about stock prices, and even about where specific locations are within their
neighborhood. These queries require the LLM to produce code that invokes
external APIs to answer the user's question, yet LLMs rarely produce correct
code on the first try, requiring iterative code refinement upon execution
results. In addition, using LLM assistants to support high query volumes can be
expensive. In this work, we contribute a framework, EcoAssistant, that enables
LLMs to answer code-driven queries more affordably and accurately. EcoAssistant
contains three components. First, it allows the LLM assistants to converse with
an automatic code executor to iteratively refine code or to produce answers
based on the execution results. Second, we use a hierarchy of LLM assistants,
which attempts to answer the query with weaker, cheaper LLMs before backing off
to stronger, more expensive ones. Third, we retrieve solutions from past successful
queries as in-context demonstrations to help subsequent queries. Empirically,
we show that EcoAssistant offers distinct advantages for affordability and
accuracy, surpassing GPT-4 by 10 points of success rate with less than 50% of
GPT-4's cost.
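
The three components above can be sketched compactly. The following is a minimal illustration, not the authors' implementation: call_model is a hypothetical stand-in for a chat-completion client, the model names, retry budget, and lexical-overlap retriever are assumptions, and "success" is simplified to "the generated code ran without error" (the paper evaluates answer quality instead).

```python
import subprocess
import sys
import tempfile

def call_model(model: str, prompt: str) -> str:
    """Hypothetical chat-completion call; swap in a real client here."""
    raise NotImplementedError

def run_code(code: str, timeout: int = 30) -> tuple[bool, str]:
    """Automatic code executor: run generated Python, return (ok, output)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, path], capture_output=True,
                              text=True, timeout=timeout)
        return proc.returncode == 0, proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        return False, "execution timed out"

def lexical_overlap(a: str, b: str) -> float:
    """Toy similarity for demonstration retrieval; a real system would embed."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def solve_with_model(model: str, query: str, demos: list[str], max_turns: int = 3):
    """Component 1: converse with the executor, refining code after failures."""
    prompt = "\n\n".join([f"Past successful solution:\n{d}" for d in demos]
                         + [f"Write Python code that answers: {query}"])
    for _ in range(max_turns):
        code = call_model(model, prompt)
        ok, output = run_code(code)
        if ok:                       # simplification: ran-without-error = success
            return code, output
        prompt += f"\n\nThe code failed with:\n{output}\nPlease fix it."
    return None, None

def eco_assistant(query: str, solution_db: list[tuple[str, str]],
                  models: tuple[str, ...] = ("cheap-model", "expensive-model")):
    """Component 2: cheap-to-expensive hierarchy; component 3: reuse past solutions."""
    demos = [code for past_q, code in solution_db
             if lexical_overlap(past_q, query) > 0.3][:2]
    for model in models:             # ordered cheapest to most expensive
        code, output = solve_with_model(model, query, demos)
        if code is not None:
            solution_db.append((query, code))  # cache for subsequent queries
            return output
    return None                      # every assistant in the hierarchy failed
```

The ordering is where the savings come from: cheaper assistants absorb the easy queries, past solutions make later queries easier, and the expensive model is only consulted on the residue.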
Related papers
- Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval [55.63711219190506]
Large language models (LLMs) often struggle with posing the right search queries.
We introduce Learning to Retrieve by Trying (LeReT), a reinforcement learning framework that improves retrieval by exploring candidate search queries.
LeReT can improve the absolute retrieval accuracy by up to 29% and the downstream generator evaluations by 17%.
arXiv Detail & Related papers (2024-10-30T17:02:54Z)
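
The blurb above leaves the mechanism implicit; one plausible reading of "retrieve by trying" is sketched below under stated assumptions: sample_query, search, and reward are hypothetical callables, and only the data-collection step is shown (LeReT trains on such pairs with preference-based optimization rather than using them directly).

```python
def collect_preference_pairs(questions, sample_query, search, reward, k=4):
    """For each question, try k search queries and keep a (chosen, rejected) pair."""
    pairs = []
    for question in questions:
        tries = [sample_query(question) for _ in range(k)]   # diverse attempts
        scored = sorted((reward(search(q), question), q) for q in tries)
        (lo_score, lo_query), (hi_score, hi_query) = scored[0], scored[-1]
        if hi_score > lo_score:                              # a real preference exists
            pairs.append((question, hi_query, lo_query))     # (prompt, chosen, rejected)
    return pairs
```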
- Optimizing LLM Queries in Relational Workloads [58.254894049950366]
We show how to optimize Large Language Model (LLM) inference for analytical workloads that invoke LLMs within relational queries.
We implement these optimizations in Apache Spark, with vLLM as the model serving backend.
We achieve up to 4.4x improvement in end-to-end latency on a benchmark of diverse LLM-based queries on real datasets.
arXiv Detail & Related papers (2024-03-09T07:01:44Z)
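
The entry does not spell out the optimizations; a representative (assumed, not necessarily the paper's) example for LLM calls issued from relational queries is deduplication, since analytical columns often repeat values.

```python
def llm_map_column(values, llm_fn):
    """Apply llm_fn over a column, with one call per distinct value."""
    cache = {}
    out = []
    for v in values:
        if v not in cache:
            cache[v] = llm_fn(v)    # the only place the LLM is invoked
        out.append(cache[v])
    return out
```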
- Query-OPT: Optimizing Inference of Large Language Models via Multi-Query Instructions in Meeting Summarization [7.674972936853123]
We investigate whether combining multiple queries over the same input context into a single prompt, to minimize repeated calls, can be used successfully for meeting summarization.
We observe that 100% reliability in generating the response in the expected format is usually limited to certain closed-source LLMs.
arXiv Detail & Related papers (2024-02-29T19:00:47Z)
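
A minimal sketch of the multi-query idea described above: pack all questions about one context into a single prompt and parse a numbered response. The format convention and call_model stub are assumptions; as the entry notes, getting the expected format back reliably is itself the hard part.

```python
def multi_query_prompt(context: str, questions: list[str]) -> str:
    """Combine all questions about one transcript into a single prompt."""
    numbered = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(questions))
    return (f"Transcript:\n{context}\n\n"
            f"Answer every question below in one reply, one line per question, "
            f"formatted exactly as '<number>. <answer>'.\n{numbered}")

def parse_answers(response: str, n: int) -> list[str]:
    """Map a numbered response back to per-question answers."""
    answers = [""] * n
    for line in response.splitlines():
        num, _, rest = line.partition(". ")
        if num.strip().isdigit() and 1 <= int(num) <= n:
            answers[int(num) - 1] = rest.strip()
    return answers   # empty slots flag the format failures the entry warns about
```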
- Why and When LLM-Based Assistants Can Go Wrong: Investigating the Effectiveness of Prompt-Based Interactions for Software Help-Seeking [5.755004576310333]
Large Language Model (LLM) assistants have emerged as potential alternatives to search methods for helping users navigate software.
LLM assistants use vast training data from domain-specific texts, software manuals, and code repositories to mimic human-like interactions.
arXiv Detail & Related papers (2024-02-12T19:49:58Z)
- LLatrieval: LLM-Verified Retrieval for Verifiable Generation [67.93134176912477]
Verifiable generation aims to let the large language model (LLM) generate text with supporting documents.
We propose LLatrieval (Large Language Model Verified Retrieval), where the LLM updates the retrieval result until it verifies that the retrieved documents can sufficiently support answering the question.
Experiments show that LLatrieval significantly outperforms extensive baselines and achieves state-of-the-art results.
arXiv Detail & Related papers (2023-11-14T01:38:02Z)
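
The verify-then-update loop described above condenses to a few lines; retrieve, verify, refine, and answer are hypothetical callables standing in for the paper's modules.

```python
def llatrieval(question, retrieve, verify, refine, answer, max_rounds=3):
    """Retrieve, let the LLM verify sufficiency, and refine until it passes."""
    docs = retrieve(question)
    for _ in range(max_rounds):
        if verify(question, docs):     # LLM judges whether docs suffice
            break
        docs = refine(question, docs)  # LLM-guided update of the retrieval
    return answer(question, docs)      # generate with supporting documents
```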
- Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves [57.974103113675795]
We present a method named 'Rephrase and Respond' (RaR), which allows Large Language Models to rephrase and expand questions posed by humans.
RaR serves as a simple yet effective prompting method for improving performance.
We show that RaR is complementary to the popular Chain-of-Thought (CoT) methods, both theoretically and empirically.
arXiv Detail & Related papers (2023-11-07T18:43:34Z)
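
A minimal sketch of a two-step reading of RaR, assuming a generic call_model chat client and an illustrative rephrasing instruction; the paper also folds rephrasing and answering into a single prompt.

```python
REPHRASE = ("Rephrase and expand the following question so it is unambiguous, "
            "then state the rephrased question:\n{q}")

def rephrase_and_respond(call_model, question: str) -> str:
    expanded = call_model(REPHRASE.format(q=question))               # step 1: rephrase
    return call_model(f"{expanded}\n\nAnswer the question above.")   # step 2: respond
```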
- FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance [36.94826820536239]
We review the cost associated with querying popular large language models (LLMs).
We discuss three types of strategies that users can exploit to reduce the inference cost associated with using LLMs.
Experiments show that FrugalGPT can match the performance of the best individual LLM with up to 98% cost reduction or improve the accuracy over GPT-4 by 4% with the same cost.
arXiv Detail & Related papers (2023-05-09T05:11:02Z)
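
One of the strategies the FrugalGPT entry above alludes to is an LLM cascade, which can be sketched as follows; the scorer, threshold, and model ordering are illustrative assumptions, not the paper's tuned configuration.

```python
def cascade(prompt, models, call_model, score, threshold=0.8):
    """Query models from cheapest to priciest, stopping at a reliable answer."""
    best_score, best_answer = -1.0, None
    for model in models:                 # assumed non-empty, ordered by cost
        answer = call_model(model, prompt)
        s = score(prompt, answer)        # learned reliability scorer
        if s >= threshold:
            return answer                # confident enough: skip pricier models
        if s > best_score:
            best_score, best_answer = s, answer
    return best_answer                   # fall back to the best attempt seen
```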
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)