A bi-objective $\epsilon$-constrained framework for quality-cost
optimization in language model ensembles
- URL: http://arxiv.org/abs/2312.16119v1
- Date: Tue, 26 Dec 2023 16:56:22 GMT
- Title: A bi-objective $\epsilon$-constrained framework for quality-cost
optimization in language model ensembles
- Authors: Aditi Singla, Aditya Singh, Kanishk Kukreja
- Abstract summary: We propose an ensembling framework that uses diverse open-sourced Large Language Models (LLMs) to achieve high response quality while maintaining cost efficiency.
We formulate a bi-objective optimization problem to represent the quality-cost tradeoff and then introduce an additional budget constraint that reduces the problem to a straightforward 0/1 knapsack problem.
- Score: 1.5039745292757671
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose an ensembling framework that uses diverse open-sourced Large
Language Models (LLMs) to achieve high response quality while maintaining cost
efficiency. We formulate a bi-objective optimization problem to represent the
quality-cost tradeoff and then introduce an additional budget constraint that
reduces the problem to a straightforward 0/1 knapsack problem. We empirically
demonstrate that our framework outperforms the existing ensembling approaches
in response quality while significantly reducing costs.
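As a rough illustration of the reduction the abstract describes, the sketch below treats model selection as a 0/1 knapsack: each candidate model is either included in the ensemble or not, total quality is maximized, and total cost must stay within a budget (the $\epsilon$ constraint). The function name `select_models`, the additive quality scores, and the integer cost units are assumptions made for this example; the listing does not state how the paper measures quality or cost.

```python
from typing import List, Tuple


def select_models(quality: List[float], cost: List[int], budget: int) -> Tuple[float, List[int]]:
    """Pick a subset of models maximizing total quality with total cost <= budget.

    Hypothetical helper: assumes quality is additive across models and that
    costs are non-negative integers, so a textbook dynamic program applies.
    """
    n = len(quality)
    # best[i][b] = highest total quality using only the first i models with cost at most b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        q, c = quality[i - 1], cost[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip model i-1
            if c <= b and best[i - 1][b - c] + q > best[i][b]:
                best[i][b] = best[i - 1][b - c] + q  # option 2: include model i-1

    # Walk the table backwards to recover which models were included.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= cost[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    # Made-up scores and costs for three candidate models.
    total, picked = select_models([0.6, 0.5, 0.4], [5, 3, 2], budget=5)
    print(picked, round(total, 3))
```

With these made-up numbers, the two cheaper models together beat the single expensive one under the same budget. Sweeping `budget` over a range of values would trace out a quality-cost tradeoff curve, which is the usual way an $\epsilon$-constrained formulation is explored.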
Related papers
- Robust personalized pricing under uncertainty of purchase probabilities [2.9061423802698565]
We propose a robust optimization model for personalized pricing that accounts for the uncertainty of predicted purchase probabilities.
We also develop a Lagrangian decomposition algorithm combined with line search to efficiently find high-quality solutions for large-scale optimization problems.
arXiv Detail & Related papers (2024-07-22T02:36:19Z) - Evolve Cost-aware Acquisition Functions Using Large Language Models [11.209139558885035]
This paper introduces EvolCAF, a novel framework that integrates large language models (LLMs) with evolutionary computation (EC) to automatically design cost-aware AFs.
The designed cost-aware AF maximizes the utilization of available information from historical data, surrogate models and budget details.
In comparison to the well-known EIpu and EI-cool methods designed by human experts, our approach showcases remarkable efficiency and generalization across various tasks.
arXiv Detail & Related papers (2024-04-25T12:19:18Z) - Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing [53.748685766139715]
Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size.
We propose a hybrid inference approach which combines their respective strengths to save cost and maintain quality.
In experiments our approach allows us to make up to 40% fewer calls to the large model, with no drop in response quality.
arXiv Detail & Related papers (2024-04-22T23:06:42Z) - Approaching Collateral Optimization for NISQ and Quantum-Inspired
Computing [0.0]
Collateral optimization refers to the systematic allocation of financial assets to satisfy obligations or secure transactions.
One of the common objectives is to minimise the cost of collateral required to mitigate the risk associated with a particular transaction or a portfolio of transactions.
arXiv Detail & Related papers (2023-05-25T18:01:04Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - A Unifying Framework for Online Optimization with Long-Term Constraints [62.35194099438855]
We study online learning problems in which a decision maker has to take a sequence of decisions subject to $m$ long-term constraints.
The goal is to maximize their total reward, while at the same time achieving small cumulative violation across the $T$ rounds.
We present the first best-of-both-worlds type algorithm for this general class of problems, with no-regret guarantees both in the case in which rewards and constraints are selected according to an unknown model, and in the case in which they are selected at each round by an adversary.
arXiv Detail & Related papers (2022-09-15T16:59:19Z) - Algorithm for Constrained Markov Decision Process with Linear
Convergence [55.41644538483948]
An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs.
A new dual approach is proposed with the integration of two ingredients: entropy regularized policy and Vaidya's dual.
The proposed approach is shown to converge (with linear rate) to the global optimum.
arXiv Detail & Related papers (2022-06-03T16:26:38Z) - Online Learning with Knapsacks: the Best of Both Worlds [54.28273783164608]
We study online learning problems in which a decision maker wants to maximize their expected reward without violating a finite set of $m$ resource constraints.
Our framework allows the decision maker to handle non-convex reward and cost functions.
arXiv Detail & Related papers (2022-02-28T12:10:48Z) - Automatically Learning Compact Quality-aware Surrogates for Optimization
Problems [55.94450542785096]
Solving optimization problems with unknown parameters requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values.
Recent work has shown that including the optimization problem as a layer in the model training pipeline results in predictions of the unobserved parameters that lead to higher decision quality.
We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z)