A Declarative System for Optimizing AI Workloads
- URL: http://arxiv.org/abs/2405.14696v2
- Date: Wed, 29 May 2024 15:27:07 GMT
- Title: A Declarative System for Optimizing AI Workloads
- Authors: Chunwei Liu, Matthew Russo, Michael Cafarella, Lei Cao, Peter Baille Chen, Zui Chen, Michael Franklin, Tim Kraska, Samuel Madden, Gerardo Vitagliano
- Abstract summary: Palimpzest is a system that enables anyone to process AI-powered analytical queries simply by defining them in a declarative language.
We describe the workload of AI-powered analytics tasks, the optimization methods that Palimpzest uses, and the prototype system itself.
We show that even our simple prototype offers a range of appealing plans, including one that is 3.3x faster and 2.9x cheaper than the baseline method.
- Score: 14.302404377396837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A long-standing goal of data management systems has been to build systems which can compute quantitative insights over large corpora of unstructured data in a cost-effective manner. Until recently, it was difficult and expensive to extract facts from company documents, data from scientific papers, or metrics from image and video corpora. Today's models can accomplish these tasks with high accuracy. However, a programmer who wants to answer a substantive AI-powered query must orchestrate large numbers of models, prompts, and data operations. For even a single query, the programmer has to make a vast number of decisions such as the choice of model, the right inference method, the most cost-effective inference hardware, the ideal prompt design, and so on. The optimal set of decisions can change as the query changes and as the rapidly-evolving technical landscape shifts. In this paper we present Palimpzest, a system that enables anyone to process AI-powered analytical queries simply by defining them in a declarative language. The system uses its cost optimization framework to implement the query plan with the best trade-offs between runtime, financial cost, and output data quality. We describe the workload of AI-powered analytics tasks, the optimization methods that Palimpzest uses, and the prototype system itself. We evaluate Palimpzest on tasks in Legal Discovery, Real Estate Search, and Medical Schema Matching. We show that even our simple prototype offers a range of appealing plans, including one that is 3.3x faster and 2.9x cheaper than the baseline method, while also offering better data quality. With parallelism enabled, Palimpzest can produce plans with up to a 90.3x speedup at 9.1x lower cost relative to a single-threaded GPT-4 baseline, while obtaining an F1-score within 83.5% of the baseline. These require no additional work by the user.
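The abstract describes choosing the plan with the best trade-offs between runtime, financial cost, and output quality. The sketch below illustrates one generic way such a choice could be made: keep only the Pareto-optimal plans, then pick the cheapest one that meets a quality floor. It is not Palimpzest's API or its actual optimizer; the `Plan` class, the plan names, and every number are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Plan:
    """A hypothetical candidate plan with estimated metrics (all values invented)."""
    name: str
    runtime_s: float  # estimated runtime in seconds, lower is better
    cost_usd: float   # estimated financial cost, lower is better
    quality: float    # estimated output quality (e.g. F1), higher is better


def dominates(a: Plan, b: Plan) -> bool:
    """True if `a` is no worse than `b` on every metric and strictly better on one."""
    no_worse = (a.runtime_s <= b.runtime_s
                and a.cost_usd <= b.cost_usd
                and a.quality >= b.quality)
    strictly_better = (a.runtime_s < b.runtime_s
                       or a.cost_usd < b.cost_usd
                       or a.quality > b.quality)
    return no_worse and strictly_better


def pareto_frontier(plans: List[Plan]) -> List[Plan]:
    """Keep the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


def cheapest_meeting_quality(plans: List[Plan], min_quality: float) -> Optional[Plan]:
    """Among Pareto-optimal plans, return the cheapest one with quality >= min_quality."""
    candidates = [p for p in pareto_frontier(plans) if p.quality >= min_quality]
    return min(candidates, key=lambda p: p.cost_usd, default=None)


# Hypothetical plan estimates; none of these figures come from the paper.
PLANS = [
    Plan("large-model-only", runtime_s=100.0, cost_usd=10.0, quality=0.90),
    Plan("small-model-only", runtime_s=20.0, cost_usd=1.0, quality=0.70),
    Plan("small-filter-then-large", runtime_s=40.0, cost_usd=4.0, quality=0.88),
    Plan("large-model-verbose-prompt", runtime_s=120.0, cost_usd=12.0, quality=0.89),
]

if __name__ == "__main__":
    for plan in pareto_frontier(PLANS):
        print(plan)
    print(cheapest_meeting_quality(PLANS, min_quality=0.85))
```

In this toy data the verbose-prompt plan is dominated by the large-model-only plan, so it drops off the frontier; the user's quality floor then decides which of the remaining plans is selected.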
Related papers
- Benchmarking LLMs for Optimization Modeling and Enhancing Reasoning via Reverse Socratic Synthesis [60.23133327001978]
Large language models (LLMs) have exhibited their problem-solving ability in mathematical reasoning.
We propose E-OPT, a benchmark for end-to-end optimization problem-solving with human-readable inputs and outputs.
arXiv Detail & Related papers (2024-07-13T13:27:57Z)
- Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning [50.332027356848094]
AI-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control.
The mapping between context and AI model parameters is ideally done in a zero-shot fashion.
This paper introduces a general methodology for the online optimization of automatic model selection (AMS) mappings.
arXiv Detail & Related papers (2024-06-22T11:17:50Z)
- Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing.
LLMs are extremely computationally expensive, even at inference time.
We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
- Bayesian Optimization Over Iterative Learners with Structured Responses: A Budget-aware Planning Approach [31.918476422203412]
This paper proposes a novel approach, Budget-Aware Planning for Iterative learners (BAPI), to solve hyperparameter optimization (HPO) problems under a constrained cost budget.
Experiments on diverse HPO benchmarks for iterative learners show that BAPI performs better than state-of-the-art baselines in most cases.
arXiv Detail & Related papers (2022-06-25T18:44:06Z)
- Uncertainty-Aware Search Framework for Multi-Objective Bayesian Optimization [40.40632890861706]
We consider the problem of multi-objective (MO) blackbox optimization using expensive function evaluations.
We propose a novel uncertainty-aware search framework referred to as USeMO to efficiently select the sequence of inputs for evaluation.
arXiv Detail & Related papers (2022-04-12T16:50:48Z)
- $\{\text{PF}\}^2\text{ES}$: Parallel Feasible Pareto Frontier Entropy Search for Multi-Objective Bayesian Optimization Under Unknown Constraints [4.672142224503371]
We present a novel information-theoretic acquisition function for multi-objective Bayesian optimization.
$\{\text{PF}\}^2\text{ES}$ provides a low-cost and accurate estimate of the mutual information for the parallel setting.
We benchmark $\{\text{PF}\}^2\text{ES}$ across synthetic and real-life problems.
arXiv Detail & Related papers (2022-04-11T21:06:23Z)
- IMO$^3$: Interactive Multi-Objective Off-Policy Optimization [45.2918894257473]
A system designer needs to find a policy that trades off objectives to reach a desired operating point.
We propose interactive multi-objective off-policy optimization (IMO$^3$).
We show that IMO$^3$ identifies a near-optimal policy with high probability.
arXiv Detail & Related papers (2022-01-24T16:51:41Z)
- An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z)
- Conservative Objective Models for Effective Offline Model-Based Optimization [78.19085445065845]
Computational design problems arise in a number of settings, from synthetic biology to computer architectures.
We propose a method that learns a model of the objective function that lower bounds the actual value of the ground-truth objective on out-of-distribution inputs.
COMs are simple to implement and outperform a number of existing methods on a wide range of MBO problems.
arXiv Detail & Related papers (2021-07-14T17:55:28Z)
- Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points.
This problem setting emerges in many domains where function evaluation is a complex and expensive process.
We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z)
- A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration [17.75042918159419]
A cost-based algorithm is adopted in almost all current database systems.
In the cost model, cardinality, the number of tuples passing through an operator, plays a crucial role.
Due to inaccurate cardinality estimation, errors in the cost model, and the huge plan space, the algorithm cannot find the optimal execution plan for a complex query in a reasonable time.
arXiv Detail & Related papers (2021-01-05T13:47:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.