Related papers: FrugalML: How to Use ML Prediction APIs More Accurately and Cheaply

URL: http://arxiv.org/abs/2006.07512v1
Date: Fri, 12 Jun 2020 23:43:23 GMT
Title: FrugalML: How to Use ML Prediction APIs More Accurately and Cheaply
Authors: Lingjiao Chen, Matei Zaharia, James Zou
Abstract summary: We propose FrugalML, a principled framework that jointly learns the strength and weakness of each API on different data. Our theoretical analysis shows that natural sparsity in the formulation can be leveraged to make FrugalML efficient. Across various tasks, FrugalML can achieve up to 90% cost reduction while matching the accuracy of the best single API, or up to 5% better accuracy while matching the best API's cost.
Score: 36.94826820536239
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Prediction APIs offered for a fee are a fast-growing industry and an important part of machine learning as a service. While many such services are available, the heterogeneity in their price and performance makes it challenging for users to decide which API or combination of APIs to use for their own data and budget. We take a first step towards addressing this challenge by proposing FrugalML, a principled framework that jointly learns the strength and weakness of each API on different data, and performs an efficient optimization to automatically identify the best sequential strategy to adaptively use the available APIs within a budget constraint. Our theoretical analysis shows that natural sparsity in the formulation can be leveraged to make FrugalML efficient. We conduct systematic experiments using ML APIs from Google, Microsoft, Amazon, IBM, Baidu and other providers for tasks including facial emotion recognition, sentiment analysis and speech recognition. Across various tasks, FrugalML can achieve up to 90% cost reduction while matching the accuracy of the best single API, or up to 5% better accuracy while matching the best API's cost.
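
The sequential strategy described in the abstract can be illustrated with a much-simplified sketch: call a cheap API first, and pay for a more expensive one only when the cheap prediction looks unreliable. This is not the authors' implementation. The `Api` type, the fixed confidence threshold, the prices, and the two toy predictors below are all assumptions made for illustration, whereas the paper learns its strategy from data under a budget constraint.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interface: an API maps an input to (label, confidence in [0, 1]).
PredictFn = Callable[[str], Tuple[str, float]]


@dataclass
class Api:
    name: str
    cost: float  # price per call, in arbitrary illustrative units
    predict: PredictFn


def cascade_predict(x: str, base: Api, add_on: Api, threshold: float) -> Tuple[str, float]:
    """Call the base API; escalate to the add-on API only if confidence is low.

    Returns the final label and the total amount spent on this input.
    """
    label, confidence = base.predict(x)
    spent = base.cost
    if confidence < threshold:
        # The base prediction is not trusted, so pay for the second API.
        label, _ = add_on.predict(x)
        spent += add_on.cost
    return label, spent


# Toy stand-ins for real prediction services (assumptions, not real APIs).
def cheap_sentiment(x: str) -> Tuple[str, float]:
    return ("positive", 0.9) if "great" in x else ("negative", 0.55)


def pricey_sentiment(x: str) -> Tuple[str, float]:
    positive = "great" in x or "not bad" in x
    return ("positive", 0.99) if positive else ("negative", 0.97)


cheap = Api("cheap", 1.0, cheap_sentiment)
pricey = Api("pricey", 10.0, pricey_sentiment)

easy = cascade_predict("great movie", cheap, pricey, threshold=0.8)
hard = cascade_predict("not bad at all", cheap, pricey, threshold=0.8)
```

In this toy run the first input is answered by the cheap API alone, while the second is escalated, so the expensive call is paid for only where it changes the answer. A fixed threshold is the simplest possible choice here; it stands in for the learned, data-dependent decision the abstract describes.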

Related papers

Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS [52.483888557864326]
APIKG4SYN is a framework designed to exploit API knowledge graphs for the construction of API-oriented question-code pairs. We build the first benchmark for HarmonyOS code generation using APIKG4SYN.
arXiv Detail & Related papers (2025-11-29T08:13:54Z)
APIRAT: Integrating Multi-source API Knowledge for Enhanced Code Translation with LLMs [6.522570957351905]
APIRAT is a novel code translation method that integrates multi-source API knowledge. APIRAT employs three API knowledge augmentation techniques, including API sequence retrieval, API sequence back-translation, and API mapping. Experiments indicate that APIRAT significantly surpasses existing LLM-based methods, achieving improvements in computational accuracy ranging from 4% to 15.1%.
arXiv Detail & Related papers (2025-04-21T04:24:49Z)
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [60.881609323604685]
Large Language Models (LLMs) accessed via black-box APIs introduce a trust challenge. Users pay for services based on advertised model capabilities, but providers may covertly substitute the specified model with a cheaper, lower-quality alternative to reduce operational costs. This lack of transparency undermines fairness, erodes trust, and complicates reliable benchmarking.
arXiv Detail & Related papers (2025-04-07T03:57:41Z)
FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability. Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents [7.166156709980112]
We introduce ShortcutsBench, a large-scale benchmark for the comprehensive evaluation of API-based agents. ShortcutsBench includes a wealth of real APIs from Apple Inc.'s operating systems. Our evaluation reveals significant limitations in handling complex queries related to API selection, parameter filling, and requesting necessary information from systems and users.
arXiv Detail & Related papers (2024-06-28T08:45:02Z)
A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking. It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence. Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
Adaptive REST API Testing with Reinforcement Learning [54.68542517176757]
Current testing tools lack efficient exploration mechanisms, treating all operations and parameters equally. They also struggle when response schemas are absent from the specification or exhibit variants. We present an adaptive REST API testing technique that incorporates reinforcement learning to prioritize operations during exploration.
arXiv Detail & Related papers (2023-09-08T20:27:05Z)
OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs. Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing. LLMs are extremely computationally expensive, even at inference time. We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions [35.48276161473216]
We present HAPI, a longitudinal dataset of 1,761,417 instances of commercial ML API applications. Each instance consists of a query input for an API along with the API's output prediction/annotation and confidence scores.
arXiv Detail & Related papers (2022-09-18T01:52:16Z)
Cost Effective MLaaS Federation: A Combinatorial Reinforcement Learning Approach [9.50492686145041]
Federating different MLaaSes together allows us to improve the analytic performance further. However, naively aggregating results from different MLaaSes not only incurs significant monetary cost but may also lead to sub-optimal performance gain. We propose a framework, Armol, to unify the right selection of ML providers to achieve the best possible analytic performance.
arXiv Detail & Related papers (2022-04-29T09:44:04Z)
Improving the Learnability of Machine Learning APIs by Semi-Automated API Wrapping [0.0]
We address the challenge of creating APIs that are easy to learn and use, especially by novices. We investigate this problem for scikit-learn, a widely used ML API. We identify unused and apparently useless parts of the API that can be eliminated without affecting client programs.
arXiv Detail & Related papers (2022-03-29T12:42:05Z)
FrugalMCT: Efficient Online ML API Selection for Multi-Label Classification Tasks [27.35907550712252]
Multi-label classification tasks such as OCR are a major focus of the growing machine learning as a service industry. We propose FrugalMCT, a principled framework that adaptively selects the APIs to use for different data in an online fashion while respecting the user's budget. We conduct systematic experiments using ML APIs from Google, Microsoft, Amazon, IBM, Tencent and other providers for tasks including multi-label image classification, scene text recognition and named entity recognition.
arXiv Detail & Related papers (2021-02-18T02:59:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.