FrugalMCT: Efficient Online ML API Selection for Multi-Label
Classification Tasks
- URL: http://arxiv.org/abs/2102.09127v1
- Date: Thu, 18 Feb 2021 02:59:58 GMT
- Title: FrugalMCT: Efficient Online ML API Selection for Multi-Label
Classification Tasks
- Authors: Lingjiao Chen and Matei Zaharia and James Zou
- Abstract summary: Multi-label classification tasks such as OCR are a major focus of the growing machine learning as a service industry.
We propose FrugalMCT, a principled framework that adaptively selects the APIs to use for different data in an online fashion while respecting the user's budget.
We conduct systematic experiments using ML APIs from Google, Microsoft, Amazon, IBM, Tencent and other providers for tasks including multi-label image classification, scene text recognition and named entity recognition.
- Score: 27.35907550712252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-label classification tasks such as OCR and multi-object recognition are
a major focus of the growing machine learning as a service industry. While many
multi-label prediction APIs are available, it is challenging for users to
decide which API to use for their own data and budget, due to the heterogeneity
in those APIs' price and performance. Recent work shows how to select from
single-label prediction APIs. However, the computational complexity of the
previous approach is exponential in the number of labels and hence is not
suitable for settings like OCR. In this work, we propose FrugalMCT, a
principled framework that adaptively selects the APIs to use for different data
in an online fashion while respecting the user's budget. The API selection problem
is cast as an integer linear program, which we show has a special structure
that we leverage to develop an efficient online API selector with strong
performance guarantees. We conduct systematic experiments using ML APIs from
Google, Microsoft, Amazon, IBM, Tencent and other providers for tasks including
multi-label image classification, scene text recognition and named entity
recognition. Across diverse tasks, FrugalMCT can achieve over 90% cost
reduction while matching the accuracy of the best single API, or up to 8%
better accuracy while matching the best API's cost.
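To make the budget-constrained selection problem concrete, here is a minimal sketch. It is not the FrugalMCT algorithm from the paper: it assumes a per-item table of predicted accuracies is already available, works on a whole batch rather than online, and swaps in a simple Lagrangian threshold search for the paper's integer-linear-program solver. The function name `select_apis` and all numbers are hypothetical.

```python
from typing import List, Sequence


def select_apis(pred_acc: Sequence[Sequence[float]],
                costs: Sequence[float],
                budget: float,
                iters: int = 60) -> List[int]:
    """Pick one API per item so that the average cost stays within budget.

    pred_acc[i][k] is an assumed predicted accuracy of API k on item i.
    costs[k] is the per-call price of API k.
    Heuristic: each item takes argmax_k (pred_acc[i][k] - lam * costs[k]),
    and lam is bisected to a small value that keeps the batch within budget.
    """
    if budget < min(costs):
        raise ValueError("budget is below the cheapest API's price")

    def choose(lam: float) -> List[int]:
        # Ties go to the lowest API index.
        return [max(range(len(costs)), key=lambda k: row[k] - lam * costs[k])
                for row in pred_acc]

    def avg_cost(choice: List[int]) -> float:
        return sum(costs[k] for k in choice) / len(choice)

    # With no cost penalty every item takes its most accurate API.
    if avg_cost(choose(0.0)) <= budget:
        return choose(0.0)

    # Grow the penalty until the selection is affordable.
    lo, hi = 0.0, 1.0
    for _ in range(200):
        if avg_cost(choose(hi)) <= budget:
            break
        hi *= 2.0

    # Bisect: lo is always over budget, hi is always within budget.
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if avg_cost(choose(mid)) <= budget:
            hi = mid
        else:
            lo = mid
    return choose(hi)


if __name__ == "__main__":
    acc = [[0.90, 0.95], [0.50, 0.90], [0.80, 0.80]]  # made-up predictions
    price = [1.0, 10.0]                               # made-up prices
    print(select_apis(acc, price, budget=4.0))        # prints [0, 1, 0]
```

In this toy batch the expensive API is bought only for the second item, where it is predicted to help most; the other two items fall back to the cheap API so the average price stays at the budget of 4.0.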
Related papers
- A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development.
Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
- APICom: Automatic API Completion via Prompt Learning and Adversarial Training-based Data Augmentation [6.029137544885093]
API recommendation is the process of assisting developers in finding the required API among numerous candidate APIs.
Previous studies mainly modeled API recommendation as a recommendation task, and developers may still be unable to find what they need.
Motivated by neural machine translation research, we model this problem as a generation task.
We propose a novel approach, APICom, based on prompt learning, which can generate APIs related to the query according to the prompts.
arXiv Detail & Related papers (2023-09-13T15:31:50Z)
- Evaluating Embedding APIs for Information Retrieval [51.24236853841468]
We evaluate the capabilities of existing semantic embedding APIs on domain generalization and multilingual retrieval.
We find that re-ranking BM25 results using the APIs is a budget-friendly approach and is most effective in English.
For non-English retrieval, re-ranking still improves the results, but a hybrid model with BM25 works best, albeit at a higher cost.
arXiv Detail & Related papers (2023-05-10T16:40:52Z)
- HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions [35.48276161473216]
We present HAPI, a longitudinal dataset of 1,761,417 instances of commercial ML API applications.
Each instance consists of a query input for an API along with the API's output prediction/annotation and confidence scores.
arXiv Detail & Related papers (2022-09-18T01:52:16Z)
- On the Effectiveness of Pretrained Models for API Learning [8.788509467038743]
Developers frequently use APIs to implement certain functionalities, such as parsing Excel Files, reading and writing text files line by line, etc.
Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner.
Existing approaches utilize information retrieval models to search for matching API sequences given a query or use RNN-based encoder-decoder to generate API sequences.
arXiv Detail & Related papers (2022-04-05T20:33:24Z)
- Embedding Code Contexts for Cryptographic API Suggestion: New Methodologies and Comparisons [9.011910726620536]
We present a new neural network-based approach, Multi-HyLSTM for API recommendation.
We use program analysis to guide the API embedding and recommendation.
In an analysis of 245 test cases, compared with the commercial tool Codota, we achieve a top-1 recommendation accuracy of 88.98%.
arXiv Detail & Related papers (2021-03-15T22:27:57Z)
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data.
Meta-learning helps in adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
arXiv Detail & Related papers (2020-10-07T22:29:05Z)
- FrugalML: How to Use ML Prediction APIs More Accurately and Cheaply [36.94826820536239]
We propose FrugalML, a principled framework that jointly learns the strength and weakness of each API on different data.
Our theoretical analysis shows that natural sparsity in the formulation can be leveraged to make FrugalML efficient.
Across various tasks, FrugalML can achieve up to 90% cost reduction while matching the accuracy of the best single API, or up to 5% better accuracy while matching the best API's cost.
arXiv Detail & Related papers (2020-06-12T23:43:23Z)
- Learning Attentive Pairwise Interaction for Fine-Grained Classification [53.66543841939087]
We propose a simple but effective Attentive Pairwise Interaction Network (API-Net) for fine-grained classification.
API-Net first learns a mutual feature vector to capture semantic differences in the input pair.
It then compares this mutual vector with individual vectors to generate gates for each input image.
We conduct extensive experiments on five popular benchmarks in fine-grained classification.
arXiv Detail & Related papers (2020-02-24T12:17:56Z)