Automatic and Efficient Customization of Neural Networks for ML Applications
- URL: http://arxiv.org/abs/2310.04685v1
- Date: Sat, 7 Oct 2023 04:13:29 GMT
- Title: Automatic and Efficient Customization of Neural Networks for ML Applications
- Authors: Yuhan Liu, Chengcheng Wan, Kuntai Du, Henry Hoffmann, Junchen Jiang, Shan Lu, Michael Maire
- Abstract summary: We propose ChameleonAPI, which takes effect without changing the application source code.
ChameleonAPI uses the loss function to efficiently train a neural network model customized for each application.
Compared to a baseline that selects the best-of-all commercial ML API, we show that ChameleonAPI reduces incorrect application decisions by 43%.
- Score: 29.391143085794184
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: ML APIs have greatly relieved application developers of the burden to design
and train their own neural network models -- classifying objects in an image
can now be as simple as one line of Python code to call an API. However, these
APIs offer the same pre-trained models regardless of how their output is used
by different applications. This can be suboptimal as not all ML inference
errors can cause application failures, and the distinction between inference
errors that can or cannot cause failures varies greatly across applications.
To tackle this problem, we first study 77 real-world applications, which
collectively use six ML APIs from two providers, to reveal common patterns of
how ML API output affects applications' decision processes. Inspired by the
findings, we propose ChameleonAPI, an optimization framework for ML APIs, which
takes effect without changing the application source code. ChameleonAPI
provides application developers with a parser that automatically analyzes the
application to produce an abstract of its decision process, which is then used
to devise an application-specific loss function that only penalizes API output
errors critical to the application. ChameleonAPI uses the loss function to
efficiently train a neural network model customized for each application and
deploys it to serve API invocations from the respective application via the
existing interface. Compared to a baseline that selects the best-of-all
commercial ML API, we show that ChameleonAPI reduces incorrect application
decisions by 43%.
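The core idea described in the abstract is a loss that penalizes only API output errors that change the application's decision. The following is a minimal sketch of that idea, assuming a hypothetical application that groups class labels into decision branches; the grouping, function names, and 0/1 loss are illustrative, not the paper's actual implementation.

```python
# Hedged sketch: an application-specific loss in the spirit of ChameleonAPI.
# The application only branches on which decision group a label falls into,
# so misclassifications *within* a group carry no penalty.

def decision(label, groups):
    """Map a class label to the application's decision branch."""
    for branch, members in groups.items():
        if label in members:
            return branch
    return "default"

def application_loss(predicted, true, groups):
    """0/1 loss on the application's decision, not on the raw label."""
    return 0.0 if decision(predicted, groups) == decision(true, groups) else 1.0

# Example: a recycling app only distinguishes recyclable vs. trash.
groups = {"recyclable": {"bottle", "can", "paper"},
          "trash": {"banana", "wrapper"}}
# Misreading a bottle as a can leaves the decision unchanged -> no penalty.
print(application_loss("can", "bottle", groups))      # 0.0
# Misreading a bottle as a wrapper flips the decision -> penalized.
print(application_loss("wrapper", "bottle", groups))  # 1.0
```

In a real training loop this would be a differentiable surrogate rather than a 0/1 loss, but the sketch shows why within-group errors need not be penalized.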
Related papers
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
- WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment [49.00213183302225]
We propose a framework to induce new APIs by grounding wikiHow instruction to situated agent policies.
Inspired by recent successes in large language models (LLMs) for embodied planning, we propose a few-shot prompting to steer GPT-4.
arXiv Detail & Related papers (2024-07-10T15:52:44Z)
- A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking.
It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence.
Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
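The summary above describes a "solution" as a pre-constructed API calling sequence that is executed as code. A toy sketch of that pattern, with invented placeholder API names (not SoAy's actual interface):

```python
# Hedged sketch: a "solution" as a fixed sequence of API calls,
# executed as code. API names below are illustrative placeholders.
def run_solution(author_name, apis):
    # Solution: search author -> fetch papers (a pre-constructed sequence).
    author_id = apis["search_author"](author_name)
    return apis["get_papers"](author_id)

# Toy stand-ins for academic-search APIs.
apis = {
    "search_author": lambda name: f"id:{name.lower()}",
    "get_papers": lambda aid: [f"paper_by_{aid}"],
}
print(run_solution("Turing", apis))  # ['paper_by_id:turing']
```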
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
- Octopus: On-device language model for function calling of software APIs [9.78611123915888]
Large Language Models (LLMs) play a crucial role due to their advanced text processing and generation abilities.
This study introduces a new strategy aimed at harnessing on-device LLMs in invoking software APIs.
arXiv Detail & Related papers (2024-04-02T01:29:28Z)
- Continual Learning From a Stream of APIs [90.41825351073908]
Continual learning (CL) aims to learn new tasks without forgetting previous tasks.
Existing CL methods require a large amount of raw data, which is often unavailable due to copyright considerations and privacy risks.
This paper considers two practical-yet-novel CL settings: data-efficient CL (DECL-APIs) and data-free CL (DFCL-APIs).
arXiv Detail & Related papers (2023-08-31T11:16:00Z)
- Mondrian: Prompt Abstraction Attack Against Large Language Models for Cheaper API Pricing [19.76564349397695]
We propose Mondrian, a simple and straightforward method that abstracts sentences, which can lower the cost of using LLM APIs.
Our results show that Mondrian successfully reduces user queries' token length ranging from 13% to 23% across various tasks.
As a result, the prompt abstraction attack enables the adversary to profit without bearing the cost of API development and deployment.
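The mechanism described above, shortening a query before sending it to a metered LLM API, can be illustrated with a toy abstraction rule. The filler-word filtering below is a stand-in for illustration only, not Mondrian's actual method:

```python
# Hedged sketch: shorten a prompt and measure the length reduction.
# The filler-word rule is an assumed, illustrative abstraction step.
FILLER = {"please", "kindly", "very", "really", "just", "could", "you"}

def abstract_prompt(text):
    words = text.split()
    return " ".join(w for w in words if w.lower().strip(",.?") not in FILLER)

original = "Could you please summarize this very long article for me?"
shorter = abstract_prompt(original)
reduction = 1 - len(shorter.split()) / len(original.split())
print(shorter)    # summarize this long article for me?
print(reduction)  # 0.4
```

Mondrian reports its savings in token length rather than word count, but the measurement idea is the same.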
arXiv Detail & Related papers (2023-08-07T13:10:35Z)
- Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models [187.58051653991686]
Large language models (LLMs) have achieved remarkable progress in solving various natural language processing tasks.
However, they have inherent limitations as they are incapable of accessing up-to-date information.
We present Chameleon, an AI system that augments LLMs with plug-and-play modules for compositional reasoning.
arXiv Detail & Related papers (2023-04-19T17:47:47Z)
- HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions [35.48276161473216]
We present HAPI, a longitudinal dataset of 1,761,417 instances of commercial ML API applications.
Each instance consists of a query input for an API along with the API's output prediction/annotation and confidence scores.
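Based only on the abstract's description (a query input, the API's output prediction/annotation, and a confidence score), one HAPI-style instance might look like the following. The field names are assumptions, not HAPI's actual schema:

```python
# Hedged sketch of one record, with assumed field names.
from dataclasses import dataclass

@dataclass
class APIRecord:
    query_input: str   # e.g. an image ID or text snippet sent to the API
    prediction: str    # the API's returned label/annotation
    confidence: float  # the API's confidence score for that prediction

record = APIRecord(query_input="img_00042.jpg",
                   prediction="cat", confidence=0.87)
```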
arXiv Detail & Related papers (2022-09-18T01:52:16Z)
- Petals: Collaborative Inference and Fine-tuning of Large Models [78.37798144357977]
Many NLP tasks benefit from using large language models (LLMs) that often have more than 100 billion parameters.
With the release of BLOOM-176B and OPT-175B, everyone can download pretrained models of this scale.
We propose Petals $-$ a system for inference and fine-tuning of large models collaboratively by joining the resources of multiple parties.
arXiv Detail & Related papers (2022-09-02T17:38:03Z)
- Improving the Learnability of Machine Learning APIs by Semi-Automated API Wrapping [0.0]
We address the challenge of creating APIs that are easy to learn and use, especially by novices.
We investigate this problem for skl, a widely used ML API.
We identify unused and apparently useless parts of the API that can be eliminated without affecting client programs.
arXiv Detail & Related papers (2022-03-29T12:42:05Z)
- Did the Model Change? Efficiently Assessing Machine Learning API Shifts [24.342984907651505]
Machine learning (ML) prediction APIs are increasingly widely used.
They can change over time due to model updates or retraining.
It is often not clear to the user if and how the ML model has changed.
arXiv Detail & Related papers (2021-07-29T17:41:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.