Desiderata for next generation of ML model serving
- URL: http://arxiv.org/abs/2210.14665v1
- Date: Wed, 26 Oct 2022 12:29:25 GMT
- Title: Desiderata for next generation of ML model serving
- Authors: Sherif Akoush, Andrei Paleyes, Arnaud Van Looveren and Clive Cox
- Abstract summary: This paper puts forth a range of important qualities that the next generation of inference platforms should aim for.
An overarching design pattern is data-centricity, which enables smarter monitoring in ML system operation.
- Score: 0.34410212782758054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inference is a significant part of ML software infrastructure. Despite the
variety of inference frameworks available, the field as a whole can be
considered in its early days. This paper puts forth a range of important
qualities that the next generation of inference platforms should aim for. We
present our rationale for the importance of each quality, and discuss ways to
achieve it in practice. An overarching design pattern is data-centricity, which
enables smarter monitoring in ML system operation.
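To make the data-centric pattern concrete, the sketch below shows an inference wrapper that records every request/response payload and runs a simple statistical drift check over the logged inputs, so that monitoring operates on the data itself rather than only on infrastructure metrics such as latency or error rates. This is a minimal illustration under stated assumptions, not the authors' serving platform: the class and method names are hypothetical, the model is assumed to be any object exposing a NumPy-based predict, and the drift check is a generic per-feature two-sample Kolmogorov-Smirnov test.

```python
# Minimal sketch of a data-centric serving wrapper (illustrative only).
import numpy as np
from scipy.stats import ks_2samp


class DataCentricServer:
    """Wraps a model so that request/response payloads are captured for monitoring."""

    def __init__(self, model, x_ref: np.ndarray, p_val: float = 0.05):
        self.model = model        # any object exposing .predict(np.ndarray) -> np.ndarray
        self.x_ref = x_ref        # reference data, e.g. a sample of the training set
        self.p_val = p_val        # significance level for drift alerts
        self.payload_log = []     # captured request/response payloads

    def predict(self, x: np.ndarray) -> np.ndarray:
        y = self.model.predict(x)
        # Data-centricity: persist the actual payloads, not just latency/error counters.
        self.payload_log.append({"inputs": x, "outputs": y})
        return y

    def check_drift(self) -> bool:
        """Compare logged inputs against the reference data, feature by feature."""
        if not self.payload_log:
            return False
        x_live = np.concatenate([p["inputs"] for p in self.payload_log])
        n_features = self.x_ref.shape[1]
        p_values = [
            ks_2samp(self.x_ref[:, j], x_live[:, j]).pvalue for j in range(n_features)
        ]
        # Bonferroni-corrected per-feature two-sample KS test.
        return min(p_values) < self.p_val / n_features
```

In a production deployment the payload log would typically be an external stream or store and the drift detector a separate component consuming it, but the pattern is the same: predictions and the data that produced them are captured together, which is what enables the smarter monitoring the abstract refers to.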
Related papers
- Matchmaker: Self-Improving Large Language Model Programs for Schema Matching [60.23571456538149]
We propose a compositional language model program for schema matching, comprising candidate generation, refinement and confidence scoring.
Matchmaker self-improves in a zero-shot manner without the need for labeled demonstrations.
Empirically, we demonstrate on real-world medical schema matching benchmarks that Matchmaker outperforms previous ML-based approaches.
arXiv Detail & Related papers (2024-10-31T16:34:03Z)
- Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate [118.37653302885607]
We present the Modality Integration Rate (MIR), an effective, robust, and generalized metric to indicate the multi-modal pre-training quality of Large Vision Language Models (LVLMs).
MIR is informative for training data selection, training strategy scheduling, and model architecture design aimed at better pre-training results.
arXiv Detail & Related papers (2024-10-09T17:59:04Z)
- MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding [12.572321050617571]
Estimating the Most Important Person (MIP) in any social event setup is a challenging problem due to contextual complexity and scarcity of labeled data.
We aim to address the problem by annotating a large-scale 'in-the-wild' dataset for identifying human perceptions about MIP in an image.
The proposed dataset will play a vital role in building the next-generation social situation understanding methods.
arXiv Detail & Related papers (2024-09-10T05:28:38Z)
- A Survey on Efficient Inference for Large Language Models [25.572035747669275]
Large Language Models (LLMs) have attracted extensive attention due to their remarkable performance across various tasks.
The substantial computational and memory requirements of LLM inference pose challenges for deployment in resource-constrained scenarios.
This paper presents a comprehensive survey of the existing literature on efficient LLM inference.
arXiv Detail & Related papers (2024-04-22T15:53:08Z)
- A Large-Scale Evaluation of Speech Foundation Models [110.95827399522204]
We establish the Speech processing Universal PERformance Benchmark (SUPERB) to study the effectiveness of the foundation model paradigm for speech.
We propose a unified multi-tasking framework to address speech processing tasks in SUPERB using a frozen foundation model followed by task-specialized, lightweight prediction heads.
arXiv Detail & Related papers (2024-04-15T00:03:16Z)
- Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
- Data-centric Operational Design Domain Characterization for Machine Learning-based Aeronautical Products [4.8461049669050915]
We give the first rigorous characterization of Operational Design Domains (ODDs) for Machine Learning (ML)-based aeronautical products.
We propose the dimensions along which the parameters that define an ODD can be explicitly captured, together with a categorization of the data that ML-based applications can encounter in operation.
arXiv Detail & Related papers (2023-07-15T02:08:33Z)
- Exploring the potential of flow-based programming for machine learning deployment in comparison with service-oriented architectures [8.677012233188968]
We argue that part of the reason ML deployment remains difficult is infrastructure that was not designed for activities around data collection and analysis.
We propose to consider flow-based programming with data streams as an alternative to commonly used service-oriented architectures for building software applications.
arXiv Detail & Related papers (2021-08-09T15:06:02Z)
- The Benchmark Lottery [114.43978017484893]
"A benchmark lottery" describes the overall fragility of the machine learning benchmarking process.
We show that the relative performance of algorithms may be altered significantly simply by choosing different benchmark tasks.
arXiv Detail & Related papers (2021-07-14T21:08:30Z)
- Counterfactual Explanations for Machine Learning on Multivariate Time Series Data [0.9274371635733836]
This paper proposes a novel explainability technique that provides counterfactual explanations for supervised machine learning frameworks.
The proposed method outperforms state-of-the-art explainability methods on several different ML frameworks and data sets in metrics such as faithfulness and robustness.
arXiv Detail & Related papers (2020-08-25T02:04:59Z)
- A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions.
Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data.
Large-scale Machine Learning aims to learn patterns from big data efficiently, with comparable predictive performance.
arXiv Detail & Related papers (2020-08-10T06:07:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.