Amazon SageMaker Automatic Model Tuning: Scalable Black-box Optimization
- URL: http://arxiv.org/abs/2012.08489v1
- Date: Tue, 15 Dec 2020 18:34:34 GMT
- Title: Amazon SageMaker Automatic Model Tuning: Scalable Black-box Optimization
- Authors: Valerio Perrone, Huibin Shen, Aida Zolic, Iaroslav Shcherbatyi, Amr
Ahmed, Tanya Bansal, Michele Donini, Fela Winkelmolen, Rodolphe Jenatton,
Jean Baptiste Faddoul, Barbara Pogorzelska, Miroslav Miladinovic, Krishnaram
Kenthapadi, Matthias Seeger, Cédric Archambeau
- Abstract summary: Amazon SageMaker Automatic Model Tuning (AMT) is a fully managed system for black-box optimization at scale.
AMT finds the best version of a machine learning model by repeatedly training it with different hyperparameter configurations.
It can be used with built-in algorithms, custom algorithms, and Amazon SageMaker pre-built containers for machine learning frameworks.
- Score: 23.52446054521187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tuning complex machine learning systems is challenging. Machine learning
models typically expose a set of hyperparameters, be it regularization,
architecture, or optimization parameters, whose careful tuning is critical to
achieve good performance. To democratize access to such systems, it is
essential to automate this tuning process. This paper presents Amazon SageMaker
Automatic Model Tuning (AMT), a fully managed system for black-box optimization
at scale. AMT finds the best version of a machine learning model by repeatedly
training it with different hyperparameter configurations. It leverages either
random search or Bayesian optimization to choose the hyperparameter values
resulting in the best-performing model, as measured by the metric chosen by the
user. AMT can be used with built-in algorithms, custom algorithms, and Amazon
SageMaker pre-built containers for machine learning frameworks. We discuss the
core functionality, system architecture and our design principles. We also
describe some more advanced features provided by AMT, such as automated early
stopping and warm-starting, demonstrating their benefits in experiments.
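The workflow described in the abstract maps onto the SageMaker Python SDK's HyperparameterTuner. Below is a minimal sketch, assuming the built-in XGBoost algorithm and placeholder IAM role, S3 paths, and previous tuning job name; argument names follow SDK v2, but exact details may vary across versions.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.image_uris import retrieve
from sagemaker.tuner import (
    ContinuousParameter,
    HyperparameterTuner,
    IntegerParameter,
    WarmStartConfig,
    WarmStartTypes,
)

session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerRole"  # placeholder IAM role

# Built-in XGBoost container (the version string is illustrative).
image = retrieve("xgboost", region=session.boto_region_name, version="1.5-1")

estimator = Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/output",  # placeholder bucket
    sagemaker_session=session,
)
estimator.set_hyperparameters(objective="binary:logistic", num_round=200)

# Search space over the hyperparameters to be tuned.
ranges = {
    "eta": ContinuousParameter(0.01, 0.5, scaling_type="Logarithmic"),
    "max_depth": IntegerParameter(3, 10),
    "subsample": ContinuousParameter(0.5, 1.0),
}

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:auc",  # metric the tuner optimizes
    objective_type="Maximize",
    hyperparameter_ranges=ranges,
    strategy="Bayesian",            # "Random" selects random search instead
    max_jobs=20,
    max_parallel_jobs=4,
    early_stopping_type="Auto",     # automated early stopping
)
tuner.fit({"train": "s3://my-bucket/train", "validation": "s3://my-bucket/validation"})

# Warm start a follow-up tuning job from a previous one (name is a placeholder).
warm_tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:auc",
    hyperparameter_ranges=ranges,
    strategy="Bayesian",
    max_jobs=10,
    max_parallel_jobs=2,
    warm_start_config=WarmStartConfig(
        warm_start_type=WarmStartTypes.IDENTICAL_DATA_AND_ALGORITHM,
        parents={"previous-tuning-job-name"},
    ),
)
```

Setting strategy to "Random" switches from Bayesian optimization to random search, while early_stopping_type and warm_start_config correspond to the early-stopping and warm-starting features discussed in the paper.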
Related papers
- SigOpt Mulch: An Intelligent System for AutoML of Gradient Boosted Trees [3.6449336503217786]
Gradient boosted trees (GBTs) are ubiquitous models used by researchers, machine learning (ML) practitioners, and data scientists.
We present SigOpt Mulch, a model-aware hyperparameter tuning system specifically designed for automated tuning of GBTs.
arXiv Detail & Related papers (2023-07-10T18:40:25Z)
- Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose a graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
- Deep learning based Auto Tuning for Database Management System [0.12891210250935148]
In this work, we extend an automated technique based on OtterTune to reuse data gathered from previous tuning sessions when tuning new deployments, applying supervised and unsupervised machine learning methods to improve latency prediction.
We use GMM clustering to prune metrics and combine ensemble models, such as RandomForest, with non-linear models, like neural networks, for prediction modeling.
arXiv Detail & Related papers (2023-04-25T11:52:52Z)
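The OtterTune-style pipeline summarized in the entry above — pruning redundant metrics with GMM clustering, then combining an ensemble model with a non-linear model for latency prediction — can be illustrated with scikit-learn. This is a minimal sketch on synthetic data; the shapes, column semantics, and the simple prediction averaging are assumptions, not the paper's exact design.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for past tuning sessions: DBMS knob settings,
# collected runtime metrics, and the observed query latency.
n_runs, n_knobs, n_metrics = 200, 8, 40
knobs = rng.uniform(0.0, 1.0, size=(n_runs, n_knobs))
metrics = rng.normal(size=(n_runs, n_metrics))
latency = metrics[:, :3].sum(axis=1) + 2.0 * knobs[:, 0] + rng.normal(scale=0.1, size=n_runs)

# 1) Prune redundant metrics: cluster metric columns with a GMM and keep
#    one representative metric per cluster.
metric_columns = StandardScaler().fit_transform(metrics).T  # one row per metric
labels = GaussianMixture(n_components=5, random_state=0).fit_predict(metric_columns)
representatives = [int(np.where(labels == c)[0][0]) for c in np.unique(labels)]
pruned = metrics[:, representatives]

# 2) Predict latency from knobs plus pruned metrics, combining an ensemble
#    model (random forest) with a non-linear model (neural network).
X = np.hstack([knobs, pruned])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, latency)
nn = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0).fit(X, latency)

def predict_latency(x):
    # Simple average of the two predictors (an assumption, not the paper's scheme).
    return 0.5 * (rf.predict(x) + nn.predict(x))

print(predict_latency(X[:5]))
```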
- VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatile optimizers.
We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates.
We open source our learned optimizer, meta-training code, the associated train and test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z)
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
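The pre-training idea in the entry above — learning a tighter prior from data on similar functions before running Bayesian optimization — can be sketched with scikit-learn Gaussian processes: fit kernel hyperparameters on related tasks, then freeze them as the prior for the new task. The toy objectives, the pooled fitting, and the UCB acquisition are assumptions for illustration, not the paper's method.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

def related_task(x, shift):
    # Toy family of similar 1-D objectives (illustrative only).
    return np.sin(3.0 * (x - shift)) + 0.1 * rng.normal(size=x.shape)

# 1) Pre-training: fit GP kernel hyperparameters on observations pooled
#    from several similar functions.
X_list, y_list = [], []
for shift in (0.0, 0.3, 0.6):
    X_s = rng.uniform(0.0, 2.0 * np.pi, size=(40, 1))
    X_list.append(X_s)
    y_list.append(related_task(X_s[:, 0], shift))
X_prior, y_prior = np.vstack(X_list), np.concatenate(y_list)

kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
gp_prior = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_prior, y_prior)
learned_kernel = gp_prior.kernel_  # the pre-trained, tighter prior

# 2) Bayesian optimization on the new task with the pre-trained kernel
#    frozen (optimizer=None), using a simple UCB acquisition on a grid.
def new_task(x):
    return np.sin(3.0 * (x - 0.45))  # unknown target objective (illustrative)

candidates = np.linspace(0.0, 2.0 * np.pi, 200).reshape(-1, 1)
X_obs = rng.uniform(0.0, 2.0 * np.pi, size=(2, 1))
y_obs = new_task(X_obs[:, 0])

for _ in range(8):
    gp = GaussianProcessRegressor(kernel=learned_kernel, optimizer=None, normalize_y=True)
    gp.fit(X_obs, y_obs)
    mu, sigma = gp.predict(candidates, return_std=True)
    x_next = candidates[int(np.argmax(mu + 2.0 * sigma))]
    X_obs = np.vstack([X_obs, x_next.reshape(1, 1)])
    y_obs = np.append(y_obs, new_task(x_next[0]))

print("best observed value:", y_obs.max())
```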
- Re-parameterizing Your Optimizers rather than Architectures [119.08740698936633]
We propose a novel paradigm of incorporating model-specific prior knowledge into the optimizers and using them to train generic (simple) models.
As an implementation, we propose a novel methodology to add prior knowledge by modifying the gradients according to a set of model-specific hyper-parameters.
We focus on a VGG-style plain model and showcase that such a simple model trained with this re-parameterized optimizer, referred to as RepOpt-VGG, performs on par with the recent well-designed models.
arXiv Detail & Related papers (2022-05-30T16:55:59Z)
- Mining Robust Default Configurations for Resource-constrained AutoML [18.326426020906215]
We present a novel method of selecting performant configurations for a given task by performing offline autoML and mining over a diverse set of tasks.
We show that our approach is effective for warm-starting existing autoML platforms.
arXiv Detail & Related papers (2022-02-20T23:08:04Z)
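The mining idea in the entry above — selecting a small set of configurations that stays performant across a diverse set of offline tasks — can be sketched as a greedy maximin portfolio over an offline task-by-configuration score matrix. The matrix here is synthetic and the greedy rule is an assumption, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline meta-data: score of each candidate configuration on each
# meta-training task (matrix and sizes are synthetic stand-ins).
n_tasks, n_configs = 30, 50
scores = rng.uniform(0.0, 1.0, size=(n_tasks, n_configs))

# Greedily mine a small portfolio of configurations that remains performant
# across tasks: each step adds the config maximizing the worst-case coverage.
portfolio = []
for _ in range(5):
    best_so_far = scores[:, portfolio].max(axis=1) if portfolio else np.zeros(n_tasks)
    gains = []
    for c in range(n_configs):
        covered = np.maximum(best_so_far, scores[:, c])
        gains.append(covered.min())  # worst-case score across all tasks
    portfolio.append(int(np.argmax(gains)))

print("mined default configurations:", portfolio)
# A new, resource-constrained task can warm-start its search by evaluating
# only these mined defaults first.
```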
- To tune or not to tune? An Approach for Recommending Important Hyperparameters [2.121963121603413]
We build the relationship between the performance of machine learning models and their hyperparameters to discover trends and gain insights.
Our results enable users to decide whether it is worth conducting a possibly time-consuming tuning strategy.
arXiv Detail & Related papers (2021-08-30T08:54:58Z)
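The entry above relates model performance to hyperparameters in order to decide which ones are worth tuning. One common way to illustrate this is to fit a surrogate on past trials and inspect hyperparameter importances; the sketch below does so with a random forest on synthetic tuning history (all names and data are illustrative, not the paper's setup).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic tuning history: one row per trial, one column per hyperparameter,
# plus the validation score of each trial (all names are illustrative).
names = ["learning_rate", "max_depth", "subsample", "l2_reg"]
trials = rng.uniform(0.0, 1.0, size=(300, len(names)))
scores = (
    -5.0 * (trials[:, 0] - 0.3) ** 2   # learning_rate matters a lot
    - 1.0 * (trials[:, 1] - 0.7) ** 2  # max_depth matters somewhat
    + 0.05 * rng.normal(size=300)      # the rest is noise
)

# Fit a surrogate of score as a function of the hyperparameters, then rank
# hyperparameters by importance to decide which ones deserve tuning budget.
surrogate = RandomForestRegressor(n_estimators=300, random_state=0).fit(trials, scores)
for name, imp in sorted(zip(names, surrogate.feature_importances_), key=lambda t: -t[1]):
    print(f"{name:>15}: {imp:.3f}")
```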
- Experimental Investigation and Evaluation of Model-based Hyperparameter Optimization [0.3058685580689604]
This article presents an overview of theoretical and practical results for popular machine learning algorithms.
The R package mlr is used as a uniform interface to the machine learning models.
arXiv Detail & Related papers (2021-07-19T11:37:37Z)
- AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction [75.16836697734995]
We propose a two-stage algorithm called Automatic Feature Interaction Selection (AutoFIS).
AutoFIS can automatically identify important feature interactions for factorization models with computational cost just equivalent to training the target model to convergence.
AutoFIS has been deployed onto the training platform of Huawei App Store recommendation service.
arXiv Detail & Related papers (2020-03-25T06:53:54Z)
- Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.