Amazon SageMaker Automatic Model Tuning: Scalable Black-box Optimization
- URL: http://arxiv.org/abs/2012.08489v1
- Date: Tue, 15 Dec 2020 18:34:34 GMT
- Title: Amazon SageMaker Automatic Model Tuning: Scalable Black-box Optimization
- Authors: Valerio Perrone, Huibin Shen, Aida Zolic, Iaroslav Shcherbatyi, Amr
Ahmed, Tanya Bansal, Michele Donini, Fela Winkelmolen, Rodolphe Jenatton,
Jean Baptiste Faddoul, Barbara Pogorzelska, Miroslav Miladinovic, Krishnaram
Kenthapadi, Matthias Seeger, Cédric Archambeau
- Abstract summary: Amazon SageMaker Automatic Model Tuning (AMT) is a fully managed system for black-box optimization at scale.
AMT finds the best version of a machine learning model by repeatedly training it with different hyperparameter configurations.
It can be used with built-in algorithms, custom algorithms, and Amazon SageMaker pre-built containers for machine learning frameworks.
- Score: 23.52446054521187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tuning complex machine learning systems is challenging. Machine learning
models typically expose a set of hyperparameters, be it regularization,
architecture, or optimization parameters, whose careful tuning is critical to
achieve good performance. To democratize access to such systems, it is
essential to automate this tuning process. This paper presents Amazon SageMaker
Automatic Model Tuning (AMT), a fully managed system for black-box optimization
at scale. AMT finds the best version of a machine learning model by repeatedly
training it with different hyperparameter configurations. It leverages either
random search or Bayesian optimization to choose the hyperparameter values
resulting in the best-performing model, as measured by the metric chosen by the
user. AMT can be used with built-in algorithms, custom algorithms, and Amazon
SageMaker pre-built containers for machine learning frameworks. We discuss the
core functionality, system architecture and our design principles. We also
describe some more advanced features provided by AMT, such as automated early
stopping and warm-starting, demonstrating their benefits in experiments.
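The workflow described in the abstract maps onto the SageMaker Python SDK's HyperparameterTuner. Below is a minimal sketch, assuming the built-in XGBoost algorithm and placeholder IAM role, S3 paths, and previous tuning job name; argument names follow SDK v2, but exact details may vary across versions.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.image_uris import retrieve
from sagemaker.tuner import (
    ContinuousParameter,
    HyperparameterTuner,
    IntegerParameter,
    WarmStartConfig,
    WarmStartTypes,
)

session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerRole"  # placeholder IAM role

# Built-in XGBoost container (the version string is illustrative).
image = retrieve("xgboost", region=session.boto_region_name, version="1.5-1")

estimator = Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/output",  # placeholder bucket
    sagemaker_session=session,
)
estimator.set_hyperparameters(objective="binary:logistic", num_round=200)

# Search space over the hyperparameters to be tuned.
ranges = {
    "eta": ContinuousParameter(0.01, 0.5, scaling_type="Logarithmic"),
    "max_depth": IntegerParameter(3, 10),
    "subsample": ContinuousParameter(0.5, 1.0),
}

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:auc",  # metric the tuner optimizes
    objective_type="Maximize",
    hyperparameter_ranges=ranges,
    strategy="Bayesian",            # "Random" selects random search instead
    max_jobs=20,
    max_parallel_jobs=4,
    early_stopping_type="Auto",     # automated early stopping
)
tuner.fit({"train": "s3://my-bucket/train", "validation": "s3://my-bucket/validation"})

# Warm start a follow-up tuning job from a previous one (name is a placeholder).
warm_tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:auc",
    hyperparameter_ranges=ranges,
    strategy="Bayesian",
    max_jobs=10,
    max_parallel_jobs=2,
    warm_start_config=WarmStartConfig(
        warm_start_type=WarmStartTypes.IDENTICAL_DATA_AND_ALGORITHM,
        parents={"previous-tuning-job-name"},
    ),
)
```

Setting strategy to "Random" switches from Bayesian optimization to random search, while early_stopping_type and warm_start_config correspond to the early-stopping and warm-starting features discussed in the paper.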
Related papers
- SigOpt Mulch: An Intelligent System for AutoML of Gradient Boosted Trees [3.6449336503217786]
Gradient boosted trees (GBTs) are ubiquitous models used by researchers, machine learning (ML) practitioners, and data scientists.
We present SigOpt Mulch, a model-aware hyperparameter tuning system specifically designed for automated tuning of GBTs.
arXiv Detail & Related papers (2023-07-10T18:40:25Z)
- Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose a graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
- Deep learning based Auto Tuning for Database Management System [0.12891210250935148]
In this work, we extend an automated technique based on OtterTune to reuse data gathered from previous tuning sessions when tuning new deployments, applying supervised and unsupervised machine learning methods to improve latency prediction.
We use GMM clustering to prune metrics and combine ensemble models, such as RandomForest, with non-linear models, like neural networks, for prediction modeling.
arXiv Detail & Related papers (2023-04-25T11:52:52Z)
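The OtterTune-style pipeline summarized in the entry above — pruning redundant metrics with GMM clustering, then combining an ensemble model with a non-linear model for latency prediction — can be illustrated with scikit-learn. This is a minimal sketch on synthetic data; the shapes, column semantics, and the simple prediction averaging are assumptions, not the paper's exact design.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for past tuning sessions: DBMS knob settings,
# collected runtime metrics, and the observed query latency.
n_runs, n_knobs, n_metrics = 200, 8, 40
knobs = rng.uniform(0.0, 1.0, size=(n_runs, n_knobs))
metrics = rng.normal(size=(n_runs, n_metrics))
latency = metrics[:, :3].sum(axis=1) + 2.0 * knobs[:, 0] + rng.normal(scale=0.1, size=n_runs)

# 1) Prune redundant metrics: cluster metric columns with a GMM and keep
#    one representative metric per cluster.
metric_columns = StandardScaler().fit_transform(metrics).T  # one row per metric
labels = GaussianMixture(n_components=5, random_state=0).fit_predict(metric_columns)
representatives = [int(np.where(labels == c)[0][0]) for c in np.unique(labels)]
pruned = metrics[:, representatives]

# 2) Predict latency from knobs plus pruned metrics, combining an ensemble
#    model (random forest) with a non-linear model (neural network).
X = np.hstack([knobs, pruned])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, latency)
nn = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0).fit(X, latency)

def predict_latency(x):
    # Simple average of the two predictors (an assumption, not the paper's scheme).
    return 0.5 * (rf.predict(x) + nn.predict(x))

print(predict_latency(X[:5]))
```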
- VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatile optimizers.
We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates.
We open source our learned optimizer, meta-training code, the associated train and test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z)
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
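The pre-training idea in the entry above — learning a tighter prior from data on similar functions before running Bayesian optimization — can be sketched with scikit-learn Gaussian processes: fit kernel hyperparameters on related tasks, then freeze them as the prior for the new task. The toy objectives, the pooled fitting, and the UCB acquisition are assumptions for illustration, not the paper's method.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

def related_task(x, shift):
    # Toy family of similar 1-D objectives (illustrative only).
    return np.sin(3.0 * (x - shift)) + 0.1 * rng.normal(size=x.shape)

# 1) Pre-training: fit GP kernel hyperparameters on observations pooled
#    from several similar functions.
X_list, y_list = [], []
for shift in (0.0, 0.3, 0.6):
    X_s = rng.uniform(0.0, 2.0 * np.pi, size=(40, 1))
    X_list.append(X_s)
    y_list.append(related_task(X_s[:, 0], shift))
X_prior, y_prior = np.vstack(X_list), np.concatenate(y_list)

kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
gp_prior = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_prior, y_prior)
learned_kernel = gp_prior.kernel_  # the pre-trained, tighter prior

# 2) Bayesian optimization on the new task with the pre-trained kernel
#    frozen (optimizer=None), using a simple UCB acquisition on a grid.
def new_task(x):
    return np.sin(3.0 * (x - 0.45))  # unknown target objective (illustrative)

candidates = np.linspace(0.0, 2.0 * np.pi, 200).reshape(-1, 1)
X_obs = rng.uniform(0.0, 2.0 * np.pi, size=(2, 1))
y_obs = new_task(X_obs[:, 0])

for _ in range(8):
    gp = GaussianProcessRegressor(kernel=learned_kernel, optimizer=None, normalize_y=True)
    gp.fit(X_obs, y_obs)
    mu, sigma = gp.predict(candidates, return_std=True)
    x_next = candidates[int(np.argmax(mu + 2.0 * sigma))]
    X_obs = np.vstack([X_obs, x_next.reshape(1, 1)])
    y_obs = np.append(y_obs, new_task(x_next[0]))

print("best observed value:", y_obs.max())
```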
- Re-parameterizing Your Optimizers rather than Architectures [119.08740698936633]
We propose a novel paradigm of incorporating model-specific prior knowledge into the optimizers and using them to train generic (simple) models.
As an implementation, we propose a novel methodology to add prior knowledge by modifying the gradients according to a set of model-specific hyper-parameters.
We focus on a VGG-style plain model and showcase that such a simple model trained with this re-parameterized optimizer, referred to as RepOpt-VGG, performs on par with the recent well-designed models.
arXiv Detail & Related papers (2022-05-30T16:55:59Z)
- Mining Robust Default Configurations for Resource-constrained AutoML [18.326426020906215]
We present a novel method of selecting performant configurations for a given task by performing offline autoML and mining over a diverse set of tasks.
We show that our approach is effective for warm-starting existing autoML platforms.
arXiv Detail & Related papers (2022-02-20T23:08:04Z)
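The mining idea in the entry above — selecting a small set of configurations that stays performant across a diverse set of offline tasks — can be sketched as a greedy maximin portfolio over an offline task-by-configuration score matrix. The matrix here is synthetic and the greedy rule is an assumption, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline meta-data: score of each candidate configuration on each
# meta-training task (matrix and sizes are synthetic stand-ins).
n_tasks, n_configs = 30, 50
scores = rng.uniform(0.0, 1.0, size=(n_tasks, n_configs))

# Greedily mine a small portfolio of configurations that remains performant
# across tasks: each step adds the config maximizing the worst-case coverage.
portfolio = []
for _ in range(5):
    best_so_far = scores[:, portfolio].max(axis=1) if portfolio else np.zeros(n_tasks)
    gains = []
    for c in range(n_configs):
        covered = np.maximum(best_so_far, scores[:, c])
        gains.append(covered.min())  # worst-case score across all tasks
    portfolio.append(int(np.argmax(gains)))

print("mined default configurations:", portfolio)
# A new, resource-constrained task can warm-start its search by evaluating
# only these mined defaults first.
```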
- To tune or not to tune? An Approach for Recommending Important Hyperparameters [2.121963121603413]
We build the relationship between the performance of machine learning models and their hyperparameters to discover trends and gain insights.
Our results enable users to decide whether it is worth conducting a possibly time-consuming tuning strategy.
arXiv Detail & Related papers (2021-08-30T08:54:58Z)
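The entry above relates model performance to hyperparameters in order to decide which ones are worth tuning. One common way to illustrate this is to fit a surrogate on past trials and inspect hyperparameter importances; the sketch below does so with a random forest on synthetic tuning history (all names and data are illustrative, not the paper's setup).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic tuning history: one row per trial, one column per hyperparameter,
# plus the validation score of each trial (all names are illustrative).
names = ["learning_rate", "max_depth", "subsample", "l2_reg"]
trials = rng.uniform(0.0, 1.0, size=(300, len(names)))
scores = (
    -5.0 * (trials[:, 0] - 0.3) ** 2   # learning_rate matters a lot
    - 1.0 * (trials[:, 1] - 0.7) ** 2  # max_depth matters somewhat
    + 0.05 * rng.normal(size=300)      # the rest is noise
)

# Fit a surrogate of score as a function of the hyperparameters, then rank
# hyperparameters by importance to decide which ones deserve tuning budget.
surrogate = RandomForestRegressor(n_estimators=300, random_state=0).fit(trials, scores)
for name, imp in sorted(zip(names, surrogate.feature_importances_), key=lambda t: -t[1]):
    print(f"{name:>15}: {imp:.3f}")
```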
- Experimental Investigation and Evaluation of Model-based Hyperparameter Optimization [0.3058685580689604]
This article presents an overview of theoretical and practical results for popular machine learning algorithms.
The R package mlr is used as a uniform interface to the machine learning models.
arXiv Detail & Related papers (2021-07-19T11:37:37Z)
- AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction [75.16836697734995]
We propose a two-stage algorithm called Automatic Feature Interaction Selection (AutoFIS).
AutoFIS can automatically identify important feature interactions for factorization models with computational cost just equivalent to training the target model to convergence.
AutoFIS has been deployed onto the training platform of Huawei App Store recommendation service.
arXiv Detail & Related papers (2020-03-25T06:53:54Z)
- Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.