Sherpa: Robust Hyperparameter Optimization for Machine Learning
- URL: http://arxiv.org/abs/2005.04048v1
- Date: Fri, 8 May 2020 13:52:49 GMT
- Title: Sherpa: Robust Hyperparameter Optimization for Machine Learning
- Authors: Lars Hertel, Julian Collado, Peter Sadowski, Jordan Ott, Pierre Baldi
- Abstract summary: Sherpa is a hyperparameter optimization library for machine learning models.
It is specifically designed for problems with computationally expensive, iterative function evaluations.
Sherpa can be run on either a single machine or in parallel on a cluster.
- Score: 6.156647008180291
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sherpa is a hyperparameter optimization library for machine learning models.
It is specifically designed for problems with computationally expensive,
iterative function evaluations, such as the hyperparameter tuning of deep
neural networks. With Sherpa, scientists can quickly optimize hyperparameters
using a variety of powerful and interchangeable algorithms. Sherpa can be run
on either a single machine or in parallel on a cluster. Finally, an interactive
dashboard enables users to view the progress of models as they are trained,
cancel trials, and explore which hyperparameter combinations are working best.
Sherpa empowers machine learning practitioners by automating the more tedious
aspects of model tuning. Its source code and documentation are available at
https://github.com/sherpa-ai/sherpa.
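To make the workflow concrete, here is a minimal single-machine sketch of the kind of usage the abstract describes. It is based on the usage patterns shown in the Sherpa repository's documentation; exact class and argument names may vary between versions, and the per-epoch loss below is a synthetic stand-in for real model training.

```python
# Minimal Sherpa sketch (assumed API per the repository docs; names may vary).
import sherpa

# Define the hyperparameter search space.
parameters = [
    sherpa.Continuous(name='lr', range=[1e-4, 1e-1], scale='log'),
    sherpa.Discrete(name='num_units', range=[32, 256]),
]

# Search algorithms are interchangeable; RandomSearch is used here.
algorithm = sherpa.algorithms.RandomSearch(max_num_trials=20)

study = sherpa.Study(parameters=parameters,
                     algorithm=algorithm,
                     lower_is_better=True,
                     disable_dashboard=True)  # skip the web dashboard in this toy run

for trial in study:
    lr = trial.parameters['lr']
    num_units = trial.parameters['num_units']
    # Placeholder for real training: report a synthetic loss per "epoch".
    for epoch in range(5):
        loss = (lr - 0.01) ** 2 + 1.0 / num_units + 0.1 / (epoch + 1)
        study.add_observation(trial, iteration=epoch, objective=loss)
    study.finalize(trial)

print(study.get_best_result())
```

Because the algorithm object is separate from the training loop, swapping RandomSearch for, say, a Bayesian optimization algorithm changes only one line, which is the "interchangeable algorithms" point the abstract emphasizes.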
Related papers
- Tune As You Scale: Hyperparameter Optimization For Compute Efficient Training [0.0]
We propose a practical method for robustly tuning large models.
CarBS performs local search around the performance-cost frontier.
Among our results, we effectively solve the entire ProcGen benchmark just by tuning a simple baseline.
arXiv Detail & Related papers (2023-06-13T18:22:24Z)
- PyHopper -- Hyperparameter optimization [51.40201315676902]
We present PyHopper, a black-box optimization platform for machine learning researchers.
PyHopper's goal is to integrate with existing code with minimal effort and run the optimization process with minimal necessary manual oversight.
With simplicity as the primary theme, PyHopper is powered by a single robust Markov-chain Monte-Carlo optimization algorithm.
arXiv Detail & Related papers (2022-10-10T14:35:01Z)
- Automating DBSCAN via Deep Reinforcement Learning [73.82740568765279]
We propose a novel Deep Reinforcement Learning guided automatic DBSCAN parameters search framework, namely DRL-DBSCAN.
The framework models the process of adjusting the parameter search direction as a Markov decision process in which the clustering environment provides feedback (a toy sketch of this framing appears after this list).
The framework consistently improves DBSCAN clustering accuracy by up to 26% and 25% in its two evaluation settings, respectively.
arXiv Detail & Related papers (2022-08-09T04:40:11Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3x-30x.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Towards Robust and Automatic Hyper-Parameter Tunning [39.04604349338802]
We introduce a new class of HPO method and explore how the low-rank factorization of intermediate layers of a convolutional network can be used to define an analytical response surface.
We quantify how this surface behaves as a surrogate to model performance and can be solved using a trust-region search algorithm, which we call autoHyper.
arXiv Detail & Related papers (2021-11-28T05:27:34Z)
- Hyperparameter Tuning is All You Need for LISTA [92.7008234085887]
Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the concept of unrolling an iterative algorithm and training it like a neural network.
We show that adding momentum to intermediate variables in the LISTA network achieves a better convergence rate (a simplified momentum-iteration sketch appears after this list).
We call this new ultra-light weight network HyperLISTA.
arXiv Detail & Related papers (2021-10-29T16:35:38Z)
- Experimental Investigation and Evaluation of Model-based Hyperparameter Optimization [0.3058685580689604]
This article presents an overview of theoretical and practical results for popular machine learning algorithms.
The R package mlr is used as a uniform interface to the machine learning models.
arXiv Detail & Related papers (2021-07-19T11:37:37Z)
- HyperNP: Interactive Visual Exploration of Multidimensional Projection Hyperparameters [61.354362652006834]
HyperNP is a scalable method that allows for real-time interactive exploration of projection methods by training neural network approximations.
We evaluate HyperNP across three datasets in terms of performance and speed.
arXiv Detail & Related papers (2021-06-25T17:28:14Z)
- Surrogate Model Based Hyperparameter Tuning for Deep Learning with SPOT [0.40611352512781856]
This article demonstrates how the architecture-level parameters of deep learning models implemented in Keras/TensorFlow can be optimized.
The implementation of the tuning procedure is 100% based on R, the software environment for statistical computing.
arXiv Detail & Related papers (2021-05-30T21:16:51Z)
- MANGO: A Python Library for Parallel Hyperparameter Tuning [4.728291880913813]
We present Mango, a Python library for parallel hyperparameter tuning.
Mango enables the use of any distributed scheduling framework.
It implements intelligent parallel search strategies, and provides rich abstractions.
arXiv Detail & Related papers (2020-05-22T20:58:26Z)
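The DRL-DBSCAN entry above frames DBSCAN parameter adjustment as a Markov decision process. The following toy sketch illustrates only that framing, not the paper's method: instead of a learned policy, a greedy rule picks the parameter-adjustment action with the best immediate clustering-quality reward. The dataset, step sizes, and silhouette-based reward are illustrative assumptions.

```python
# Toy illustration of DBSCAN parameter search as a sequence of adjustment
# actions with a clustering-quality reward. NOT the DRL-DBSCAN agent: a real
# implementation would learn a policy; here a greedy rule takes the action
# with the best immediate reward.
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, cluster_std=0.6, random_state=0)

def reward(eps, min_samples):
    """Clustering-quality reward; -1 if DBSCAN finds fewer than 2 clusters."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    return silhouette_score(X, labels) if n_clusters >= 2 else -1.0

eps, min_samples = 0.1, 5                                      # initial "state"
actions = [(+0.05, 0), (-0.05, 0), (0, +1), (0, -1), (0, 0)]   # parameter moves

for step in range(20):
    # Evaluate each candidate action and greedily take the best one.
    scored = []
    for d_eps, d_min in actions:
        cand = (max(eps + d_eps, 0.01), max(min_samples + d_min, 2))
        scored.append((reward(*cand), cand))
    best_r, (eps, min_samples) = max(scored)

print(f"eps={eps:.2f}, min_samples={min_samples}, silhouette={best_r:.3f}")
```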
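The HyperLISTA entry describes adding momentum to intermediate variables of an unrolled LISTA network. As a simplified, untrained analogue, the sketch below runs a plain ISTA iteration for sparse recovery with a fixed momentum term on the iterate; the step size, threshold, and momentum coefficient are illustrative assumptions, not the paper's learned or analytically set values.

```python
# Simplified, untrained analogue of momentum on intermediate variables:
# ISTA for min ||Ax - b||^2 + lam * ||x||_1 with a fixed momentum step.
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 60, 120, 8
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
b = A @ x_true

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

step = 1.0 / np.linalg.norm(A, 2) ** 2   # gradient step size 1/L
lam = 0.01                               # l1 threshold (illustrative)
beta = 0.6                               # momentum coefficient (illustrative)

x, x_prev = np.zeros(n), np.zeros(n)
for it in range(200):
    # Momentum on the intermediate variable before the gradient/threshold step.
    y = x + beta * (x - x_prev)
    grad = A.T @ (A @ y - b)
    x_prev, x = x, soft_threshold(y - step * grad, step * lam)

print("recovery error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```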
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.