Hyperparameter optimization of data-driven AI models on HPC systems
- URL: http://arxiv.org/abs/2203.01112v1
- Date: Wed, 2 Mar 2022 14:02:59 GMT
- Title: Hyperparameter optimization of data-driven AI models on HPC systems
- Authors: Eric Wulff and Maria Girone and Joosep Pata
- Abstract summary: This work is part of RAISE's work on data-driven use cases, which leverage AI and HPC cross-methods.
It is shown that, in the case of Machine-Learned Particle-Flow (MLPF) reconstruction in High Energy Physics, the ASHA algorithm in combination with Bayesian optimization gives the largest performance increase per compute resources spent out of the investigated algorithms.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the European Center of Excellence in Exascale computing "Research on AI-
and Simulation-Based Engineering at Exascale" (CoE RAISE), researchers develop
novel, scalable AI technologies towards Exascale. This work exercises High
Performance Computing resources to perform large-scale hyperparameter
optimization using distributed training on multiple compute nodes. This is part
of RAISE's work on data-driven use cases, which leverage AI and HPC
cross-methods developed within the project. In response to the demand for
parallelizable and resource-efficient hyperparameter optimization methods,
advanced hyperparameter search algorithms are benchmarked and compared. The
evaluated algorithms, including Random Search, Hyperband and ASHA, are tested
and compared in terms of both accuracy and accuracy per compute resources
spent. As an example use case, a graph neural network model known as MLPF,
developed for the task of Machine-Learned Particle-Flow reconstruction in High
Energy Physics, acts as the base model for optimization. Results show that
hyperparameter optimization significantly increased the performance of MLPF and
that this would not have been possible without access to large-scale High
Performance Computing resources. It is also shown that, in the case of MLPF,
the ASHA algorithm in combination with Bayesian optimization gives the largest
performance increase per compute resources spent out of the investigated
algorithms.
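As a rough illustration of the resource-allocation idea behind Hyperband and ASHA, the following is a minimal, framework-free sketch of synchronous successive halving with a toy objective. The search space, budgets, and objective are hypothetical placeholders; the paper itself combines the asynchronous variant (ASHA) with Bayesian optimization and distributed training on HPC nodes, which this sketch does not attempt to reproduce.

```python
# Minimal, framework-free sketch of synchronous successive halving, the
# resource-allocation idea underlying Hyperband and ASHA. The toy objective,
# search space, and budgets are illustrative placeholders, not the MLPF setup
# or the distributed ASHA + Bayesian optimization pipeline used in the paper.
import random


def toy_objective(config, budget):
    # Stand-in for the validation loss obtained after training for
    # `budget` epochs with the given hyperparameters.
    return (config["lr"] - 1e-3) ** 2 + random.uniform(0.0, 0.1) / budget


def successive_halving(n_configs=27, min_budget=1, max_budget=27, eta=3):
    # Start from a population of randomly sampled configurations.
    configs = [{"lr": random.uniform(1e-4, 1e-2)} for _ in range(n_configs)]
    budget = min_budget
    while budget <= max_budget and len(configs) > 1:
        # Evaluate every surviving configuration at the current budget ...
        ranked = sorted(configs, key=lambda c: toy_objective(c, budget))
        # ... and promote only the top 1/eta to the next, eta-times-larger budget.
        configs = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return configs[0]


print(successive_halving())
```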
Related papers
- Model Performance Prediction for Hyperparameter Optimization of Deep
Learning Models Using High Performance Computing and Quantum Annealing [0.0]
We show that integrating model performance prediction with early stopping methods holds great potential to speed up the HPO process of deep learning models.
We propose a novel algorithm called Swift-Hyperband that can use either classical or quantum support vector regression for performance prediction.
arXiv Detail & Related papers (2023-11-29T10:32:40Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - AxOMaP: Designing FPGA-based Approximate Arithmetic Operators using
Mathematical Programming [2.898055875927704]
We propose a data analysis-driven mathematical programming-based approach to synthesizing approximate operators for FPGAs.
Specifically, we formulate mixed integer quadratically constrained programs based on the results of correlation analysis of the characterization data.
Compared to traditional evolutionary algorithm-based optimization, we report up to a 21% improvement in hypervolume for the joint optimization of PPA and BEHAV.
arXiv Detail & Related papers (2023-09-23T18:23:54Z) - Optimization of a Hydrodynamic Computational Reservoir through Evolution [58.720142291102135]
We interface with a model of a hydrodynamic system, under development by a startup, as a computational reservoir.
We optimized the readout times and how inputs are mapped to the wave amplitude or frequency using an evolutionary search algorithm.
Applying evolutionary methods to this reservoir system substantially improved separability on an XNOR task, in comparison to implementations with hand-selected parameters.
arXiv Detail & Related papers (2023-04-20T19:15:02Z) - Hyperparameter optimization, quantum-assisted model performance
prediction, and benchmarking of AI-based High Energy Physics workloads using
HPC [0.0]
This work studies the potential of using model performance prediction to aid the HPO process carried out on High Performance Computing systems.
A quantum annealer is used to train the performance predictor and a method is proposed to overcome some of the problems derived from the current limitations in quantum systems.
Results are presented from the development of a containerized benchmark based on an AI-model for collision event reconstruction.
arXiv Detail & Related papers (2023-03-27T09:55:33Z) - Two-step hyperparameter optimization method: Accelerating hyperparameter
search by using a fraction of a training dataset [0.15420205433587747]
We present a two-step HPO method as a strategic solution to curbing computational demands and wait times.
We present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation; a minimal sketch of the two-step strategy appears after this list.
arXiv Detail & Related papers (2023-02-08T02:38:26Z) - Multi-objective hyperparameter optimization with performance uncertainty [62.997667081978825]
This paper presents results on multi-objective hyperparameter optimization with uncertainty on the evaluation of Machine Learning algorithms.
We combine the sampling strategy of Tree-structured Parzen Estimators (TPE) with the metamodel obtained after training a Gaussian Process Regression (GPR) with heterogeneous noise.
Experimental results on three analytical test functions and three ML problems show the improvement over multi-objective TPE and GPR.
arXiv Detail & Related papers (2022-09-09T14:58:43Z) - Towards Robust and Automatic Hyper-Parameter Tunning [39.04604349338802]
We introduce a new class of HPO methods and explore how the low-rank factorization of the intermediate layers of a convolutional network can be used to define an analytical response surface.
We quantify how this surface behaves as a surrogate for model performance and show that it can be optimized using a trust-region search algorithm, which we call autoHyper.
arXiv Detail & Related papers (2021-11-28T05:27:34Z) - ES-Based Jacobian Enables Faster Bilevel Optimization [53.675623215542515]
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.
Existing gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations.
We propose a novel BO algorithm, which adopts Evolution Strategies (ES) based method to approximate the response Jacobian matrix in the hypergradient of BO.
arXiv Detail & Related papers (2021-10-13T19:36:50Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stochastic bilevel optimizer named stocBiO, which features a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - An Asymptotically Optimal Multi-Armed Bandit Algorithm and
Hyperparameter Optimization [48.5614138038673]
We propose an efficient and robust bandit-based algorithm called Sub-Sampling (SS) in the scenario of hyperparameter search evaluation.
We also develop a novel hyperparameter optimization algorithm called BOSS.
Empirical studies validate our theoretical arguments of SS and demonstrate the superior performance of BOSS on a number of applications.
arXiv Detail & Related papers (2020-07-11T03:15:21Z)
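Referenced from the two-step hyperparameter optimization entry above, the following is a minimal sketch of the general idea of screening hyperparameters on a small fraction of the training data before re-evaluating the shortlisted configurations on the full dataset. The objective, data fractions, and candidate counts are hypothetical placeholders, not the settings of the cited paper.

```python
# Hypothetical sketch of a two-step HPO strategy: a cheap broad screening on a
# fraction of the training data, followed by re-evaluation of the shortlisted
# configurations on the full dataset. All names and numbers are placeholders.
import random


def evaluate(config, data_fraction):
    # Stand-in for training on `data_fraction` of the data and returning the
    # validation loss; smaller fractions give cheaper but noisier estimates.
    noise = random.uniform(0.0, 0.05) / data_fraction
    return (config["lr"] - 1e-3) ** 2 + noise


def two_step_hpo(n_candidates=100, top_k=5):
    candidates = [{"lr": random.uniform(1e-4, 1e-2)} for _ in range(n_candidates)]
    # Step 1: screen all candidates on a small fraction of the training data.
    screened = sorted(candidates, key=lambda c: evaluate(c, data_fraction=0.1))
    # Step 2: re-evaluate only the top candidates on the full dataset.
    finalists = screened[:top_k]
    return min(finalists, key=lambda c: evaluate(c, data_fraction=1.0))


print(two_step_hpo())
```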
This list is automatically generated from the titles and abstracts of the papers in this site.