Hyperparameter optimization of data-driven AI models on HPC systems
- URL: http://arxiv.org/abs/2203.01112v1
- Date: Wed, 2 Mar 2022 14:02:59 GMT
- Title: Hyperparameter optimization of data-driven AI models on HPC systems
- Authors: Eric Wulff and Maria Girone and Joosep Pata
- Abstract summary: This work is part of RAISE's work on data-driven use cases which leverages AI- and HPC cross-methods.
It is shown that in the case of Machine-Learned Particle reconstruction in High Energy Physics, the ASHA algorithm in combination with Bayesian optimization gives the largest performance increase per compute resources spent out of the investigated algorithms.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the European Center of Excellence in Exascale computing "Research on AI-
and Simulation-Based Engineering at Exascale" (CoE RAISE), researchers develop
novel, scalable AI technologies towards Exascale. This work exercises High
Performance Computing resources to perform large-scale hyperparameter
optimization using distributed training on multiple compute nodes. This is part
of RAISE's work on data-driven use cases which leverages AI- and HPC
cross-methods developed within the project. In response to the demand for
parallelizable and resource efficient hyperparameter optimization methods,
advanced hyperparameter search algorithms are benchmarked and compared. The
evaluated algorithms, including Random Search, Hyperband and ASHA, are tested
and compared in terms of both accuracy and accuracy per compute resources
spent. As an example use case, a graph neural network model known as MLPF,
developed for the task of Machine-Learned Particle-Flow reconstruction in High
Energy Physics, acts as the base model for optimization. Results show that
hyperparameter optimization significantly increased the performance of MLPF and
that this would not have been possible without access to large-scale High
Performance Computing resources. It is also shown that, in the case of MLPF,
the ASHA algorithm in combination with Bayesian optimization gives the largest
performance increase per compute resources spent out of the investigated
algorithms.
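The successive-halving core of the ASHA scheme evaluated above can be sketched in a few lines. The toy objective, search range, and bracket parameters below are illustrative assumptions, not the MLPF setup; real ASHA is also asynchronous (promoting trials as soon as enough results arrive), whereas this sketch shows the synchronous core.

```python
import random

def toy_objective(lr, epochs):
    # Hypothetical stand-in for a validation score: peaks near lr = 0.1
    # and improves with longer training.
    return (1.0 - abs(lr - 0.1)) * (1.0 - 0.5 ** epochs)

def asha(num_trials=27, min_epochs=1, reduction_factor=3, max_rungs=3, seed=0):
    """Synchronous simplification of ASHA: at each rung, keep the top
    1/reduction_factor of trials and give the survivors a larger budget."""
    rng = random.Random(seed)
    trials = [{"lr": rng.uniform(0.0, 0.3)} for _ in range(num_trials)]
    budget = min_epochs
    for _ in range(max_rungs):
        trials.sort(key=lambda t: toy_objective(t["lr"], budget), reverse=True)
        trials = trials[: max(1, len(trials) // reduction_factor)]  # early-stop the rest
        budget *= reduction_factor  # survivors train longer
    return trials[0], budget

best, final_budget = asha()
```

Combining this with Bayesian optimization, as in the paper, amounts to replacing the uniform sampling of configurations with proposals from a surrogate model fitted to completed trials.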
Related papers
- Value-Based Deep RL Scales Predictably [100.21834069400023]
We show that value-based off-policy RL methods are predictable despite community lore regarding their pathological behavior.
We validate our approach using three algorithms: SAC, BRO, and PQL on DeepMind Control, OpenAI gym, and IsaacGym.
arXiv Detail & Related papers (2025-02-06T18:59:47Z) - A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation.
However, deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency.
This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z) - Resource-Adaptive Successive Doubling for Hyperparameter Optimization with Large Datasets on High-Performance Computing Systems [0.4334105740533729]
This article proposes a novel Resource-Adaptive Successive Doubling Algorithm (RASDA)
It combines a resource-adaptive successive doubling scheme with the plain Asynchronous Successive Halving Algorithm (ASHA)
It is applied to different types of Neural Networks (NNs) and trained on large datasets from the Computer Vision (CV), Computational Fluid Dynamics (CFD), and Additive Manufacturing (AM) domains.
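The resource-adaptive successive doubling idea can be made concrete with a short sketch: each round halves the trial population (plain ASHA-style early stopping) while doubling the compute assigned to each survivor, so total usage stays roughly flat. The candidate configurations and scoring function below are hypothetical, not RASDA's actual scheduling logic.

```python
def successive_doubling(configs, score_fn, workers_per_trial=1, rounds=3):
    """Each round: drop the worse half of the trials, then double the
    data-parallel workers given to each surviving trial."""
    for _ in range(rounds):
        configs = sorted(configs, key=score_fn, reverse=True)
        configs = configs[: max(1, len(configs) // 2)]  # ASHA-style halving
        workers_per_trial *= 2                          # resource doubling
    return configs, workers_per_trial

candidates = list(range(8))  # hypothetical config ids; config 5 is best
survivors, workers = successive_doubling(candidates, lambda c: -abs(c - 5))
```

Starting from 8 trials with 1 worker each, three rounds leave a single survivor with 8 workers, so the aggregate worker count per round is unchanged.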
arXiv Detail & Related papers (2024-12-03T11:25:48Z) - Model Performance Prediction for Hyperparameter Optimization of Deep
Learning Models Using High Performance Computing and Quantum Annealing [0.0]
We show that integrating model performance prediction with early stopping methods holds great potential to speed up the HPO process of deep learning models.
We propose a novel algorithm called Swift-Hyperband that can use either classical or quantum support vector regression for performance prediction.
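The performance-prediction step can be illustrated with a minimal sketch: fit a simple parametric curve to the observed prefix of a trial's learning curve and extrapolate to the final epoch, stopping trials whose predicted final loss is poor. Ordinary least squares on a 1/epoch model stands in here for the paper's classical or quantum support vector regression; the curve shape is an assumption.

```python
def predict_final_loss(partial_losses, total_epochs):
    """Fit loss ~ a + b/epoch to an observed learning-curve prefix with
    ordinary least squares (a stand-in for the paper's support vector
    regression), then extrapolate to the final epoch."""
    xs = [1.0 / (e + 1) for e in range(len(partial_losses))]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(partial_losses) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, partial_losses))
    b /= sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a + b / total_epochs

curve = [0.1 + 0.9 / e for e in (1, 2, 3, 4)]  # synthetic partial curve
pred = predict_final_loss(curve, total_epochs=100)
```

On this synthetic curve the fit is exact, so the extrapolation recovers the asymptote plus the residual 0.9/100 term.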
arXiv Detail & Related papers (2023-11-29T10:32:40Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC, and AML.
This paper proposes algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - AxOMaP: Designing FPGA-based Approximate Arithmetic Operators using
Mathematical Programming [2.898055875927704]
We propose a data analysis-driven mathematical programming-based approach to synthesizing approximate operators for FPGAs.
Specifically, we formulate mixed integer quadratically constrained programs based on the results of correlation analysis of the characterization data.
Compared to traditional evolutionary algorithms-based optimization, we report up to 21% improvement in the hypervolume, for joint optimization of PPA and BEHAV.
arXiv Detail & Related papers (2023-09-23T18:23:54Z) - Hyperparameter optimization, quantum-assisted model performance
prediction, and benchmarking of AI-based High Energy Physics workloads using
HPC [0.0]
This work studies the potential of using model performance prediction to aid the HPO process carried out on High Performance Computing systems.
A quantum annealer is used to train the performance predictor and a method is proposed to overcome some of the problems derived from the current limitations in quantum systems.
Results are presented from the development of a containerized benchmark based on an AI-model for collision event reconstruction.
arXiv Detail & Related papers (2023-03-27T09:55:33Z) - Multi-objective hyperparameter optimization with performance uncertainty [62.997667081978825]
This paper presents results on multi-objective hyperparameter optimization with uncertainty on the evaluation of Machine Learning algorithms.
We combine the sampling strategy of Tree-structured Parzen Estimators (TPE) with the metamodel obtained after training a Gaussian Process Regression (GPR) with heterogeneous noise.
Experimental results on three analytical test functions and three ML problems show the improvement over multi-objective TPE and GPR.
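The multi-objective setting can be made concrete with a Pareto-dominance filter, the generic building block such methods share (this is a sketch of the dominance concept, not the TPE/GPR method itself; the objective pairs are hypothetical).

```python
def dominates(q, p):
    """q dominates p if q is no worse in every objective and strictly
    better in at least one (minimization convention)."""
    return all(qi <= pi for qi, pi in zip(q, p)) and any(qi < pi for qi, pi in zip(q, p))

def pareto_front(points):
    """Non-dominated subset: the set a multi-objective HPO method approximates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (validation loss, runtime) pairs for four configurations.
front = pareto_front([(1, 5), (2, 3), (3, 4), (4, 1)])
```

Here (3, 4) is dominated by (2, 3), so the front keeps the three trade-off configurations; the uncertainty-aware metamodel in the paper refines which candidates are evaluated to approximate this set.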
arXiv Detail & Related papers (2022-09-09T14:58:43Z) - Towards Robust and Automatic Hyper-Parameter Tunning [39.04604349338802]
We introduce a new class of HPO method and explore how the low-rank factorization of intermediate layers of a convolutional network can be used to define an analytical response surface.
We quantify how this surface behaves as a surrogate to model performance and can be solved using a trust-region search algorithm, which we call autoHyper.
arXiv Detail & Related papers (2021-11-28T05:27:34Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel, efficient stochastic gradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - An Asymptotically Optimal Multi-Armed Bandit Algorithm and
Hyperparameter Optimization [48.5614138038673]
We propose an efficient and robust bandit-based algorithm called Sub-Sampling (SS) for hyperparameter search evaluation.
We also develop a novel hyperparameter optimization algorithm called BOSS.
Empirical studies validate our theoretical arguments of SS and demonstrate the superior performance of BOSS on a number of applications.
arXiv Detail & Related papers (2020-07-11T03:15:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.