Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization
- URL: http://arxiv.org/abs/2502.00871v1
- Date: Sun, 02 Feb 2025 18:45:28 GMT
- Title: Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization
- Authors: Szymon Sieradzki, Jacek Mańdziuk
- Abstract summary: We propose several modifications to the Adaptive Tree-Structured Parzen Estimator (ATPE) algorithm.
Experimental results demonstrate that the proposed modifications significantly improve the effectiveness of ATPE hyperparameter optimization.
- Abstract: In this paper, we review hyperparameter optimization methods for machine learning models, with a particular focus on the Adaptive Tree-Structured Parzen Estimator (ATPE) algorithm. We propose several modifications to ATPE and assess their efficacy on a diverse set of standard benchmark functions. Experimental results demonstrate that the proposed modifications significantly improve the effectiveness of ATPE hyperparameter optimization on selected benchmarks, a finding that holds practical relevance for their application in real-world machine learning / optimization tasks.
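For context, the sketch below shows what plain (A)TPE-based tuning looks like in practice with the hyperopt library, which ships both the classic TPE and an Adaptive TPE suggestion algorithm (the latter requires hyperopt's scikit-learn/lightgbm extras). The Rosenbrock objective and search bounds are assumptions for illustration; the snippet does not include the paper's proposed modifications.
```python
# A minimal sketch, not the authors' modified ATPE: tuning two
# hyperparameters with hyperopt's TPE/ATPE suggestion algorithms.
from hyperopt import fmin, hp
import hyperopt.atpe  # ATPE backend; needs hyperopt's sklearn/lightgbm extras

# Assumed benchmark objective for illustration: the 2-D Rosenbrock
# function, a standard continuous test problem with minimum at (1, 1).
def rosenbrock(params):
    x, y = params["x"], params["y"]
    return (1.0 - x) ** 2 + 100.0 * (y - x ** 2) ** 2

space = {
    "x": hp.uniform("x", -2.0, 2.0),
    "y": hp.uniform("y", -1.0, 3.0),
}

# ATPE adapts TPE's control parameters to the problem at hand; swap in
# hyperopt.tpe.suggest to compare against the non-adaptive baseline.
best = fmin(fn=rosenbrock, space=space,
            algo=hyperopt.atpe.suggest, max_evals=100)
print(best)  # best parameters found, e.g. values near (1, 1)
```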
Related papers
- Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation [67.13876021157887]
Dynamic Tuning (DyT) is a novel approach to improve both parameter and inference efficiency for ViT adaptation.
DyT achieves superior performance compared to existing PEFT methods while using only 71% of their FLOPs on the VTAB-1K benchmark.
arXiv Detail & Related papers (2024-03-18T14:05:52Z)
- End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
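As a worked illustration of the OWA aggregation this entry refers to, the minimal sketch below (weights and utilities are assumed values, not from the paper) shows why the objective is nondifferentiable: the weights are applied by rank, so the sort introduces kinks.
```python
# A worked OWA example; the weights and utilities are assumed values.
import numpy as np

def owa(outcomes: np.ndarray, weights: np.ndarray) -> float:
    # Weights are applied by *rank*, not by identity: outcomes are
    # sorted ascending (worst first) and dotted with a fixed,
    # non-increasing weight vector, so the worst-off outcome gets the
    # largest weight. The sort makes the objective nondifferentiable.
    return float(np.sort(outcomes) @ weights)

utilities = np.array([0.9, 0.4, 0.7])  # three stakeholders' outcomes
weights = np.array([0.5, 0.3, 0.2])    # assumed fair (non-increasing) weights
print(owa(utilities, weights))         # 0.5*0.4 + 0.3*0.7 + 0.2*0.9 = 0.59
```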
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
- MADA: Meta-Adaptive Optimizers through hyper-gradient Descent [73.1383658672682]
We introduce Meta-Adaptive optimizers (MADA), a unified framework that can generalize several known optimizers and dynamically learn the most suitable one during training.
We empirically compare MADA to other popular optimizers on vision and language tasks, and find that MADA consistently outperforms Adam and the other baselines.
We also propose AVGrad, a modification of AMSGrad that replaces the maximum operator with averaging, which is more suitable for hyper-gradient optimization.
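A hedged sketch of the contrast this entry describes, assuming the standard AMSGrad second-moment rule; the exact correction terms AVGrad uses in the paper may differ.
```python
# A hedged sketch contrasting AMSGrad's running maximum with AVGrad's
# running average for the second-moment term.
import numpy as np

def second_moment_term(v_history: np.ndarray, variant: str) -> np.ndarray:
    # v_history stacks the per-step EMA second-moment estimates v_1..v_t.
    if variant == "amsgrad":
        # AMSGrad: element-wise running maximum (monotone, never forgets).
        return np.maximum.reduce(v_history)
    if variant == "avgrad":
        # AVGrad: the max becomes an average, which is smooth in its
        # inputs and hence friendlier to hyper-gradient computations.
        return v_history.mean(axis=0)
    raise ValueError(variant)

v_hist = np.array([[0.5, 1.0], [0.7, 0.4], [0.6, 0.9]])
print(second_moment_term(v_hist, "amsgrad"))  # [0.7 1.0]
print(second_moment_term(v_hist, "avgrad"))   # [0.6 0.7667]
```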
arXiv Detail & Related papers (2024-01-17T00:16:46Z)
- Tree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance [1.370633147306388]
Tree-structured Parzen estimator (TPE) is widely used in recent parameter tuning frameworks.
Despite its popularity, the role of each control parameter and the intuition behind the algorithm have not been discussed so far.
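The control parameters in question are concretely visible in a minimal 1-D TPE sketch like the one below: the splitting quantile gamma, the two Parzen densities, and the candidate count are exactly those knobs. It assumes enough distinct observations on both sides of the split and is not a production algorithm.
```python
# A minimal 1-D sketch of TPE's core components.
import numpy as np
from scipy.stats import gaussian_kde

def tpe_propose(xs, ys, gamma=0.25, n_candidates=64):
    order = np.argsort(ys)                        # minimize: lower y is better
    n_good = max(2, int(np.ceil(gamma * len(xs))))
    good = np.asarray(xs)[order[:n_good]]         # best gamma fraction
    bad = np.asarray(xs)[order[n_good:]]          # the rest
    l, g = gaussian_kde(good), gaussian_kde(bad)  # the two Parzen estimators
    cand = l.resample(n_candidates).ravel()       # candidates drawn from l
    # Maximizing the density ratio l(x)/g(x) is monotone in Expected
    # Improvement under the TPE model.
    return cand[np.argmax(l(cand) / np.maximum(g(cand), 1e-12))]

rng = np.random.default_rng(1)
xs = rng.uniform(-3.0, 3.0, size=40)
ys = (xs - 1.0) ** 2                 # toy objective with its minimum at x = 1
print(tpe_propose(xs, ys))           # proposal should land near x = 1
```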
arXiv Detail & Related papers (2023-04-21T17:02:38Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times, with speedups of 3×-30×.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Adaptive Optimizer for Automated Hyperparameter Optimization Problem [0.0]
In this paper, we present a general framework that is able to construct an adaptive optimizer, which automatically adjusts the appropriate parameters during the optimization process.
arXiv Detail & Related papers (2022-01-28T13:58:10Z)
- Additive Tree-Structured Conditional Parameter Spaces in Bayesian Optimization: A Novel Covariance Function and a Fast Implementation [34.89735938765757]
We generalize the additive assumption to tree-structured functions, showing improved sample-efficiency, wider applicability and greater flexibility.
By incorporating the structure information of parameter spaces and the additive assumption in the BO loop, we develop a parallel algorithm to optimize the acquisition function.
We demonstrate our method on an optimization benchmark function, on pruning pre-trained VGG16 and ResNet50 models, and on searching for activation functions of ResNet20.
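A hedged sketch of the additive idea over a conditional space: each configuration defines only the parameters on its active branch, and only jointly active parameters contribute kernel terms. This mirrors the additive assumption but is not the paper's exact covariance function.
```python
# A hedged sketch: the covariance sums per-parameter RBF terms over
# the parameters both configurations share.
import math

def additive_tree_kernel(cfg_a: dict, cfg_b: dict, ls: float = 1.0) -> float:
    shared = cfg_a.keys() & cfg_b.keys()  # jointly active parameters
    return sum(
        math.exp(-((cfg_a[k] - cfg_b[k]) ** 2) / (2.0 * ls ** 2))
        for k in shared
    )

# Configs from different branches share only the root parameter "lr",
# so only one RBF term contributes.
a = {"lr": 0.10, "sgd_momentum": 0.9}
b = {"lr": 0.05, "adam_beta2": 0.999}
print(additive_tree_kernel(a, b))
```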
arXiv Detail & Related papers (2020-10-06T16:08:58Z)
- Bayesian Optimization for Selecting Efficient Machine Learning Models [53.202224677485525]
We present a unified Bayesian Optimization framework for jointly optimizing models for both prediction effectiveness and training efficiency.
Experiments on model selection for recommendation tasks indicate that models selected this way significantly improve training efficiency.
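As an illustration of jointly optimizing effectiveness and efficiency (not the paper's framework), a multi-objective study with the Optuna library might look like the sketch below; the objective internals are placeholder assumptions.
```python
# A hedged illustration using Optuna's multi-objective study API; the
# objective internals below are stand-ins, not a real model.
import optuna

def objective(trial: optuna.Trial):
    n_estimators = trial.suggest_int("n_estimators", 10, 500)
    # Stand-ins for measured validation accuracy and training cost.
    accuracy = 1.0 - 1.0 / n_estimators
    train_seconds = 0.01 * n_estimators
    return accuracy, train_seconds       # maximize one, minimize the other

study = optuna.create_study(directions=["maximize", "minimize"])
study.optimize(objective, n_trials=50)
print(len(study.best_trials))            # size of the Pareto front found
```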
arXiv Detail & Related papers (2020-08-02T02:56:30Z)
- NOVAS: Non-convex Optimization via Adaptive Stochastic Search for End-to-End Learning and Control [22.120942106939122]
We propose the use of adaptive stochastic search as a building block for general, non-convex optimization operations.
We benchmark it against two existing alternatives on a synthetic energy-based structured prediction task, and showcase its use in optimal control applications.
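A minimal sketch of adaptive stochastic search in the cross-entropy style such an operation builds on: sample from a Gaussian, keep the elites, refit the sampling distribution. The objective and constants are assumptions for illustration.
```python
# A minimal adaptive stochastic search (cross-entropy-style) sketch.
import numpy as np

def adaptive_stochastic_search(f, dim, iters=50, pop=64, elite_frac=0.1,
                               seed=0):
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(elite_frac * pop))
    for _ in range(iters):
        x = mu + sigma * rng.standard_normal((pop, dim))      # sample
        elite = x[np.argsort([f(xi) for xi in x])[:n_elite]]  # keep best
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6  # refit
    return mu

# Minimize a shifted sphere; the minimizer is (1, 1, 1).
print(adaptive_stochastic_search(lambda x: np.sum((x - 1.0) ** 2), dim=3))
```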
arXiv Detail & Related papers (2020-06-22T03:40:36Z)
- Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization [34.89735938765757]
We generalize the additive assumption to tree-structured functions.
By incorporating the structure information of parameter spaces and the additive assumption in the BO loop, we develop a parallel algorithm to optimize the acquisition function.
arXiv Detail & Related papers (2020-06-21T11:21:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.