Tree-Structured Parzen Estimator: Understanding Its Algorithm Components
and Their Roles for Better Empirical Performance
- URL: http://arxiv.org/abs/2304.11127v3
- Date: Fri, 26 May 2023 10:09:07 GMT
- Title: Tree-Structured Parzen Estimator: Understanding Its Algorithm Components
and Their Roles for Better Empirical Performance
- Authors: Shuhei Watanabe
- Abstract summary: Tree-structured Parzen estimator (TPE) is widely used in recent parameter tuning frameworks.
Despite its popularity, the roles of each control parameter and the intuition behind the algorithm have not been discussed so far.
- Score: 1.370633147306388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in many domains require more and more complicated experiment
design. Such complicated experiments often have many parameters, which
necessitate parameter tuning. Tree-structured Parzen estimator (TPE), a
Bayesian optimization method, is widely used in recent parameter tuning
frameworks. Despite its popularity, the roles of each control parameter and the
intuition behind the algorithm have not been discussed so far. In this tutorial, we will
identify the roles of each control parameter and their impacts on
hyperparameter optimization using a diverse set of benchmarks. We compare our
recommended setting, drawn from the ablation study, with baseline methods and
demonstrate that it improves the performance of TPE. Our
TPE implementation is available at
https://github.com/nabenabe0928/tpe/tree/single-opt.
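Since the abstract centers on TPE's control parameters, a minimal sketch of the TPE density-ratio idea for a single continuous hyperparameter may help fix intuition. It is an illustration under simplifying assumptions, not the paper's implementation; the function name tpe_suggest and the defaults gamma=0.15 and n_candidates=24 are illustrative stand-ins for the kind of control parameters the tutorial analyzes.

```python
# A minimal, illustrative sketch of the TPE idea for one continuous
# hyperparameter (NOT the paper's implementation; `gamma` and
# `n_candidates` are generic stand-ins for TPE's control parameters).
import numpy as np
from scipy.stats import gaussian_kde


def tpe_suggest(xs, ys, bounds, gamma=0.15, n_candidates=24, rng=None):
    """Suggest the next hyperparameter value from past observations.

    xs, ys : arrays of evaluated hyperparameter values and their losses.
    bounds : (low, high) box constraint for the hyperparameter.
    gamma  : quantile used to split observations into "good" and "bad".
    Assumes enough (and non-degenerate) observations to fit both KDEs.
    """
    rng = np.random.default_rng() if rng is None else rng
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)

    # 1. Split observations at the gamma-quantile of the objective (lower is better).
    n_good = max(2, int(np.ceil(gamma * len(xs))))
    order = np.argsort(ys)
    good, bad = xs[order[:n_good]], xs[order[n_good:]]

    # 2. Fit kernel density estimators l(x) on the good group and g(x) on the bad group.
    l, g = gaussian_kde(good), gaussian_kde(bad)

    # 3. Sample candidates from l(x) and keep them inside the bounds.
    cand = np.clip(l.resample(n_candidates, seed=rng).ravel(), *bounds)

    # 4. Return the candidate maximizing the density ratio l(x)/g(x),
    #    which is monotone in TPE's expected-improvement criterion.
    score = l(cand) / np.maximum(g(cand), 1e-12)
    return cand[int(np.argmax(score))]
```

In this simplified view, the split quantile, the number of candidate samples, the kernel bandwidth, and the prior weight are exactly the kind of control parameters whose roles the tutorial studies.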
Related papers
- Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of three optimizers, four parameterizations, and a range of learning rates and model sizes.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z)
- Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation [67.13876021157887]
Dynamic Tuning (DyT) is a novel approach to improve both parameter and inference efficiency for ViT adaptation.
DyT achieves superior performance compared to existing PEFT methods while using only 71% of their FLOPs on the VTAB-1K benchmark.
arXiv Detail & Related papers (2024-03-18T14:05:52Z)
- A Unified Gaussian Process for Branching and Nested Hyperparameter Optimization [19.351804144005744]
In deep learning, tuning parameters with conditional dependence are common in practice.
New GP model accounts for the dependent structure among input variables through a new kernel function.
High prediction accuracy and better optimization efficiency are observed in a series of synthetic simulations and real data applications of neural networks.
arXiv Detail & Related papers (2024-01-19T21:11:32Z)
- Hyperparameters in Reinforcement Learning and How To Tune Them [25.782420501870295]
We show that hyperparameter choices in deep reinforcement learning can significantly affect the agent's final performance and sample efficiency.
We propose adopting established best practices from AutoML, such as the separation of tuning and testing seeds (a minimal sketch of this practice appears after this list).
We support this by comparing state-of-the-art HPO tools on a range of RL algorithms and environments to their hand-tuned counterparts.
arXiv Detail & Related papers (2023-06-02T07:48:18Z)
- Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel sensitivity-aware visual parameter-efficient fine-tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
- AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning [77.61565726647784]
Motivated by advances in neural architecture search, we propose AutoPEFT for automatic PEFT configuration selection.
We show that AutoPEFT-discovered configurations significantly outperform existing PEFT methods and are on par or better than FFT without incurring substantial training efficiency costs.
arXiv Detail & Related papers (2023-01-28T08:51:23Z)
- Parameter-Efficient Fine-Tuning Design Spaces [63.954953653386106]
We present a parameter-efficient fine-tuning design paradigm and discover design patterns that are applicable to different experimental settings.
We show experimentally that these methods consistently and significantly outperform investigated parameter-efficient fine-tuning strategies across different backbone models and different tasks in natural language processing.
arXiv Detail & Related papers (2023-01-04T21:00:18Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3x-30x.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration [32.055812915031666]
We show how to compute optimal parameter portfolios of a given size.
We extend this benchmark by analyzing optimal control policies that can select the parameters only from a given portfolio of possible values.
We demonstrate the usefulness of our benchmarks by analyzing the behavior of the DDQN reinforcement learning approach for dynamic algorithm configuration.
arXiv Detail & Related papers (2022-02-07T15:00:30Z)
- Additive Tree-Structured Conditional Parameter Spaces in Bayesian Optimization: A Novel Covariance Function and a Fast Implementation [34.89735938765757]
We generalize the additive assumption to tree-structured functions, showing improved sample-efficiency, wider applicability and greater flexibility.
By incorporating the structure information of parameter spaces and the additive assumption in the BO loop, we develop a parallel algorithm to optimize the acquisition function.
We demonstrate our method on an optimization benchmark function, on pruning pre-trained VGG16 and Res50 models as well as on searching activation functions of ResNet20.
arXiv Detail & Related papers (2020-10-06T16:08:58Z)
- Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization [34.89735938765757]
We generalize the additive assumption to tree-structured functions.
By incorporating the structure information of parameter spaces and the additive assumption in the BO loop, we develop a parallel algorithm to optimize the acquisition function.
arXiv Detail & Related papers (2020-06-21T11:21:55Z)
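The entry "Hyperparameters in Reinforcement Learning and How To Tune Them" above advocates separating tuning seeds from testing seeds. The sketch below illustrates that practice under simplifying assumptions; train_and_evaluate, the seed lists, and the toy search space are hypothetical stand-ins, not code from the paper.

```python
# A minimal sketch of separating tuning seeds from testing seeds, the
# AutoML practice advocated in the RL HPO entry above. `train_and_evaluate`
# is a hypothetical stand-in for training and evaluating an RL agent.
import random
from statistics import mean

TUNING_SEEDS = [0, 1, 2, 3, 4]        # used only during hyperparameter search
TESTING_SEEDS = [100, 101, 102, 103]  # held out for the final report


def train_and_evaluate(config: dict, seed: int) -> float:
    """Hypothetical objective: train an agent with `config` and `seed`
    and return its evaluation return. Replace with a real RL pipeline."""
    random.seed(seed)
    return -((config["lr"] - 3e-4) ** 2) + random.gauss(0.0, 1e-9)


def tune(search_space: list[dict]) -> dict:
    # Score each candidate configuration only on the tuning seeds.
    scores = {i: mean(train_and_evaluate(c, s) for s in TUNING_SEEDS)
              for i, c in enumerate(search_space)}
    return search_space[max(scores, key=scores.get)]


if __name__ == "__main__":
    space = [{"lr": lr} for lr in (1e-4, 3e-4, 1e-3)]
    best = tune(space)
    # Report performance only on seeds never seen during tuning.
    test_score = mean(train_and_evaluate(best, s) for s in TESTING_SEEDS)
    print(best, test_score)
```

Keeping the reported score on held-out seeds avoids overestimating performance due to seed-level overfitting during tuning.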
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information and is not responsible for any consequences of its use.