c-TPE: Tree-structured Parzen Estimator with Inequality Constraints for
Expensive Hyperparameter Optimization
- URL: http://arxiv.org/abs/2211.14411v4
- Date: Fri, 26 May 2023 09:44:26 GMT
- Title: c-TPE: Tree-structured Parzen Estimator with Inequality Constraints for
Expensive Hyperparameter Optimization
- Authors: Shuhei Watanabe, Frank Hutter
- Abstract summary: We propose constrained TPE (c-TPE), an extension of the widely used, versatile Bayesian optimization method tree-structured Parzen estimator (TPE), to handle inequality constraints.
Our proposed extension goes beyond a simple combination of an existing acquisition function and the original TPE, and instead includes modifications that address issues that cause poor performance.
In the experiments, we demonstrate that c-TPE exhibits the best average rank performance among existing methods, with statistical significance, on 81 expensive HPO settings with inequality constraints.
- Score: 45.67326752241075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperparameter optimization (HPO) is crucial for strong performance of deep
learning algorithms, and real-world applications often impose constraints,
such as memory usage or latency, on top of the performance requirement. In this
work, we propose constrained TPE (c-TPE), an extension of the widely-used
versatile Bayesian optimization method, tree-structured Parzen estimator (TPE),
to handle these constraints. Our proposed extension goes beyond a simple
combination of an existing acquisition function and the original TPE, and
instead includes modifications that address issues that cause poor performance.
We thoroughly analyze these modifications both empirically and theoretically,
providing insights into how they effectively overcome these challenges. In the
experiments, we demonstrate that c-TPE exhibits the best average rank
performance among existing methods, with statistical significance, on 81
expensive HPO settings with inequality constraints. Due to the lack of baselines, we
only discuss the applicability of our method to hard-constrained optimization
in Appendix D.
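The listing itself contains no code, but as an illustration of the kind of constrained HPO setup the abstract describes, the sketch below runs a TPE-based search with an inequality constraint using Optuna's TPESampler and its constraints_func hook. Optuna, the toy objective, and the latency-style constraint are assumptions made for illustration only; this is not the authors' reference implementation of c-TPE.

```python
# Minimal sketch: TPE-based HPO with an inequality constraint (values <= 0 are feasible).
import optuna


def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_float("x", -10.0, 10.0)
    y = trial.suggest_float("y", -10.0, 10.0)
    # Store the constraint value so the sampler can read it back later.
    # Here "x + y <= 5" stands in for a latency- or memory-style budget.
    trial.set_user_attr("constraint", (x + y - 5.0,))
    return x ** 2 + y ** 2  # quantity to minimize (e.g., validation loss)


def constraints_func(trial: optuna.trial.FrozenTrial):
    # The sampler treats each returned value <= 0 as "constraint satisfied".
    return trial.user_attrs["constraint"]


sampler = optuna.samplers.TPESampler(constraints_func=constraints_func, seed=0)
study = optuna.create_study(direction="minimize", sampler=sampler)
study.optimize(objective, n_trials=50)

best = study.best_trial  # best by objective value; check feasibility via its user attrs
print(best.params, best.user_attrs["constraint"])
```

In this setup the sampler favors configurations that both look promising for the objective and are likely to satisfy the constraint, which mirrors the constrained-acquisition idea the abstract refers to; the specific modifications c-TPE makes to the acquisition function are detailed in the paper itself.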
Related papers
- ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood [14.512464277772194]
Aligned Supervised Fine-Tuning (ASFT) is an effective approach that better aligns Large Language Models with pair-wise datasets.
ASFT mitigates the issue where the DPO loss function decreases the probability of generating human-dispreferred data.
Extensive experiments demonstrate that ASFT is an effective alignment approach, consistently outperforming existing methods.
arXiv Detail & Related papers (2024-09-14T11:39:13Z)
- Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective [125.00228936051657]
We introduce NTK-CL, a novel framework that eliminates task-specific parameter storage while adaptively generating task-relevant features.
By fine-tuning optimizable parameters with appropriate regularization, NTK-CL achieves state-of-the-art performance on established PEFT-CL benchmarks.
arXiv Detail & Related papers (2024-07-24T09:30:04Z)
- See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition [56.87609859444084]
Parameter-efficient fine-tuning (PEFT) focuses on optimizing a select subset of parameters while keeping the rest fixed, significantly lowering computational and storage overheads.
We take the first step to unify all approaches by dissecting them from a decomposition perspective.
We introduce two novel PEFT methods alongside a simple yet effective framework designed to enhance the performance of PEFT techniques across various applications.
arXiv Detail & Related papers (2024-07-07T15:44:42Z)
- End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
- Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models [109.06052781040916]
We introduce a technique to enhance the inference efficiency of parameter-shared language models.
We also propose a simple pre-training technique that leads to fully or partially shared models.
Results demonstrate the effectiveness of our methods on both autoregressive and autoencoding PLMs.
arXiv Detail & Related papers (2023-10-19T15:13:58Z)
- Speeding Up Multi-Objective Hyperparameter Optimization by Task Similarity-Based Meta-Learning for the Tree-Structured Parzen Estimator [37.553558410770314]
In this paper, we extend TPE's acquisition function to the meta-learning setting using a task similarity defined by the overlap of top domains between tasks.
In the experiments, we demonstrate that our method speeds up MO-TPE on tabular HPO benchmarks and attains state-of-the-art performance.
arXiv Detail & Related papers (2022-12-13T17:33:02Z)
- ACE: Adaptive Constraint-aware Early Stopping in Hyperparameter Optimization [18.81207777891714]
We propose an Adaptive Constraint-aware Early stopping (ACE) method to incorporate constraint evaluation into trial pruning during HPO.
To minimize the overall optimization cost, ACE estimates the cost-effective constraint evaluation interval based on a theoretical analysis of the expected evaluation cost.
arXiv Detail & Related papers (2022-08-04T22:56:16Z)
- Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality [141.89413461337324]
Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL).
We propose a theoretical formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective.
arXiv Detail & Related papers (2022-02-14T01:31:46Z)
- Scalable Constrained Bayesian Optimization [10.820024633762596]
The global optimization of a high-dimensional black-box function under black-box constraints is a pervasive task in machine learning, control, and the scientific community.
We propose the scalable constrained Bayesian optimization (SCBO) algorithm that overcomes the above challenges and pushes the state of the art.
arXiv Detail & Related papers (2020-02-20T01:48:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.