MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements
- URL: http://arxiv.org/abs/2012.03011v1
- Date: Sat, 5 Dec 2020 11:51:15 GMT
- Title: MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements
- Authors: Yang Li, Yu Shen, Jiawei Jiang, Jinyang Gao, Ce Zhang, Bin Cui
- Abstract summary: We present MFES-HB, an efficient Hyperband method that is capable of utilizing both the high-fidelity and low-fidelity measurements.
We show that MFES-HB can achieve 3.3-8.9x speedups over the state-of-the-art approach, BOHB.
- Score: 34.75195640330286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperparameter optimization (HPO) is a fundamental problem in automatic
machine learning (AutoML). However, due to the expensive evaluation cost of
models (e.g., training deep learning models or training models on large
datasets), vanilla Bayesian optimization (BO) is typically computationally
infeasible. To alleviate this issue, Hyperband (HB) utilizes the early stopping
mechanism to speed up configuration evaluations by terminating those
badly-performing configurations in advance. This leads to two kinds of quality
measurements: (1) many low-fidelity measurements for configurations that get
early-stopped, and (2) few high-fidelity measurements for configurations that
are evaluated without being early stopped. The state-of-the-art HB-style
method, BOHB, aims to combine the benefits of both BO and HB. Instead of
sampling configurations randomly in HB, BOHB samples configurations based on a
BO surrogate model, which is constructed with the high-fidelity measurements
only. However, the scarcity of high-fidelity measurements greatly hampers the
efficiency of BO to guide the configuration search. In this paper, we present
MFES-HB, an efficient Hyperband method that is capable of utilizing both the
high-fidelity and low-fidelity measurements to accelerate the convergence of
HPO tasks. Designing MFES-HB is not trivial as the low-fidelity measurements
can be biased yet informative to guide the configuration search. Thus we
propose to build a Multi-Fidelity Ensemble Surrogate (MFES) based on the
generalized Product of Experts framework, which can integrate useful
information from multi-fidelity measurements effectively. The empirical studies
on real-world AutoML tasks demonstrate that MFES-HB can achieve 3.3-8.9x
speedups over the state-of-the-art approach, BOHB.
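As a rough, hedged illustration of the ensemble idea described above, the sketch below combines a low-fidelity and a high-fidelity Gaussian-process surrogate with a generalized Product of Experts (gPoE). The synthetic objective, the uniform weights, and the helper name combine_gpoe are assumptions for illustration only; they are not the authors' implementation, and MFES-HB itself derives the expert weights from the observed measurements rather than fixing them.

```python
# Illustrative sketch only (not the authors' code): a generalized Product of
# Experts (gPoE) combination of per-fidelity Gaussian-process surrogates.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def combine_gpoe(means, variances, weights):
    """Combine M Gaussian experts over N candidate configurations.

    means, variances: shape (M, N); weights: shape (M,), summing to 1.
    The gPoE precision is the weighted sum of expert precisions, and the mean
    is the precision-weighted average of expert means.
    """
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    w = np.asarray(weights, dtype=float)[:, None]
    var = 1.0 / np.sum(w * precisions, axis=0)
    mu = var * np.sum(w * precisions * means, axis=0)
    return mu, var

rng = np.random.default_rng(0)
objective = lambda X: np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2    # synthetic validation loss

# Many cheap low-fidelity measurements, few expensive high-fidelity ones.
X_low, X_high = rng.uniform(0, 1, (30, 2)), rng.uniform(0, 1, (5, 2))
y_low = objective(X_low) + 0.2 * rng.normal(size=len(X_low))  # biased, noisy low fidelity
y_high = objective(X_high)

surrogates = [GaussianProcessRegressor(normalize_y=True).fit(X, y)
              for X, y in [(X_low, y_low), (X_high, y_high)]]

# Ensemble prediction over candidate configurations.
X_cand = rng.uniform(0, 1, (200, 2))
preds = [gp.predict(X_cand, return_std=True) for gp in surrogates]
mu, var = combine_gpoe([m for m, _ in preds],
                       [s ** 2 for _, s in preds],
                       weights=[0.5, 0.5])        # placeholder weights

# Rank candidates with a simple lower-confidence-bound acquisition before they
# enter a Hyperband bracket (lower predicted loss and higher uncertainty win).
best_config = X_cand[np.argmin(mu - np.sqrt(var))]
```

The combined posterior then plays the role of the BO surrogate that proposes configurations for Hyperband's brackets, in place of BOHB's surrogate that is fit on the scarce high-fidelity measurements alone.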
Related papers
- Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback [64.67540769692074]
Large language models (LLMs) fine-tuned with alignment techniques, such as reinforcement learning from human feedback, have been instrumental in developing some of the most capable AI systems to date.
We introduce an approach called Margin Matching Preference Optimization (MMPO), which incorporates relative quality margins into optimization, leading to improved LLM policies and reward models.
Experiments with both human and AI feedback data demonstrate that MMPO consistently outperforms baseline methods, often by a substantial margin, on popular benchmarks including MT-bench and RewardBench.
arXiv Detail & Related papers (2024-10-04T04:56:11Z) - FastBO: Fast HPO and NAS with Adaptive Fidelity Identification [29.594900930334216]
We propose a multi-fidelity BO method named FastBO, which adaptively decides the fidelity for each configuration and efficiently offers strong performance.
We also show that our adaptive fidelity identification strategy provides a way to extend any single-fidelity method to the multi-fidelity setting.
arXiv Detail & Related papers (2024-09-01T02:40:04Z) - FlexHB: a More Efficient and Flexible Framework for Hyperparameter Optimization [4.127081624438282]
We propose FlexHB, a new method pushing multi-fidelity BO to the limit and re-designing a framework for early stopping with Successive Halving (SH); a minimal SH sketch follows this entry.
Our method achieves superior efficiency and outperforms other methods on various HPO tasks.
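For reference, Successive Halving, which this entry and Hyperband above rely on for early stopping, can be sketched in a few lines; the toy objective, eta, and budget schedule below are illustrative assumptions, not code from FlexHB or MFES-HB.

```python
# Minimal Successive Halving (SH) sketch: start many configurations on a small
# budget, keep the best 1/eta fraction each round, and grow the budget by eta.
import random

def evaluate(config, budget):
    """Stand-in for training with `config` for `budget` epochs and returning a
    validation loss (lower is better); noise shrinks as the budget grows."""
    lr, weight_decay = config
    return (lr - 0.1) ** 2 + weight_decay + random.gauss(0.0, 0.05) / budget

def successive_halving(configs, min_budget=1, eta=3, num_rounds=3):
    budget = min_budget
    for _ in range(num_rounds):
        ranked = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = ranked[:max(1, len(configs) // eta)]  # early-stop the rest
        budget *= eta                                   # promote survivors
    return configs[0]

random.seed(0)
candidates = [(random.uniform(0.0, 0.3), random.uniform(0.0, 0.1)) for _ in range(27)]
best = successive_halving(candidates)
```

Hyperband runs several such brackets with different trade-offs between the number of configurations and the starting budget; the losses observed in the early rounds are exactly the low-fidelity measurements that MFES-HB feeds into its ensemble surrogate.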
arXiv Detail & Related papers (2024-02-21T09:18:59Z) - Semi-Federated Learning: Convergence Analysis and Optimization of A Hybrid Learning Framework [70.83511997272457]
We propose a semi-federated learning (SemiFL) paradigm to leverage both the base station (BS) and devices for a hybrid implementation of centralized learning (CL) and federated learning (FL).
We propose a two-stage algorithm to solve this intractable problem, in which we provide closed-form solutions for the beamformers.
arXiv Detail & Related papers (2023-10-04T03:32:39Z) - Is One Epoch All You Need For Multi-Fidelity Hyperparameter Optimization? [17.21160278797221]
Multi-fidelity HPO (MF-HPO) leverages intermediate accuracy levels in the learning process and discards low-performing models early on.
We compared various representative MF-HPO methods against a simple baseline on classical benchmark data.
This baseline achieved similar results to its counterparts, while requiring an order of magnitude less computation.
arXiv Detail & Related papers (2023-07-28T09:14:41Z) - An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models [55.14405248920852]
We conduct experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance.
We find that the parameter-efficient methods are effective in mitigating gender bias, where adapter tuning is consistently the most effective.
We also find that prompt tuning is more suitable for GPT-2 than for BERT, and that the methods are less effective when it comes to racial and religious bias.
arXiv Detail & Related papers (2023-06-06T23:56:18Z) - Federated Multi-Sequence Stochastic Approximation with Local Hypergradient Estimation [28.83712379658548]
We develop FedMSA, the first federated stochastic approximation algorithm for multiple coupled sequences (MSA).
FedMSA enables the provable estimation of hypergradients in BLO and MCO via local client updates.
We provide experiments that support our theory and demonstrate the empirical benefits of FedMSA.
arXiv Detail & Related papers (2023-06-02T16:17:43Z) - FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization [50.12374973760274]
We propose and implement a benchmark suite FedHPO-B that incorporates comprehensive FL tasks, enables efficient function evaluations, and eases continuing extensions.
We also conduct extensive experiments based on FedHPO-B to benchmark a few HPO methods.
arXiv Detail & Related papers (2022-06-08T15:29:10Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with distributionally robust optimization (DRO) using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Learning Based Hybrid Beamforming for Millimeter Wave Multi-User MIMO Systems [22.478350298755892]
We propose an extreme learning machine (ELM) framework to jointly optimize transmitting and receiving beamformers.
Both FP-MM-HBF and ELM-HBF can provide a higher system sum-rate than existing methods.
arXiv Detail & Related papers (2020-04-27T16:31:08Z)