Federated Hyperparameter Tuning: Challenges, Baselines, and Connections
to Weight-Sharing
- URL: http://arxiv.org/abs/2106.04502v1
- Date: Tue, 8 Jun 2021 16:42:37 GMT
- Title: Federated Hyperparameter Tuning: Challenges, Baselines, and Connections
to Weight-Sharing
- Authors: Mikhail Khodak, Renbo Tu, Tian Li, Liam Li, Maria-Florina Balcan,
Virginia Smith, Ameet Talwalkar
- Abstract summary: We show how standard approaches may be adapted to form baselines for the federated setting.
By making a novel connection to the neural architecture search technique of weight-sharing, we introduce a new method, FedEx.
Theoretically, we show that a FedEx variant correctly tunes the on-device learning rate in the setting of online convex optimization.
- Score: 37.056834089598105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tuning hyperparameters is a crucial but arduous part of the machine learning
pipeline. Hyperparameter optimization is even more challenging in federated
learning, where models are learned over a distributed network of heterogeneous
devices; here, the need to keep data on device and perform local training makes
it difficult to efficiently train and evaluate configurations. In this work, we
investigate the problem of federated hyperparameter tuning. We first identify
key challenges and show how standard approaches may be adapted to form
baselines for the federated setting. Then, by making a novel connection to the
neural architecture search technique of weight-sharing, we introduce a new
method, FedEx, to accelerate federated hyperparameter tuning that is applicable
to widely-used federated optimization methods such as FedAvg and recent
variants. Theoretically, we show that a FedEx variant correctly tunes the
on-device learning rate in the setting of online convex optimization across
devices. Empirically, we show that FedEx can outperform natural baselines for
federated hyperparameter tuning by several percentage points on the
Shakespeare, FEMNIST, and CIFAR-10 benchmarks, obtaining higher accuracy using
the same training budget.
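Below is a minimal sketch of the kind of round this weight-sharing connection suggests: a shared global model is updated FedAvg-style while a categorical distribution over candidate hyperparameter configurations is updated with an exponentiated-gradient step from per-client validation feedback. All names (fedex_round, local_train, local_val_loss) and the exact update rule are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

def fedex_round(global_weights, logits, configs, clients,
                local_train, local_val_loss, step_size=0.1):
    """One FedAvg-style round that also updates a categorical distribution
    over candidate hyperparameter configurations (illustrative sketch)."""
    probs = softmax(logits)
    updates, sizes = [], []
    grad = np.zeros_like(logits)
    for client in clients:
        c = np.random.choice(len(configs), p=probs)      # each client samples a config
        new_w = local_train(global_weights, client, configs[c])
        loss = local_val_loss(new_w, client)             # held-out feedback for that config
        updates.append(new_w)
        sizes.append(client.num_examples)
        # REINFORCE-style estimate of d E[validation loss] / d logits
        grad += loss * (np.eye(len(configs))[c] - probs)
    sizes = np.asarray(sizes, dtype=float)
    new_global = np.average(np.stack(updates), axis=0, weights=sizes)   # FedAvg aggregation
    # gradient step on the logits, i.e. a multiplicative-weights /
    # exponentiated-gradient update on the configuration probabilities
    new_logits = logits - step_size * grad / len(clients)
    return new_global, new_logits
```

The sketch omits details such as the baseline subtracted from the validation losses to reduce gradient variance and the wrapping of this update inside a standard tuning algorithm; see the paper for the actual FedEx procedure.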
Related papers
- Efficient Asynchronous Federated Learning with Sparsification and
Quantization [55.6801207905772]
Federated Learning (FL) is attracting growing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which lets edge devices participate in training asynchronously by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
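The summary names three ingredients (asynchrony, sparsification, quantization); the sketch below illustrates only that generic pattern. The top-k sparsifier, int8 quantizer, staleness weighting, and every name here are assumptions for illustration, not TEASQ-Fed's actual scheme.

```python
import numpy as np

def compress_update(update, keep_frac=0.1, levels=256):
    """Sparsify an update to its largest-magnitude coordinates, then
    quantize the kept values to int8 (illustrative, not TEASQ-Fed's scheme)."""
    flat = update.ravel()
    keep = max(1, int(keep_frac * flat.size))
    idx = np.argpartition(np.abs(flat), -keep)[-keep:]
    vals = flat[idx]
    scale = max(float(np.abs(vals).max()), 1e-12) / (levels // 2 - 1)
    q = np.round(vals / scale).astype(np.int8)
    return idx, q, scale, update.shape

def decompress_update(idx, q, scale, shape):
    """Rebuild a dense update from the sparse, quantized representation."""
    out = np.zeros(int(np.prod(shape)), dtype=np.float32)
    out[idx] = q.astype(np.float32) * scale
    return out.reshape(shape)

def apply_async(global_model, compressed, staleness, lr=1.0):
    """Apply a client's update as soon as it arrives, down-weighting
    contributions computed against an older global model."""
    update = decompress_update(*compressed)
    return global_model + (lr / (1.0 + staleness)) * update
```

In an asynchronous setting the server would call apply_async whenever a compressed update arrives rather than waiting for a full cohort.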
- FedPEAT: Convergence of Federated Learning, Parameter-Efficient Fine
Tuning, and Emulator Assisted Tuning for Artificial Intelligence Foundation
Models with Mobile Edge Computing [20.06372852684181]
We introduce Emulator-Assisted Tuning and Federated PEAT (FedPEAT).
FedPEAT uses adapters, emulators, and PEFT for federated model tuning, enhancing model privacy and memory efficiency.
We tested FedPEAT in a unique scenario with a server participating in collaborative tuning.
arXiv Detail & Related papers (2023-10-26T15:47:44Z)
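As a rough illustration of the parameter-efficient pattern the summary points to, the sketch below aggregates only lightweight adapter tensors while the frozen backbone stays on device; local_finetune, the dict layout, and the plain weighted average are assumptions, not the FedPEAT protocol.

```python
import numpy as np

def federated_adapter_round(adapter_params, clients, local_finetune):
    """Aggregate only the lightweight adapter weights (PEFT-style);
    the frozen foundation-model backbone never leaves the device.
    Illustrative sketch only."""
    updates, sizes = [], []
    for client in clients:
        # local_finetune trains just the adapters against the client's data
        new_adapters = local_finetune(adapter_params, client)
        updates.append(new_adapters)
        sizes.append(client.num_examples)
    sizes = np.asarray(sizes, dtype=float)
    # weighted average over each adapter tensor
    return {name: np.average(np.stack([u[name] for u in updates]),
                             axis=0, weights=sizes)
            for name in adapter_params}
```

Only the adapter dictionary, typically a small fraction of the model's parameters, is communicated each round, which is where the memory and privacy benefits of parameter-efficient federated tuning are usually claimed.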
- Profit: Benchmarking Personalization and Robustness Trade-off in
Federated Prompt Tuning [40.16581292336117]
In many applications of federated learning (FL), clients desire models that are personalized using their local data, yet are also robust in the sense that they retain general global knowledge.
It is critical to understand how to navigate this personalization vs robustness trade-off when designing federated systems.
arXiv Detail & Related papers (2023-10-06T23:46:33Z)
- HyperTuner: A Cross-Layer Multi-Objective Hyperparameter Auto-Tuning
Framework for Data Analytic Services [25.889791254011794]
We propose HyperTuner to perform cross-layer multi-objective hyperparameter auto-tuning.
We show that HyperTuner is superior in both convergence and diversity compared with four baseline algorithms.
Experiments with different training datasets, optimization objectives, and machine learning platforms verify that HyperTuner adapts well to various data analytic service scenarios.
arXiv Detail & Related papers (2023-04-20T02:19:10Z)
- Federated Hypergradient Descent [0.0]
We apply a principled approach to adaptively setting the client learning rate, number of local steps, and batch size.
In our federated learning applications, our primary motivations are minimizing the communication budget as well as local computational resources in the training pipeline.
We present numerical results from extensive experiments with the Federated EMNIST-62 (FEMNIST) and Federated Stack Overflow (FSO) datasets.
arXiv Detail & Related papers (2022-11-03T19:22:00Z)
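The summary mentions adapting the client learning rate; one standard rule in this spirit is hypergradient descent, which nudges the step size using the inner product of consecutive gradients. The sketch below shows that generic rule with illustrative names; it is not necessarily the paper's exact update.

```python
import numpy as np

def hypergradient_sgd_step(w, grad, prev_grad, lr, hyper_lr=1e-4):
    """One SGD step whose learning rate is itself adapted online: lr grows
    when consecutive gradients agree and shrinks when they disagree
    (hypergradient-descent style; illustrative sketch, flat gradient vectors)."""
    if prev_grad is not None:
        lr = lr + hyper_lr * float(np.dot(grad, prev_grad))
        lr = max(lr, 1e-8)          # keep the step size positive
    w = w - lr * grad
    return w, lr, grad              # grad becomes prev_grad at the next step
```

In a federated pipeline each client would carry its own (lr, prev_grad) state across local steps.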
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms and can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning.
We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times and speedups of 3×-30×.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
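A hedged sketch of the generic idea behind gradient-based subset selection: greedily pick examples whose reweighted summed gradients approximate the full-data gradient, and tune hyperparameters on the resulting subset. The greedy rule and names here are assumptions, not AUTOMATA's algorithm.

```python
import numpy as np

def greedy_gradient_subset(per_example_grads, budget):
    """Greedily pick `budget` examples whose (reweighted) summed gradients
    best match the full-dataset gradient (illustrative sketch only).
    per_example_grads is an (n, d) array of flattened gradients."""
    n = len(per_example_grads)
    full = per_example_grads.sum(axis=0)
    chosen, current = [], np.zeros_like(full)
    remaining = set(range(n))
    for _ in range(budget):
        scale = n / (len(chosen) + 1)          # reweight the partial subset
        best, best_err = None, np.inf
        for i in remaining:
            err = np.linalg.norm(full - scale * (current + per_example_grads[i]))
            if err < best_err:
                best, best_err = i, err
        chosen.append(best)
        current += per_example_grads[best]
        remaining.remove(best)
    return chosen
```

Hyperparameter configurations would then be trained and compared on the selected indices instead of the full dataset, which is where the reported turnaround-time savings come from.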
- Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for
Hyperparameter Recommendation [83.85021205445662]
We propose amortized auto-tuning (AT2) to speed up the tuning of machine learning models.
We conduct a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework, which leads to AT2 as its best instantiation.
arXiv Detail & Related papers (2021-06-17T00:01:18Z)
- Robust Federated Learning Through Representation Matching and Adaptive
Hyper-parameters [5.319361976450981]
Federated learning is a distributed, privacy-aware learning scenario which trains a single model on data belonging to several clients.
Current federated learning methods struggle in cases with heterogeneous client-side data distributions.
We propose a novel representation matching scheme that reduces the divergence of local models.
arXiv Detail & Related papers (2019-12-30T20:19:20Z)
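One simple way to "reduce the divergence of local models" is to penalize the distance between a client model's intermediate representations and those of the received global model on the same batch; the sketch below shows that generic penalty with illustrative names, not necessarily the paper's scheme.

```python
import numpy as np

def matched_local_loss(task_loss, local_feats, global_feats, lam=0.1):
    """Client-side objective: the usual task loss plus a penalty keeping the
    local model's batch representations close to the global model's
    (illustrative sketch of a representation-matching term)."""
    divergence = float(np.mean((local_feats - global_feats) ** 2))
    return task_loss + lam * divergence
```

During local training the client would minimize this combined objective, so locally adapted models stay anchored to shared representations even under heterogeneous data.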