Federated Hyperparameter Tuning: Challenges, Baselines, and Connections
to Weight-Sharing
- URL: http://arxiv.org/abs/2106.04502v1
- Date: Tue, 8 Jun 2021 16:42:37 GMT
- Title: Federated Hyperparameter Tuning: Challenges, Baselines, and Connections
to Weight-Sharing
- Authors: Mikhail Khodak, Renbo Tu, Tian Li, Liam Li, Maria-Florina Balcan,
Virginia Smith, Ameet Talwalkar
- Abstract summary: We show how standard approaches may be adapted to form baselines for the federated setting.
By making a novel connection to the neural architecture search technique of weight-sharing, we introduce a new method, FedEx.
Theoretically, we show that a FedEx variant correctly tunes the on-device learning rate in the setting of online convex optimization.
- Score: 37.056834089598105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tuning hyperparameters is a crucial but arduous part of the machine learning
pipeline. Hyperparameter optimization is even more challenging in federated
learning, where models are learned over a distributed network of heterogeneous
devices; here, the need to keep data on device and perform local training makes
it difficult to efficiently train and evaluate configurations. In this work, we
investigate the problem of federated hyperparameter tuning. We first identify
key challenges and show how standard approaches may be adapted to form
baselines for the federated setting. Then, by making a novel connection to the
neural architecture search technique of weight-sharing, we introduce a new
method, FedEx, to accelerate federated hyperparameter tuning that is applicable
to widely-used federated optimization methods such as FedAvg and recent
variants. Theoretically, we show that a FedEx variant correctly tunes the
on-device learning rate in the setting of online convex optimization across
devices. Empirically, we show that FedEx can outperform natural baselines for
federated hyperparameter tuning by several percentage points on the
Shakespeare, FEMNIST, and CIFAR-10 benchmarks, obtaining higher accuracy using
the same training budget.
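Below is a minimal sketch of the kind of round this weight-sharing connection suggests: a shared global model is updated FedAvg-style while a categorical distribution over candidate hyperparameter configurations is updated with an exponentiated-gradient step from per-client validation feedback. All names (fedex_round, local_train, local_val_loss) and the exact update rule are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

def fedex_round(global_weights, logits, configs, clients,
                local_train, local_val_loss, step_size=0.1):
    """One FedAvg-style round that also updates a categorical distribution
    over candidate hyperparameter configurations (illustrative sketch)."""
    probs = softmax(logits)
    updates, sizes = [], []
    grad = np.zeros_like(logits)
    for client in clients:
        c = np.random.choice(len(configs), p=probs)      # each client samples a config
        new_w = local_train(global_weights, client, configs[c])
        loss = local_val_loss(new_w, client)             # held-out feedback for that config
        updates.append(new_w)
        sizes.append(client.num_examples)
        # REINFORCE-style estimate of d E[validation loss] / d logits
        grad += loss * (np.eye(len(configs))[c] - probs)
    sizes = np.asarray(sizes, dtype=float)
    new_global = np.average(np.stack(updates), axis=0, weights=sizes)   # FedAvg aggregation
    # gradient step on the logits, i.e. a multiplicative-weights /
    # exponentiated-gradient update on the configuration probabilities
    new_logits = logits - step_size * grad / len(clients)
    return new_global, new_logits
```

The sketch omits details such as the baseline subtracted from the validation losses to reduce gradient variance and the wrapping of this update inside a standard tuning algorithm; see the paper for the actual FedEx procedure.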
Related papers
- Efficient Asynchronous Federated Learning with Sparsification and
Quantization [55.6801207905772]
Federated Learning (FL) is attracting growing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which lets edge devices participate in training asynchronously by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
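The summary names three ingredients (asynchrony, sparsification, quantization); the sketch below illustrates only that generic pattern. The top-k sparsifier, int8 quantizer, staleness weighting, and every name here are assumptions for illustration, not TEASQ-Fed's actual scheme.

```python
import numpy as np

def compress_update(update, keep_frac=0.1, levels=256):
    """Sparsify an update to its largest-magnitude coordinates, then
    quantize the kept values to int8 (illustrative, not TEASQ-Fed's scheme)."""
    flat = update.ravel()
    keep = max(1, int(keep_frac * flat.size))
    idx = np.argpartition(np.abs(flat), -keep)[-keep:]
    vals = flat[idx]
    scale = max(float(np.abs(vals).max()), 1e-12) / (levels // 2 - 1)
    q = np.round(vals / scale).astype(np.int8)
    return idx, q, scale, update.shape

def decompress_update(idx, q, scale, shape):
    """Rebuild a dense update from the sparse, quantized representation."""
    out = np.zeros(int(np.prod(shape)), dtype=np.float32)
    out[idx] = q.astype(np.float32) * scale
    return out.reshape(shape)

def apply_async(global_model, compressed, staleness, lr=1.0):
    """Apply a client's update as soon as it arrives, down-weighting
    contributions computed against an older global model."""
    update = decompress_update(*compressed)
    return global_model + (lr / (1.0 + staleness)) * update
```

In an asynchronous setting the server would call apply_async whenever a compressed update arrives rather than waiting for a full cohort.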
- FedPEAT: Convergence of Federated Learning, Parameter-Efficient Fine
Tuning, and Emulator Assisted Tuning for Artificial Intelligence Foundation
Models with Mobile Edge Computing [20.06372852684181]
We introduce Emulator-Assisted Tuning and Federated PEAT (FedPEAT).
FedPEAT uses adapters, emulators, and PEFT for federated model tuning, enhancing model privacy and memory efficiency.
We tested FedPEAT in a unique scenario with a server participating in collaborative tuning.
arXiv Detail & Related papers (2023-10-26T15:47:44Z)
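As a rough illustration of the parameter-efficient pattern the summary points to, the sketch below aggregates only lightweight adapter tensors while the frozen backbone stays on device; local_finetune, the dict layout, and the plain weighted average are assumptions, not the FedPEAT protocol.

```python
import numpy as np

def federated_adapter_round(adapter_params, clients, local_finetune):
    """Aggregate only the lightweight adapter weights (PEFT-style);
    the frozen foundation-model backbone never leaves the device.
    Illustrative sketch only."""
    updates, sizes = [], []
    for client in clients:
        # local_finetune trains just the adapters against the client's data
        new_adapters = local_finetune(adapter_params, client)
        updates.append(new_adapters)
        sizes.append(client.num_examples)
    sizes = np.asarray(sizes, dtype=float)
    # weighted average over each adapter tensor
    return {name: np.average(np.stack([u[name] for u in updates]),
                             axis=0, weights=sizes)
            for name in adapter_params}
```

Only the adapter dictionary, typically a small fraction of the model's parameters, is communicated each round, which is where the memory and privacy benefits of parameter-efficient federated tuning are usually claimed.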
- Profit: Benchmarking Personalization and Robustness Trade-off in
Federated Prompt Tuning [40.16581292336117]
In many applications of federated learning (FL), clients desire models that are personalized using their local data, yet are also robust in the sense that they retain general global knowledge.
It is critical to understand how to navigate this personalization vs robustness trade-off when designing federated systems.
arXiv Detail & Related papers (2023-10-06T23:46:33Z)
- HyperTuner: A Cross-Layer Multi-Objective Hyperparameter Auto-Tuning
Framework for Data Analytic Services [25.889791254011794]
We propose HyperTuner to perform cross-layer multi-objective hyperparameter auto-tuning.
We show that HyperTuner is superior in both convergence and diversity compared with four baseline algorithms.
Experiments with different training datasets, optimization objectives, and machine learning platforms verify that HyperTuner adapts well to various data analytic service scenarios.
arXiv Detail & Related papers (2023-04-20T02:19:10Z)
- Federated Hypergradient Descent [0.0]
We apply a principled approach to adaptively setting the client learning rate, number of local steps, and batch size.
In our federated learning applications, our primary motivations are minimizing the communication budget as well as local computational resources in the training pipeline.
We present numerical results from extensive experiments with the Federated EMNIST-62 (FEMNIST) and Federated Stack Overflow (FSO) datasets.
arXiv Detail & Related papers (2022-11-03T19:22:00Z)
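The summary mentions adapting the client learning rate; one standard rule in this spirit is hypergradient descent, which nudges the step size using the inner product of consecutive gradients. The sketch below shows that generic rule with illustrative names; it is not necessarily the paper's exact update.

```python
import numpy as np

def hypergradient_sgd_step(w, grad, prev_grad, lr, hyper_lr=1e-4):
    """One SGD step whose learning rate is itself adapted online: lr grows
    when consecutive gradients agree and shrinks when they disagree
    (hypergradient-descent style; illustrative sketch, flat gradient vectors)."""
    if prev_grad is not None:
        lr = lr + hyper_lr * float(np.dot(grad, prev_grad))
        lr = max(lr, 1e-8)          # keep the step size positive
    w = w - lr * grad
    return w, lr, grad              # grad becomes prev_grad at the next step
```

In a federated pipeline each client would carry its own (lr, prev_grad) state across local steps.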
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms and can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning.
We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times and speedups of 3×-30×.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
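A hedged sketch of the generic idea behind gradient-based subset selection: greedily pick examples whose reweighted summed gradients approximate the full-data gradient, and tune hyperparameters on the resulting subset. The greedy rule and names here are assumptions, not AUTOMATA's algorithm.

```python
import numpy as np

def greedy_gradient_subset(per_example_grads, budget):
    """Greedily pick `budget` examples whose (reweighted) summed gradients
    best match the full-dataset gradient (illustrative sketch only).
    per_example_grads is an (n, d) array of flattened gradients."""
    n = len(per_example_grads)
    full = per_example_grads.sum(axis=0)
    chosen, current = [], np.zeros_like(full)
    remaining = set(range(n))
    for _ in range(budget):
        scale = n / (len(chosen) + 1)          # reweight the partial subset
        best, best_err = None, np.inf
        for i in remaining:
            err = np.linalg.norm(full - scale * (current + per_example_grads[i]))
            if err < best_err:
                best, best_err = i, err
        chosen.append(best)
        current += per_example_grads[best]
        remaining.remove(best)
    return chosen
```

Hyperparameter configurations would then be trained and compared on the selected indices instead of the full dataset, which is where the reported turnaround-time savings come from.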
- Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for
Hyperparameter Recommendation [83.85021205445662]
We propose amortized auto-tuning (AT2) to speed up the tuning of machine learning models.
We conduct a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework, which leads to AT2 as its best instantiation.
arXiv Detail & Related papers (2021-06-17T00:01:18Z)
- Robust Federated Learning Through Representation Matching and Adaptive
Hyper-parameters [5.319361976450981]
Federated learning is a distributed, privacy-aware learning scenario which trains a single model on data belonging to several clients.
Current federated learning methods struggle in cases with heterogeneous client-side data distributions.
We propose a novel representation matching scheme that reduces the divergence of local models.
arXiv Detail & Related papers (2019-12-30T20:19:20Z)
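One simple way to "reduce the divergence of local models" is to penalize the distance between a client model's intermediate representations and those of the received global model on the same batch; the sketch below shows that generic penalty with illustrative names, not necessarily the paper's scheme.

```python
import numpy as np

def matched_local_loss(task_loss, local_feats, global_feats, lam=0.1):
    """Client-side objective: the usual task loss plus a penalty keeping the
    local model's batch representations close to the global model's
    (illustrative sketch of a representation-matching term)."""
    divergence = float(np.mean((local_feats - global_feats) ** 2))
    return task_loss + lam * divergence
```

During local training the client would minimize this combined objective, so locally adapted models stay anchored to shared representations even under heterogeneous data.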