FedYolo: Augmenting Federated Learning with Pretrained Transformers
- URL: http://arxiv.org/abs/2307.04905v1
- Date: Mon, 10 Jul 2023 21:08:52 GMT
- Title: FedYolo: Augmenting Federated Learning with Pretrained Transformers
- Authors: Xuechen Zhang, Mingchen Li, Xiangyu Chang, Jiasi Chen, Amit K.
Roy-Chowdhury, Ananda Theertha Suresh, Samet Oymak
- Abstract summary: In this work, we investigate pretrained transformers (PTF) to achieve on-device learning goals.
We show that larger scale shrinks the accuracy gaps between alternative approaches and improves robustness.
Finally, modularity enables clients to solve multiple unrelated tasks simultaneously using a single PTF.
- Score: 61.56476056444933
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The growth and diversity of machine learning applications motivate a
rethinking of learning with mobile and edge devices. How can we address diverse
client goals and learn with scarce heterogeneous data? While federated learning
aims to address these issues, it has challenges hindering a unified solution.
Large transformer models have been shown to work across a variety of tasks
achieving remarkable few-shot adaptation. This raises the question: Can clients
use a single general-purpose model, rather than custom models for each task,
while obeying device and network constraints? In this work, we investigate
pretrained transformers (PTF) to achieve these on-device learning goals and
thoroughly explore the roles of model size and modularity, where the latter
refers to adaptation through modules such as prompts or adapters. Focusing on
federated learning, we demonstrate that: (1) Larger scale shrinks the accuracy
gaps between alternative approaches and improves heterogeneity robustness.
Scale allows clients to run more local SGD epochs, which can significantly
reduce the number of communication rounds. At the extreme, clients can achieve
respectable accuracy locally, highlighting the potential of fully-local
learning. (2) Modularity, by design, enables $>$100$\times$ less communication
in bits. Surprisingly, it also boosts the generalization capability of local
adaptation methods and the robustness of smaller PTFs. Finally, it enables
clients to solve multiple unrelated tasks simultaneously using a single PTF,
whereas full updates are prone to catastrophic forgetting. These insights on
scale and modularity motivate a new federated learning approach we call "You
Only Load Once" (FedYolo): The clients load a full PTF model once and all
future updates are accomplished through communication-efficient modules with
limited catastrophic forgetting, where each task is assigned to its own module.
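To make the modular-update idea concrete, below is a minimal sketch (assumed PyTorch pseudocode, not the authors' released implementation) of one FedYolo-style round: each client loads a frozen pretrained backbone once, trains only a small per-task adapter and classification head locally, and the server aggregates just those module parameters. The toy backbone stands in for a PTF, and names such as Adapter, local_update, and fedavg_modules are illustrative assumptions. As a rough size check, a ViT-Base backbone has roughly 86M parameters while a bottleneck adapter plus head is typically well under 1M, which is consistent with the $>$100$\times$ communication savings in bits claimed above.

```python
# Illustrative sketch of modular federated updates with a frozen backbone.
# Not the authors' code; the backbone below is a toy stand-in for a PTF.
import copy
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Small bottleneck module; only these parameters are communicated."""
    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual adaptation


class ModularClient(nn.Module):
    """Frozen shared backbone plus a trainable adapter and head for one task."""
    def __init__(self, backbone: nn.Module, dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():   # loaded once, never updated
            p.requires_grad_(False)
        self.adapter = Adapter(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        return self.head(self.adapter(self.backbone(x)))


def local_update(client, loader, epochs=1, lr=1e-3):
    """Local SGD on module parameters only; returns just the module weights."""
    opt = torch.optim.SGD(
        [p for p in client.parameters() if p.requires_grad], lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(client(x), y).backward()
            opt.step()
    # Only adapter + head weights are uploaded (a tiny fraction of the PTF).
    return {k: v.detach().clone() for k, v in client.state_dict().items()
            if not k.startswith("backbone")}


def fedavg_modules(module_states):
    """Server-side averaging of the communicated module parameters."""
    avg = copy.deepcopy(module_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key] for s in module_states]).mean(dim=0)
    return avg


if __name__ == "__main__":
    dim, num_classes = 32, 4
    backbone = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())  # PTF stand-in
    clients = [ModularClient(backbone, dim, num_classes) for _ in range(3)]
    # Toy local datasets, one batch per client.
    loaders = [[(torch.randn(8, dim), torch.randint(0, num_classes, (8,)))]
               for _ in clients]
    states = [local_update(c, l) for c, l in zip(clients, loaders)]
    global_modules = fedavg_modules(states)
    for c in clients:
        c.load_state_dict(global_modules, strict=False)  # broadcast modules only
```

In this scheme each new task would get its own adapter/head pair while the same frozen backbone is reused, which is what limits interference (catastrophic forgetting) between unrelated tasks.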
Related papers
- Collaborative and Efficient Personalization with Mixtures of Adaptors [5.195669033269619]
We propose a parameter-efficient framework to tackle multi-task learning problems.
We call our framework Federated Low-Rank Adaptive Learning (FLoRAL).
We show promising experimental results on synthetic datasets and real-world federated multi-task problems.
arXiv Detail & Related papers (2024-10-04T15:11:15Z) - SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models [71.78800549517298]
Continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world.
Existing methods devise a learning module to acquire task-specific knowledge with a parameter-efficient tuning (PET) block, and a selection module to pick out the corresponding block for the testing input.
We propose a novel Shared Attention Framework (SAPT) to align PET learning and selection via the Shared Attentive Learning & Selection module.
arXiv Detail & Related papers (2024-01-16T11:45:03Z) - FedBone: Towards Large-Scale Federated Multi-Task Learning [13.835972363413884]
In real-world applications, visual and natural language tasks typically require large-scale models to extract high-level abstract features.
Existing heterogeneous federated multi-task learning (HFML) methods disregard the impact of gradient conflicts on multi-task optimization.
We propose an innovative framework called FedBone, which enables the construction of large-scale models with better generalization.
arXiv Detail & Related papers (2023-06-30T08:19:38Z) - eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We propose to direct effort toward efficient adaptation of existing models, and to augment Language Models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
arXiv Detail & Related papers (2023-03-20T19:20:34Z) - Meta Knowledge Condensation for Federated Learning [65.20774786251683]
Existing federated learning paradigms usually extensively exchange distributed models at a central server to achieve a more powerful model.
This incurs a severe communication burden between the server and multiple clients, especially when data distributions are heterogeneous.
Unlike existing paradigms, we introduce an alternative perspective that significantly decreases the communication cost in federated learning.
arXiv Detail & Related papers (2022-09-29T15:07:37Z) - No One Left Behind: Inclusive Federated Learning over Heterogeneous
Devices [79.16481453598266]
We propose InclusiveFL, a client-inclusive federated learning method to handle heterogeneous device capabilities.
The core idea of InclusiveFL is to assign models of different sizes to clients with different computing capabilities.
We also propose an effective method to share knowledge among multiple local models of different sizes.
arXiv Detail & Related papers (2022-02-16T13:03:27Z) - Comfetch: Federated Learning of Large Networks on Constrained Clients
via Sketching [28.990067638230254]
Federated learning (FL) is a popular paradigm for private and collaborative model training on the edge.
We propose a novel algorithm, Comfetch, which allows clients to train large networks using sketched representations of the global neural network.
arXiv Detail & Related papers (2021-09-17T04:48:42Z) - Federated Few-Shot Learning with Adversarial Learning [30.905239262227]
We propose a federated few-shot learning framework to learn a classification model that can classify unseen data classes with only a few labeled samples.
We show our approach outperforms baselines by more than 10% on vision tasks and 5% on language tasks.
arXiv Detail & Related papers (2021-04-01T09:44:57Z) - UPDeT: Universal Multi-agent Reinforcement Learning via Policy
Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the multi-agent decision process more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.