Federated Fine-Tuning of Foundation Models via Probabilistic Masking
- URL: http://arxiv.org/abs/2311.17299v1
- Date: Wed, 29 Nov 2023 01:10:39 GMT
- Title: Federated Fine-Tuning of Foundation Models via Probabilistic Masking
- Authors: Vasileios Tsouvalas, Yuki Asano, Aaqib Saeed
- Abstract summary: Foundation Models (FMs) have revolutionized machine learning with their adaptability and high performance across tasks.
Their integration into Federated Learning (FL) is challenging due to substantial communication overhead from their extensive parameterization.
We present DeltaMask, a novel method that efficiently fine-tunes FMs in FL at an ultra-low bitrate, well below 1 bpp.
- Score: 11.192113661738764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation Models (FMs) have revolutionized machine learning with their
adaptability and high performance across tasks; yet, their integration into
Federated Learning (FL) is challenging due to substantial communication
overhead from their extensive parameterization. Current communication-efficient
FL strategies, such as gradient compression, reduce bitrates to around $1$
bit-per-parameter (bpp). However, these approaches fail to harness the
characteristics of FMs, with their large number of parameters still posing a
challenge to communication efficiency, even at these bitrate regimes. In this
work, we present DeltaMask, a novel method that efficiently fine-tunes FMs in
FL at an ultra-low bitrate, well below 1 bpp. DeltaMask employs stochastic
masking to detect highly effective subnetworks within FMs and leverages
stochasticity and sparsity in client masks to compress updates into a compact
grayscale image using probabilistic filters, deviating from traditional weight
training approaches. Our comprehensive evaluations across various datasets and
architectures demonstrate that DeltaMask efficiently achieves bitrates as low as
0.09 bpp, enhancing communication efficiency while maintaining FM performance,
as measured on 8 datasets and 5 pre-trained models of various network
architectures.
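
To make the bitrate arithmetic in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a client samples a binary mask from per-parameter probabilities and sends only the positions where the mask changed since the previous round. The names `sample_mask`, `mask_delta`, `apply_delta`, and `bits_per_parameter`, the probability values, and the cost of 8 bits per transmitted entry are all hypothetical; the probabilistic-filter encoding described in the abstract is not reproduced here.

```python
import random


def sample_mask(probs, rng):
    """Draw a binary mask: entry i is 1 with probability probs[i]."""
    return [1 if rng.random() < p else 0 for p in probs]


def mask_delta(prev_mask, new_mask):
    """Return the indices at which the two masks differ."""
    if len(prev_mask) != len(new_mask):
        raise ValueError("masks must have the same length")
    return [i for i, (a, b) in enumerate(zip(prev_mask, new_mask)) if a != b]


def apply_delta(prev_mask, delta):
    """Rebuild the new mask by flipping the listed positions of the old one."""
    flipped = set(delta)
    return [1 - m if i in flipped else m for i, m in enumerate(prev_mask)]


def bits_per_parameter(num_changed, num_params, bits_per_entry):
    """Average bits sent per model parameter (bits_per_entry is an assumption)."""
    return num_changed * bits_per_entry / num_params


rng = random.Random(0)
num_params = 10_000
# Hypothetical probabilities close to 0 or 1, so few entries flip between rounds.
probs = [0.99 if i % 2 == 0 else 0.01 for i in range(num_params)]

prev_mask = sample_mask(probs, rng)
new_mask = sample_mask(probs, rng)
delta = mask_delta(prev_mask, new_mask)

# The receiver can recover the new mask from the old mask plus the delta.
assert apply_delta(prev_mask, delta) == new_mask

bpp = bits_per_parameter(len(delta), num_params, bits_per_entry=8)
print(f"changed entries: {len(delta)}, bitrate: {bpp:.3f} bpp")
```

Under these assumed numbers only a small fraction of entries differ between rounds, which is why sending the differences can cost less than one bit per parameter.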
Related papers
- FedPFT: Federated Proxy Fine-Tuning of Foundation Models [55.58899993272904]
Adapting Foundation Models (FMs) for downstream tasks through Federated Learning (FL) emerges as a promising strategy for protecting data privacy and valuable FMs.
Existing methods fine-tune FM by allocating sub-FM to clients in FL, leading to suboptimal performance due to insufficient tuning and inevitable error accumulations of gradients.
We propose Federated Proxy Fine-Tuning (FedPFT), a novel method enhancing FMs adaptation in downstream tasks through FL by two key modules.
arXiv Detail & Related papers (2024-04-17T16:30:06Z)
- Parametric Feature Transfer: One-shot Federated Learning with Foundation Models [14.97955440815159]
In one-shot federated learning, clients collaboratively train a global model in a single round of communication.
This paper introduces FedPFT, a methodology that harnesses the transferability of foundation models to enhance both accuracy and communication efficiency in one-shot FL.
arXiv Detail & Related papers (2024-02-02T19:34:46Z)
- A Masked Pruning Approach for Dimensionality Reduction in Communication-Efficient Federated Learning Systems [11.639503711252663]
Federated Learning (FL) represents a growing machine learning (ML) paradigm designed for training models across numerous nodes.
We develop a novel algorithm that overcomes limitations by combining a pruning-based method with the FL process.
We present an extensive experimental study demonstrating the superior performance of MPFL compared to existing methods.
arXiv Detail & Related papers (2023-12-06T20:29:23Z)
- Bridging the Gap Between Foundation Models and Heterogeneous Federated Learning [9.198799314774437]
Federated learning (FL) offers privacy-preserving decentralized machine learning, optimizing models at edge clients without sharing private data.
Foundation models (FMs) have gained traction in the artificial intelligence (AI) community due to their exceptional performance across various tasks.
We present an adaptive framework for Resource-aware Federated Foundation Models (RaFFM) to address these challenges.
arXiv Detail & Related papers (2023-09-30T04:31:53Z)
- Communication-Efficient Federated Learning via Regularized Sparse Random Networks [21.491346993533572]
This work presents a new method for enhancing communication efficiency in Federated Learning.
In this setting, a binary mask is optimized instead of the model weights, which are kept fixed.
Sparse binary masks are exchanged rather than the floating-point weights used in traditional federated learning.
arXiv Detail & Related papers (2023-09-19T14:05:12Z)
- Hierarchical Over-the-Air FedGradNorm [50.756991828015316]
Multi-task learning (MTL) is a learning paradigm to learn multiple related tasks simultaneously with a single shared network.
We propose hierarchical over-the-air (HOTA) PFL with a dynamic weighting strategy which we call HOTA-FedGradNorm.
arXiv Detail & Related papers (2022-12-14T18:54:46Z)
- HFedMS: Heterogeneous Federated Learning with Memorable Data Semantics in Industrial Metaverse [49.1501082763252]
This paper presents HFedMS for incorporating practical FL into the emerging Industrial Metaverse.
It reduces data heterogeneity through dynamic grouping and training mode conversion.
Then, it compensates for the forgotten knowledge by fusing compressed historical data semantics.
Experiments have been conducted on the streamed non-i.i.d. FEMNIST dataset using 368 simulated devices.
arXiv Detail & Related papers (2022-11-07T04:33:24Z)
- Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z)
- Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better [88.28293442298015]
Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.
We develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST).
FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network.
arXiv Detail & Related papers (2021-12-18T02:26:38Z)
- Low-Latency Federated Learning over Wireless Channels with Differential Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server.
In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
arXiv Detail & Related papers (2021-06-20T13:51:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.