Related papers: Efficient Federated Finetuning of Tiny Transformers with Resource-Constrained Devices

URL: http://arxiv.org/abs/2411.07826v1
Date: Tue, 12 Nov 2024 14:22:16 GMT
Title: Efficient Federated Finetuning of Tiny Transformers with Resource-Constrained Devices
Authors: Kilian Pfeiffer, Mohamed Aboelenien Ahmed, Ramin Khalili, Jörg Henkel
Abstract summary: Large Language Models (LLMs) require massive amounts of data for training and induce high resource requirements. To fine-tune such a model in a parameter-efficient way, techniques like Adapter or LoRA have been developed. We show that our presented scheme outperforms the current state of the art when dealing with homogeneous or heterogeneous computation and memory constraints.
Score: 10.676390348161888
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In recent years, Large Language Models (LLMs) through Transformer structures have dominated many machine learning tasks, especially text processing. However, these models require massive amounts of data for training and induce high resource requirements, particularly in terms of the large number of Floating Point Operations (FLOPs) and the high amounts of memory needed. To fine-tune such a model in a parameter-efficient way, techniques like Adapter or LoRA have been developed. However, we observe that the application of LoRA, when used in federated learning (FL), while still being parameter-efficient, is memory and FLOP inefficient. Based on that observation, we develop a novel layer finetuning scheme that allows devices in cross-device FL to make use of pretrained neural networks (NNs) while adhering to given resource constraints. We show that our presented scheme outperforms the current state of the art when dealing with homogeneous or heterogeneous computation and memory constraints and is on par with LoRA regarding limited communication, thereby achieving significantly higher accuracies in FL training.
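The abstract's observation that LoRA can be parameter-efficient yet memory and FLOP inefficient can be illustrated with a rough counting sketch. This is not the authors' scheme: the toy cost model, the function names, and the layer sizes below are all assumptions made for illustration. The sketch relies on one general fact: backpropagation has to run from the output back to the earliest trainable layer, so adapters placed in every layer force a backward pass (and stored activations) through the whole network, whereas training only the last layers does not.

```python
# Toy cost model (hypothetical, for illustration only): compares LoRA adapters
# in every layer against fully training only the last k layers.


def dense_params_per_layer(d_model: int) -> int:
    """Weights in one simplified Transformer layer, biases ignored.

    Assumes 4 attention projections (d x d) and a 2-matrix MLP with hidden
    size 4d, which gives 12 * d^2 weights.
    """
    return 4 * d_model * d_model + 2 * d_model * (4 * d_model)


def lora_trainable_params(num_layers: int, d_model: int, rank: int,
                          adapted_per_layer: int = 2) -> int:
    """Trainable weights when each adapted d x d matrix gets two low-rank
    factors of shape (rank x d) and (d x rank)."""
    return num_layers * adapted_per_layer * 2 * d_model * rank


def last_k_trainable_params(k: int, d_model: int) -> int:
    """Trainable weights when only the last k layers are fully trained."""
    return k * dense_params_per_layer(d_model)


def layers_needing_backward(trainable_layers, num_layers: int) -> int:
    """Number of layers the backward pass must traverse: everything from the
    earliest trainable layer (0-indexed) up to the output."""
    return num_layers - min(trainable_layers)


if __name__ == "__main__":
    num_layers, d_model, rank = 6, 256, 8  # assumed toy sizes

    print("LoRA in all layers:")
    print("  trainable params:", lora_trainable_params(num_layers, d_model, rank))
    print("  layers in backward pass:",
          layers_needing_backward(range(num_layers), num_layers))

    print("Last layer only:")
    print("  trainable params:", last_k_trainable_params(1, d_model))
    print("  layers in backward pass:",
          layers_needing_backward([num_layers - 1], num_layers))
```

Under these assumed sizes, LoRA trains far fewer weights (which is what keeps communication small), but its backward pass spans all six layers, while training only the last layer touches more weights with a backward pass through a single layer. Real memory and FLOP figures depend on sequence length, batch size, and implementation details that this sketch ignores.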

Related papers

Communication-Efficient Wireless Federated Fine-Tuning for Large-Scale AI Models [13.742950928229078]
Low-Rank Adaptation (LoRA) addresses these issues by training compact, low-rank matrices instead of fully fine-tuning large models. This paper introduces a wireless federated LoRA fine-tuning framework that optimizes both learning performance and communication efficiency.
arXiv Detail & Related papers (2025-05-01T06:15:38Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML). A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming. This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding. A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z)
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios. In the early route, intermediate outputs are consolidated via an anti-redundancy operation. In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
Save It All: Enabling Full Parameter Tuning for Federated Large Language Models via Cycle Block Gradient Descent [15.463595798992621]
Large language models (LLMs) have revolutionized the deep learning paradigm, yielding impressive results across a wide array of tasks. Existing solutions make the unrealistic assumption that the entire model is exchanged for training. We introduce a novel method for the efficient training and fine-tuning of LLMs in FL, with minimal resource consumption.
arXiv Detail & Related papers (2024-06-17T03:49:44Z)
SpaFL: Communication-Efficient Federated Learning with Sparse Models and Low computational Overhead [75.87007729801304]
SpaFL, a communication-efficient FL framework, is proposed to optimize sparse model structures with low computational overhead. Experiments show that SpaFL improves accuracy while requiring much less communication and computing resources compared to sparse baselines.
arXiv Detail & Related papers (2024-06-01T13:10:35Z)
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning [29.798780069556074]
Federated learning (FL) has emerged as a promising approach to collaboratively train machine learning models across multiple edge devices. We propose OnDev-LCT: Lightweight Convolutional Transformers for On-Device vision tasks with limited training data and resources.
arXiv Detail & Related papers (2024-01-22T02:17:36Z)
A Masked Pruning Approach for Dimensionality Reduction in Communication-Efficient Federated Learning Systems [11.639503711252663]
Federated Learning (FL) represents a growing machine learning (ML) paradigm designed for training models across numerous nodes. We develop a novel algorithm that overcomes limitations by combining a pruning-based method with the FL process. We present an extensive experimental study demonstrating the superior performance of MPFL compared to existing methods.
arXiv Detail & Related papers (2023-12-06T20:29:23Z)
Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy. We consider an FL framework with partial model pruning and personalization to overcome these challenges. This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning. We propose a method called SLoRA, which overcomes the key limitations of LoRA in highly heterogeneous data scenarios. Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
arXiv Detail & Related papers (2023-08-12T10:33:57Z)
Aggregating Capacity in FL through Successive Layer Training for Computationally-Constrained Devices [3.4530027457862]
Federated learning (FL) is usually performed on resource-constrained edge devices. The FL training process should be adjusted to such constraints. We propose a new method that enables successive freezing and training of the parameters of the FL model at devices.
arXiv Detail & Related papers (2023-05-26T15:04:06Z)
Automated Federated Learning in Mobile Edge Networks -- Fast Adaptation and Convergence [83.58839320635956]
Federated Learning (FL) can be used in mobile edge networks to train machine learning models in a distributed manner. Recent FL has been interpreted within a Model-Agnostic Meta-Learning (MAML) framework, which brings FL significant advantages in fast adaptation and convergence over heterogeneous datasets. This paper addresses how much benefit MAML brings to FL and how to maximize such benefit over mobile edge networks.
arXiv Detail & Related papers (2023-03-23T02:42:10Z)
Performance Optimization for Variable Bitwidth Federated Learning in Wireless Networks [103.22651843174471]
This paper considers improving wireless communication and computation efficiency in federated learning (FL) via model quantization. In the proposed bitwidth FL scheme, edge devices train and transmit quantized versions of their local FL model parameters to a coordinating server, which aggregates them into a quantized global model and synchronizes the devices. We show that the FL training process can be described as a Markov decision process and propose a model-based reinforcement learning (RL) method to optimize action selection over iterations.
arXiv Detail & Related papers (2022-09-21T08:52:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.