FedMef: Towards Memory-efficient Federated Dynamic Pruning
- URL: http://arxiv.org/abs/2403.14737v1
- Date: Thu, 21 Mar 2024 13:54:36 GMT
- Title: FedMef: Towards Memory-efficient Federated Dynamic Pruning
- Authors: Hong Huang, Weiming Zhuang, Chen Chen, Lingjuan Lyu
- Abstract summary: Federated learning (FL) promotes decentralized training while prioritizing data confidentiality.
Its application on resource-constrained devices is challenging due to the high demand for computation and memory resources to train deep learning models.
We propose FedMef, a novel and memory-efficient federated dynamic pruning framework.
- Score: 42.07105095641134
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) promotes decentralized training while prioritizing data confidentiality. However, its application on resource-constrained devices is challenging due to the high demand for computation and memory resources to train deep learning models. Neural network pruning techniques, such as dynamic pruning, could enhance model efficiency, but directly adopting them in FL still poses substantial challenges, including post-pruning performance degradation, high activation memory usage, etc. To address these challenges, we propose FedMef, a novel and memory-efficient federated dynamic pruning framework. FedMef comprises two key components. First, we introduce the budget-aware extrusion that maintains pruning efficiency while preserving post-pruning performance by salvaging crucial information from parameters marked for pruning within a given budget. Second, we propose scaled activation pruning to effectively reduce activation memory footprints, which is particularly beneficial for deploying FL to memory-limited devices. Extensive experiments demonstrate the effectiveness of our proposed FedMef. In particular, it achieves a significant reduction of 28.5% in memory footprint compared to state-of-the-art methods while obtaining superior accuracy.
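The abstract only sketches the two components at a high level. The snippet below is a minimal NumPy sketch of one possible reading of them, assuming magnitude-based pruning with an extra decay ("extrusion") term on weights marked for pruning, and top-k caching of activations rescaled by the keep ratio. The function names, update rules, and hyperparameters are illustrative assumptions, not the paper's algorithm.

```python
# Illustrative sketch only: an assumed interpretation of FedMef's two components
# based solely on the abstract, not the authors' implementation.
import numpy as np

def budget_aware_extrusion(weights, grads, prune_ratio, lr=0.01, extrusion_coeff=0.1):
    """Mark the lowest-magnitude weights for pruning and add an extra decay term
    on them, so later gradient steps can absorb ("extrude") their information into
    the surviving weights. Assumes 0 <= prune_ratio < 1."""
    k = int(prune_ratio * weights.size)
    threshold = np.sort(np.abs(weights).ravel())[k]
    marked = np.abs(weights) < threshold          # weights scheduled for pruning
    update = lr * grads + lr * extrusion_coeff * weights * marked
    return weights - update, marked

def scaled_activation_pruning(activations, keep_ratio=0.5):
    """Keep only the largest-magnitude activations cached for the backward pass
    and rescale the survivors by 1/keep_ratio so their expected magnitude is
    roughly preserved (assumed meaning of "scaled" pruning)."""
    k = max(1, int(keep_ratio * activations.size))
    threshold = np.sort(np.abs(activations).ravel())[-k]
    mask = np.abs(activations) >= threshold
    return activations * mask / keep_ratio        # sparse cache: ~keep_ratio entries survive

# Toy usage: one layer's weights/gradients and a cached activation map.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
g = rng.normal(size=(8, 8))
w, marked = budget_aware_extrusion(w, g, prune_ratio=0.3)
a = scaled_activation_pruning(rng.normal(size=(4, 16)), keep_ratio=0.25)
print(marked.sum(), "weights marked; cached activation nonzeros:", np.count_nonzero(a))
```

In a federated setting one would presumably run the extrusion step on each client for a bounded number of iterations (the "budget") before the pruning mask is applied, and apply the activation pruning to the tensors cached for backpropagation rather than to the forward outputs; both points are assumptions, as the abstract does not specify them.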
Related papers
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- When Foresight Pruning Meets Zeroth-Order Optimization: Efficient Federated Learning for Low-Memory Devices [36.23767349592602]
Federated Learning (FL) enables collaborative learning in Artificial Intelligence of Things (AIoT) design.
However, FL fails to work on low-memory AIoT devices due to its heavy memory usage.
We propose a federated foresight pruning method based on Neural Tangent Kernel (NTK), which can seamlessly integrate with federated BP-Free training frameworks.
arXiv Detail & Related papers (2024-05-08T02:24:09Z)
- Contractive error feedback for gradient compression [60.05809370598166]
We propose a communication efficient method called contractive error feedback (ConEF)
Unlike SGD with error feedback (EFSGD), which manages memory inefficiently, ConEF strikes a balance between convergence and memory usage.
We empirically validate ConEF on various learning tasks that include image classification, language modeling, and machine translation.
arXiv Detail & Related papers (2023-12-13T21:54:21Z)
- MF-NeRF: Memory Efficient NeRF with Mixed-Feature Hash Table [62.164549651134465]
We propose MF-NeRF, a memory-efficient NeRF framework that employs a Mixed-Feature hash table to improve memory efficiency and reduce training time while maintaining reconstruction quality.
Our experiments with state-of-the-art Instant-NGP, TensoRF, and DVGO indicate that MF-NeRF achieves the fastest training time on the same GPU hardware with similar or even higher reconstruction quality.
arXiv Detail & Related papers (2023-04-25T05:44:50Z)
- Distributed Pruning Towards Tiny Neural Networks in Federated Learning [12.63559789381064]
FedTiny is a distributed pruning framework for federated learning.
It generates specialized tiny models for memory- and computing-constrained devices.
It achieves an accuracy improvement of 2.61% while significantly reducing the computational cost by 95.91%.
arXiv Detail & Related papers (2022-12-05T01:58:45Z)
- Self-Attentive Pooling for Efficient Deep Learning [6.822466048176652]
We propose a novel non-local self-attentive pooling method that can be used as a drop-in replacement to the standard pooling layers.
We surpass the test accuracy of existing pooling techniques on different variants of MobileNet-V2 on ImageNet by an average of 1.2%.
Our approach achieves 1.43% higher test accuracy compared to SOTA techniques with iso-memory footprints.
arXiv Detail & Related papers (2022-09-16T00:35:14Z)
- FedDUAP: Federated Learning with Dynamic Update and Adaptive Pruning Using Shared Data on the Server [64.94942635929284]
Federated Learning (FL) suffers from two critical challenges, i.e., limited computational resources and low training efficiency.
We propose a novel FL framework, FedDUAP, to exploit the insensitive data on the server and the decentralized data in edge devices.
By integrating the two techniques, FedDUAP significantly outperforms baseline approaches in accuracy (up to 4.8% higher), efficiency (up to 2.8 times faster), and computational cost (up to 61.9% smaller).
arXiv Detail & Related papers (2022-04-25T10:00:00Z)
- Fast and Memory-Efficient Network Towards Efficient Image Super-Resolution [44.909233016062906]
We build a memory-efficient image super-resolution network (FMEN) for resource-constrained devices.
FMEN runs 33% faster and reduces memory consumption by 74% compared with the state-of-the-art EISR model E-RFDN.
FMEN-S achieves the lowest memory consumption and the second shortest runtime in NTIRE 2022 challenge on efficient super-resolution.
arXiv Detail & Related papers (2022-04-18T16:49:20Z)
- Mesa: A Memory-saving Training Framework for Transformers [58.78933015299703]
We present Mesa, a memory-saving training framework for Transformers.
Mesa uses exact activations during the forward pass while storing a low-precision version of activations to reduce memory consumption during training.
Experiments on ImageNet, CIFAR-100 and ADE20K demonstrate that Mesa can roughly halve the memory footprint during training.
arXiv Detail & Related papers (2021-11-22T11:23:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.