Riemannian Low-Rank Model Compression for Federated Learning with
Over-the-Air Aggregation
- URL: http://arxiv.org/abs/2306.02433v1
- Date: Sun, 4 Jun 2023 18:32:50 GMT
- Title: Riemannian Low-Rank Model Compression for Federated Learning with
Over-the-Air Aggregation
- Authors: Ye Xue, Vincent Lau
- Abstract summary: Low-rank model compression is a widely used technique for reducing the computational load when training machine learning models.
Existing compression techniques are not directly applicable to efficient over-the-air (OTA) aggregation in federated learning systems.
We propose a novel manifold optimization formulation for low-rank model compression in FL that does not relax the low-rank constraint.
- Score: 2.741266294612776
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Low-rank model compression is a widely used technique for reducing the
computational load when training machine learning models. However, existing
methods often rely on relaxing the low-rank constraint of the model weights
using a regularized nuclear norm penalty, which requires an appropriate
hyperparameter that can be difficult to determine in practice. Furthermore,
existing compression techniques are not directly applicable to efficient
over-the-air (OTA) aggregation in federated learning (FL) systems for
distributed Internet-of-Things (IoT) scenarios. In this paper, we propose a
novel manifold optimization formulation for low-rank model compression in FL
that does not relax the low-rank constraint. Our optimization is conducted
directly over the low-rank manifold, guaranteeing that the model is exactly
low-rank. We also introduce a consensus penalty in the optimization formulation
to support OTA aggregation. Based on our optimization formulation, we propose
an alternating Riemannian optimization algorithm with a precoder that enables
efficient OTA aggregation of low-rank local models without sacrificing training
performance. Additionally, we provide convergence analysis in terms of key
system parameters and conduct extensive experiments with real-world datasets to
demonstrate the effectiveness of our proposed Riemannian low-rank model
compression scheme compared to various state-of-the-art baselines.
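To make the formulation concrete, here is a minimal numerical sketch (not the authors' algorithm or precoder design) of the two ingredients described above: each client keeps its model exactly rank-r by retracting onto the fixed-rank manifold with a truncated SVD, and a consensus penalty pulls the local models toward their average, which stands in for an idealized, noise-free over-the-air aggregation. The least-squares loss, the step size `lr`, the penalty weight `rho`, and the function names are illustrative assumptions, and the update is a projected-gradient surrogate rather than a true Riemannian gradient step on the tangent space.

```python
import numpy as np

def truncated_svd(W, r):
    """Retract a matrix onto the rank-r manifold via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

def local_grad(W, X, Y):
    """Gradient of a least-squares local loss ||X W - Y||_F^2 / n."""
    return 2.0 * X.T @ (X @ W - Y) / X.shape[0]

def local_step(W, X, Y, W_bar, r, lr=0.1, rho=1.0):
    """One local update: loss gradient plus consensus penalty rho * ||W - W_bar||_F^2,
    followed by a truncated-SVD retraction so W stays exactly rank r."""
    G = local_grad(W, X, Y) + 2.0 * rho * (W - W_bar)
    return truncated_svd(W - lr * G, r)

rng = np.random.default_rng(0)
d, m, r, K = 20, 5, 3, 4                      # input dim, output dim, target rank, clients
W_true = rng.normal(size=(d, r)) @ rng.normal(size=(r, m))
data = []
for _ in range(K):
    X = rng.normal(size=(50, d))
    data.append((X, X @ W_true + 0.01 * rng.normal(size=(50, m))))

W_locals = [truncated_svd(rng.normal(size=(d, m)), r) for _ in range(K)]
for _ in range(200):
    W_bar = sum(W_locals) / K                 # idealized, noise-free stand-in for OTA averaging
    W_locals = [local_step(W, X, Y, W_bar, r) for W, (X, Y) in zip(W_locals, data)]

print("rank of each local model:", [np.linalg.matrix_rank(W) for W in W_locals])
print("relative error vs. W_true:", np.linalg.norm(W_locals[0] - W_true) / np.linalg.norm(W_true))
```

In this sketch only the local models are guaranteed to be exactly rank-r; the paper's precoder and over-the-air design, which make the aggregation itself efficient, are not modeled here.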
Related papers
- LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks.
We propose a novel approach that employs a low rank tensor parametrization for model updates (a generic low-rank-update sketch appears after this list).
Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
- SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models and find that a subset of them is ineffective for the generative process.
We propose a novel model fine-tuning method that makes full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z)
- Edge-Efficient Deep Learning Models for Automatic Modulation Classification: A Performance Analysis [0.7428236410246183]
We investigate optimized convolutional neural networks (CNNs) developed for automatic modulation classification (AMC) of wireless signals.
We propose optimized models that combine these techniques to fuse their complementary optimization benefits.
The experimental results show that the proposed individual and combined optimization techniques are highly effective for developing models with significantly lower complexity.
arXiv Detail & Related papers (2024-04-11T06:08:23Z)
- Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models [9.91972450276408]
This paper introduces an innovative approach for the parametric and practical compression of Large Language Models (LLMs) based on reduced order modelling.
Our method represents a significant advancement in model compression by leveraging matrix decomposition, demonstrating superior efficacy compared to the prevailing state-of-the-art structured pruning method.
arXiv Detail & Related papers (2023-12-12T07:56:57Z)
- Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models [109.06052781040916]
We introduce a technique to enhance the inference efficiency of parameter-shared language models.
We also propose a simple pre-training technique that leads to fully or partially shared models.
Results demonstrate the effectiveness of our methods on both autoregressive and autoencoding PLMs.
arXiv Detail & Related papers (2023-10-19T15:13:58Z)
- Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST), a recently proposed and highly effective technique for reducing communication and computation costs in distributed training.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
- Learning Accurate Performance Predictors for Ultrafast Automated Model Compression [86.22294249097203]
We propose an ultrafast automated model compression framework called SeerNet for flexible network deployment.
Our method achieves competitive accuracy-complexity trade-offs with significant reduction of the search cost.
arXiv Detail & Related papers (2023-04-13T10:52:49Z)
- Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models [7.6356407698088]
Pruning unnecessary parameters has emerged as a simple and effective method for compressing large models.
We show that optimizing for flat minima consistently leads to greater compressibility of parameters compared to standard Adam optimization.
arXiv Detail & Related papers (2022-05-25T11:54:37Z)
- Compression-aware Training of Neural Networks using Frank-Wolfe [27.69586583737247]
We propose a framework that encourages convergence to well-performing solutions while inducing robustness towards filter pruning and low-rank matrix decomposition.
Our method is able to outperform existing compression-aware approaches and, in the case of low-rank matrix decomposition, it also requires significantly less computational resources than approaches based on nuclear-norm regularization.
arXiv Detail & Related papers (2022-05-24T09:29:02Z)
- Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization [21.81192774458227]
One of the major bottlenecks is the large communication cost between the central server and the local workers.
Our proposed distributed learning framework features an effective gradient compression strategy.
arXiv Detail & Related papers (2021-11-01T04:54:55Z)
- Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)
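Several of the related papers above (LoRTA in particular) parametrize model updates in low-rank form rather than compressing the full weights. As a hedged, generic illustration of that parametrization, the sketch below shows a plain LoRA-style matrix update, not LoRTA's tensor factorization; the class and variable names are illustrative assumptions.

```python
# Generic low-rank-update sketch (LoRA-style): freeze a pretrained weight W0 and
# train only a rank-r update B @ A. LoRTA itself factorizes the updates as a
# tensor, which this matrix-only illustration does not reproduce.
import numpy as np

class LowRankAdapter:
    """Frozen weight W0 plus a trainable rank-r update B @ A."""
    def __init__(self, W0, r, rng):
        self.W0 = W0                                        # frozen (d_out x d_in)
        self.A = 0.01 * rng.normal(size=(r, W0.shape[1]))   # trainable (r x d_in)
        self.B = np.zeros((W0.shape[0], r))                 # trainable (d_out x r), zero init

    def forward(self, x):
        # Effective weight is W0 + B @ A; only A and B would receive gradient updates.
        return (self.W0 + self.B @ self.A) @ x

    def num_trainable(self):
        return self.A.size + self.B.size

rng = np.random.default_rng(0)
d_out, d_in, r = 512, 512, 8
adapter = LowRankAdapter(rng.normal(size=(d_out, d_in)), r, rng)
print("full weight parameters:      ", d_out * d_in)             # 262144
print("trainable adapter parameters:", adapter.num_trainable())  # 8192
```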