Mixed-Precision Quantization for Federated Learning on
Resource-Constrained Heterogeneous Devices
- URL: http://arxiv.org/abs/2311.18129v1
- Date: Wed, 29 Nov 2023 22:43:40 GMT
- Title: Mixed-Precision Quantization for Federated Learning on
Resource-Constrained Heterogeneous Devices
- Authors: Huancheng Chen and Haris Vikalo
- Abstract summary: We present a novel FL algorithm, FedMPQ, which introduces mixed-precision quantization to resource-heterogeneous FL systems.
Specifically, local models, quantized so as to satisfy bit-width constraints, are trained by optimizing an objective function.
To initialize the next round of local training, the server relies on the information learned in the previous training round to customize bit-width assignments of the models delivered to different clients.
- Score: 17.56259695496955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While federated learning (FL) systems often utilize quantization to battle
communication and computational bottlenecks, they have heretofore been limited
to deploying fixed-precision quantization schemes. Meanwhile, the concept of
mixed-precision quantization (MPQ), where different layers of a deep learning
model are assigned varying bit-widths, remains unexplored in FL settings. We
present a novel FL algorithm, FedMPQ, which introduces mixed-precision
quantization to resource-heterogeneous FL systems. Specifically, local models,
quantized so as to satisfy bit-width constraints, are trained by optimizing an
objective function that includes a regularization term which promotes reduction
of precision in some of the layers without significant performance degradation.
The server collects local model updates, de-quantizes them into full-precision
models, and then aggregates them into a global model. To initialize the next
round of local training, the server relies on the information learned in the
previous training round to customize bit-width assignments of the models
delivered to different clients. In extensive benchmarking experiments on
several model architectures and different datasets in both IID and non-IID
settings, FedMPQ outperformed the baseline FL schemes that utilize
fixed-precision quantization while incurring only a minor computational
overhead on the participating devices.
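The round structure described in the abstract can be pictured with a minimal NumPy sketch. The uniform quantizer, the precision-reduction regularizer (here, a penalty on the gap to a one-bit-lower representation), and the server's bit-width reassignment rule are simplifying assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def quantize(w, bits):
    """Uniform symmetric quantization of a weight array to the given bit-width."""
    scale = np.max(np.abs(w)) / (2.0 ** (bits - 1) - 1) + 1e-12
    return np.round(w / scale) * scale

def local_train(w_layers, bit_widths, grad_fn, lr=0.1, reg=1e-3, steps=10):
    """Quantization-aware local training with a hypothetical regularizer that
    penalizes the gap to a one-bit-lower representation, nudging tolerant
    layers toward lower precision (a stand-in for FedMPQ's actual term)."""
    w = [np.copy(l) for l in w_layers]
    for _ in range(steps):
        grads = grad_fn(w)
        for i, (g, b) in enumerate(zip(grads, bit_widths)):
            reg_grad = reg * (w[i] - quantize(w[i], max(b - 1, 2)))
            w[i] = quantize(w[i] - lr * (g + reg_grad), b)  # stay on the b-bit grid
    return w

def server_round(global_w, client_grad_fns, bit_budgets):
    """One FL round: clients train quantized models, the server averages the
    (already float-valued) de-quantized updates and reassigns bit-widths using
    the per-layer quantization error observed in this round (illustrative rule)."""
    updates, new_budgets = [], []
    for grad_fn, bits in zip(client_grad_fns, bit_budgets):
        w_local = local_train(global_w, bits, grad_fn)
        updates.append(w_local)
        err = [np.mean((l - quantize(l, max(b - 1, 2))) ** 2)
               for l, b in zip(w_local, bits)]
        bits = list(bits)
        k = int(np.argmin(err))
        bits[k] = max(bits[k] - 1, 2)  # shave a bit where it hurts least
        new_budgets.append(bits)
    new_global = [np.mean([u[i] for u in updates], axis=0)
                  for i in range(len(global_w))]
    return new_global, new_budgets

# Toy usage: two clients with quadratic losses pulling weights toward
# client-specific targets, two "layers", heterogeneous bit budgets.
rng = np.random.default_rng(0)
w0 = [rng.normal(size=4), rng.normal(size=4)]
clients = [lambda w, t=t: [l - t for l in w] for t in (0.5, -0.5)]
w1, budgets = server_round(w0, clients, [[8, 8], [4, 6]])
```

The moving parts mirror the abstract: clients train under per-layer bit-width constraints with a regularizer that encourages precision reduction, and the server aggregates de-quantized updates and customizes the next round's bit-width assignments from what it observed.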
Related papers
- Federated Bayesian Deep Learning: The Application of Statistical Aggregation Methods to Bayesian Models [0.9940108090221528]
Aggregation strategies have been developed to pool or fuse the weights and biases of distributed deterministic models.
We show that simple application of the aggregation methods associated with FL schemes for deterministic models is either impossible or results in sub-optimal performance.
arXiv Detail & Related papers (2024-03-22T15:02:24Z)
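The entry above argues that reusing deterministic aggregation rules for Bayesian models is inadequate. For contrast, here is a minimal sketch of one standard statistical alternative, precision-weighted fusion of mean-field Gaussian posteriors; the fusion rule is a generic example, not necessarily the aggregation method the paper proposes.

```python
import numpy as np

def fedavg(means):
    """Deterministic-style aggregation: average the weight vectors (ignores uncertainty)."""
    return np.mean(means, axis=0)

def gaussian_fusion(means, variances):
    """Precision-weighted fusion of mean-field Gaussian posteriors: one common
    statistical aggregation rule for Bayesian models (illustrative only)."""
    precisions = 1.0 / np.asarray(variances)
    fused_var = 1.0 / precisions.sum(axis=0)
    fused_mean = fused_var * (precisions * np.asarray(means)).sum(axis=0)
    return fused_mean, fused_var

# Two clients: similar means but very different confidence per coordinate.
means = np.array([[0.2, 1.0], [0.4, -1.0]])
variances = np.array([[0.01, 10.0], [0.01, 0.01]])
print(fedavg(means))                      # naive average treats both clients equally
print(gaussian_fusion(means, variances))  # fusion trusts the low-variance estimates
```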
- Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning [22.31766292657812]
Mixed-precision quantization mostly predetermines the model bit-width settings before actual training.
We propose a novel Data Quality-aware Mixed-precision Quantization framework, dubbed DQMQ, to dynamically adapt quantization bit-widths to different data qualities.
arXiv Detail & Related papers (2023-02-09T06:14:00Z)
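DQMQ's bit-width decisions are learned with hybrid reinforcement learning; the toy sketch below only illustrates the underlying idea of conditioning precision on a data-quality signal. Both the quality proxy and the direction of the mapping (more bits for cleaner batches) are assumptions.

```python
import numpy as np

def quality_score(batch):
    """Hypothetical proxy for data quality: a signal-to-noise-style ratio of a batch."""
    x = np.asarray(batch, dtype=float)
    return float(np.abs(x.mean()) / (x.std() + 1e-8))

def bits_for_quality(score, low=4, high=8, threshold=1.0):
    """Assign more precision to higher-quality batches, less to noisier ones (assumed rule)."""
    return high if score >= threshold else low

clean = np.full(256, 2.0) + np.random.default_rng(0).normal(0, 0.1, 256)
noisy = np.full(256, 2.0) + np.random.default_rng(0).normal(0, 4.0, 256)
print(bits_for_quality(quality_score(clean)))  # -> 8
print(bits_for_quality(quality_score(noisy)))  # -> 4
```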
- Performance Optimization for Variable Bitwidth Federated Learning in Wireless Networks [103.22651843174471]
This paper considers improving wireless communication and computation efficiency in federated learning (FL) via model quantization.
In the proposed bitwidth FL scheme, edge devices train and transmit quantized versions of their local FL model parameters to a coordinating server, which aggregates them into a quantized global model and synchronizes the devices.
We show that the FL training process can be described as a Markov decision process and propose a model-based reinforcement learning (RL) method to optimize action selection over iterations.
arXiv Detail & Related papers (2022-09-21T08:52:51Z)
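The entry above casts bit-width selection during FL training as a Markov decision process solved with model-based RL. The sketch below sets up only the bookkeeping of such a formulation, a reward trading predicted accuracy gain against communication cost, and uses a greedy one-step choice as a stand-in for the learned policy; the reward shape and constants are assumptions.

```python
import numpy as np

def reward(accuracy_gain, bits, params, bandwidth_cost=1e-9):
    """Toy reward: accuracy improvement minus a penalty proportional to bits transmitted."""
    return accuracy_gain - bandwidth_cost * bits * params

def greedy_bitwidth(candidate_bits, predicted_gain, params):
    """Pick the bitwidth with the best predicted one-step reward. The paper learns this
    choice with model-based RL over the whole FL trajectory; greedy selection is used
    here purely for illustration."""
    rewards = [reward(predicted_gain(b), b, params) for b in candidate_bits]
    return candidate_bits[int(np.argmax(rewards))]

# Assumed accuracy model: diminishing returns as precision grows.
gain = lambda b: 0.02 * (1 - 2.0 ** -(b - 1))
print(greedy_bitwidth([2, 4, 8, 16], gain, params=1_000_000))  # -> 4 under these constants
```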
- Green, Quantized Federated Learning over Wireless Networks: An Energy-Efficient Design [68.86220939532373]
The finite precision level is captured through the use of quantized neural networks (QNNs) that quantize weights and activations in fixed-precision format.
The proposed FL framework can reduce energy consumption until convergence by up to 70% compared to a baseline FL algorithm.
arXiv Detail & Related papers (2022-07-19T16:37:24Z)
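Quantized neural networks of the kind referenced above keep weights and activations in a fixed-precision (fixed-point) format. The helper below is a standard signed fixed-point quantizer illustrating that representation; it is generic and says nothing about the paper's energy model.

```python
import numpy as np

def fixed_point_quantize(x, bits, frac_bits):
    """Quantize to signed fixed-point with `bits` total bits, `frac_bits` of them fractional."""
    step = 2.0 ** -frac_bits
    lo = -(2 ** (bits - 1)) * step
    hi = (2 ** (bits - 1) - 1) * step
    return np.clip(np.round(x / step) * step, lo, hi)

w = np.array([-1.7, -0.31, 0.0, 0.26, 0.9, 3.2])
print(fixed_point_quantize(w, bits=8, frac_bits=5))  # step 1/32, range [-4, 3.97]
print(fixed_point_quantize(w, bits=4, frac_bits=2))  # step 0.25, range [-2, 1.75]
```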
- Quantization Robust Federated Learning for Efficient Inference on Heterogeneous Devices [18.1568276196989]
Federated Learning (FL) is a paradigm to distributively learn machine learning models from decentralized data that remains on-device.
We introduce multiple variants of federated averaging algorithm that train neural networks robust to quantization.
Our results demonstrate that integrating quantization robustness results in FL models that are significantly more robust to different bit-widths during quantized on-device inference.
arXiv Detail & Related papers (2022-06-22T05:11:44Z)
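A common way to obtain models that stay accurate across inference bit-widths is to simulate quantization at a randomly drawn precision during local training. The sketch below shows that per-step mechanism only; the specific federated-averaging variants the paper introduces are not described in the summary, so this is an assumed, generic instantiation.

```python
import numpy as np

def simulate_quantization(w, bits):
    """Fake-quantize weights (uniform, symmetric); gradients would pass straight
    through this op in a full training loop."""
    scale = np.max(np.abs(w)) / (2.0 ** (bits - 1) - 1) + 1e-12
    return np.round(w / scale) * scale

def robust_local_step(w, grad_fn, bit_choices=(4, 6, 8), lr=0.1, rng=None):
    """One quantization-robust SGD step: draw a bit-width, compute the gradient at
    the fake-quantized weights, and update the full-precision copy."""
    rng = rng or np.random.default_rng()
    bits = int(rng.choice(bit_choices))
    g = grad_fn(simulate_quantization(w, bits))
    return w - lr * g

# Toy quadratic objective 0.5 * ||w - 1||^2
rng = np.random.default_rng(0)
w = rng.normal(size=8)
for _ in range(50):
    w = robust_local_step(w, lambda wq: wq - 1.0, rng=rng)
print(np.round(w, 2))  # close to 1 despite training under random precisions
```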
- AMED: Automatic Mixed-Precision Quantization for Edge Devices [3.5223695602582614]
Quantized neural networks are well known for reducing latency, power consumption, and model size without significant harm to performance.
Mixed-precision quantization offers better utilization of customized hardware that supports arithmetic operations at different bitwidths.
arXiv Detail & Related papers (2022-05-30T21:23:22Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate the intra-class variance to solve class-wise mode collapse.
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
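Data-free quantization methods along these lines calibrate the quantized model on synthetic inputs whose intermediate features are pushed toward class-wise reference statistics. The loss below sketches such an alignment objective; the clustering step, the exact statistics used, and the loss form are assumptions rather than ClusterQ's published recipe.

```python
import numpy as np

def classwise_stats(features, labels, num_classes):
    """Per-class mean and variance of a feature matrix (rows = samples)."""
    stats = {}
    for c in range(num_classes):
        f = features[labels == c]
        stats[c] = (f.mean(axis=0), f.var(axis=0))
    return stats

def alignment_loss(synth_feats, synth_labels, ref_stats):
    """Penalize the gap between synthetic-data feature statistics and per-class
    reference statistics, encouraging inter-class separability of synthetic features."""
    loss = 0.0
    for c, (ref_mu, ref_var) in ref_stats.items():
        f = synth_feats[synth_labels == c]
        if len(f) == 0:
            continue
        loss += np.sum((f.mean(axis=0) - ref_mu) ** 2) + np.sum((f.var(axis=0) - ref_var) ** 2)
    return loss

rng = np.random.default_rng(0)
feats, labels = rng.normal(size=(100, 16)), rng.integers(0, 4, 100)
ref = classwise_stats(feats, labels, 4)
print(alignment_loss(feats, labels, ref))  # 0.0 when the statistics match exactly
```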
- Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
arXiv Detail & Related papers (2021-09-05T15:15:07Z)
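One simple reading of the bit-drop idea, analogous to dropout's random masking, is to let each weight group randomly lose a bit of precision at every training step. The sketch below implements that reading; the paper's actual DropBits mechanism may differ.

```python
import numpy as np

def uniform_quantize(w, bits):
    """Uniform symmetric quantization of a weight array to the given bit-width."""
    scale = np.max(np.abs(w)) / (2.0 ** (bits - 1) - 1) + 1e-12
    return np.round(w / scale) * scale

def drop_bits(w_groups, bits=8, drop_prob=0.3, rng=None):
    """Dropout-like regularizer over precision: each weight group independently loses
    one bit with probability `drop_prob` during a training step (illustrative reading)."""
    rng = rng or np.random.default_rng()
    out = []
    for g in w_groups:
        b = bits - 1 if rng.random() < drop_prob else bits
        out.append(uniform_quantize(g, b))
    return out

rng = np.random.default_rng(0)
layers = [rng.normal(size=16), rng.normal(size=16)]
noisy_layers = drop_bits(layers, bits=4, drop_prob=0.5, rng=rng)
```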
- Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning [75.45968495410047]
Communication of model updates between client nodes and the central aggregating server is a major bottleneck in federated learning.
Gradient quantization is an effective way of reducing the number of bits required to communicate each model update.
We propose an adaptive quantization strategy called AdaFL that aims to achieve communication efficiency as well as a low error floor.
arXiv Detail & Related papers (2021-02-08T19:14:21Z)
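Adaptive update quantization typically spends few bits per round early in training and more bits later to push down the error floor. The linear schedule below is one simple instantiation of that idea, not necessarily AdaFL's strategy.

```python
import numpy as np

def quantize_update(delta, bits):
    """Uniform quantization of a model update before upload."""
    scale = np.max(np.abs(delta)) / (2.0 ** (bits - 1) - 1) + 1e-12
    return np.round(delta / scale) * scale

def adaptive_bits(round_idx, total_rounds, b_min=2, b_max=8):
    """Grow the bit budget over rounds: coarse updates early (cheap), fine updates late
    (low error floor). The schedule is an assumption, not the paper's exact rule."""
    frac = round_idx / max(total_rounds - 1, 1)
    return int(round(b_min + frac * (b_max - b_min)))

rng = np.random.default_rng(0)
delta = rng.normal(size=1000)
for t in (0, 50, 99):
    b = adaptive_bits(t, 100)
    err = np.mean((delta - quantize_update(delta, b)) ** 2)
    print(f"round {t:3d}: {b} bits, MSE {err:.2e}")
```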
- UVeQFed: Universal Vector Quantization for Federated Learning [179.06583469293386]
Federated learning (FL) is an emerging approach to train learning models without requiring the users to share their possibly private labeled data.
In FL, each user trains its copy of the learning model locally. The server then collects the individual updates and aggregates them into a global model.
We show that combining universal vector quantization methods with FL yields a decentralized training system in which the compression of the trained models induces only a minimum distortion.
arXiv Detail & Related papers (2020-06-05T07:10:22Z)
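Universal vector quantization in UVeQFed is built on subtractive dithered (lattice) quantization. The scalar special case below shows the core mechanism: client and server share a pseudo-random dither, so subtracting it at the server leaves a distortion that is bounded and independent of the update being compressed. The lattice/vector generalization and the FL integration are not shown.

```python
import numpy as np

def dithered_encode(x, step, seed):
    """Client side: add a shared uniform dither, then uniformly quantize."""
    dither = np.random.default_rng(seed).uniform(-step / 2, step / 2, size=x.shape)
    return np.round((x + dither) / step).astype(np.int64)

def dithered_decode(codes, step, seed):
    """Server side: regenerate the same dither from the shared seed and subtract it."""
    dither = np.random.default_rng(seed).uniform(-step / 2, step / 2, size=codes.shape)
    return codes * step - dither

rng = np.random.default_rng(1)
update = rng.normal(size=10_000)
step, seed = 0.05, 42
recovered = dithered_decode(dithered_encode(update, step, seed), step, seed)
print(np.max(np.abs(recovered - update)) <= step / 2 + 1e-12)  # True: distortion bounded by step/2
```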
This list is automatically generated from the titles and abstracts of the papers on this site.