FlexTrain: A Dynamic Training Framework for Heterogeneous Devices
Environments
- URL: http://arxiv.org/abs/2310.20457v2
- Date: Thu, 23 Nov 2023 09:58:55 GMT
- Title: FlexTrain: A Dynamic Training Framework for Heterogeneous Devices
Environments
- Authors: Mert Unsal, Ali Maatouk, Antonio De Domenico, Nicola Piovesan, Fadhel
Ayed
- Abstract summary: FlexTrain is a framework that accommodates the diverse storage and computational resources available on different devices during the training phase.
We demonstrate the effectiveness of FlexTrain on the CIFAR-100 dataset, where a single global model trained with FlexTrain can be easily deployed on heterogeneous devices.
- Score: 12.165263783903216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As deep learning models become increasingly large, they pose significant
challenges in heterogeneous device environments. The size of deep learning
models makes it difficult to deploy them on low-power or resource-constrained
devices, leading to long inference times and high energy consumption. To
address these challenges, we propose FlexTrain, a framework that accommodates
the diverse storage and computational resources available on different devices
during the training phase. FlexTrain enables efficient deployment of deep
learning models, while respecting device constraints, minimizing communication
costs, and ensuring seamless integration with diverse devices. We demonstrate
the effectiveness of FlexTrain on the CIFAR-100 dataset, where a single global
model trained with FlexTrain can be easily deployed on heterogeneous devices,
saving training time and energy consumption. We also extend FlexTrain to the
federated learning setting, showing that our approach outperforms standard
federated learning benchmarks on both CIFAR-10 and CIFAR-100 datasets.
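The abstract does not describe the mechanism by which one global model adapts to different device budgets. As a purely illustrative sketch (an assumption, not FlexTrain's documented design), one common way to realize this is a depth-sliced network with an exit head after every block, so a constrained device runs only a prefix of the shared model:

```python
# Illustrative sketch only: the FlexTrain abstract does not specify the mechanism.
# One global model with early-exit heads lets a weak device run a short prefix and
# a strong device run the full depth. All module and parameter names are hypothetical.
import torch
import torch.nn as nn

class DepthFlexibleNet(nn.Module):
    def __init__(self, num_classes=100, width=128, num_blocks=4):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, width, 3, padding=1), nn.ReLU())
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Conv2d(width, width, 3, padding=1), nn.ReLU())
            for _ in range(num_blocks)
        )
        # One classifier head per exit point, so any prefix of the network is usable.
        self.heads = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(width, num_classes))
            for _ in range(num_blocks)
        )

    def forward(self, x, depth=None):
        depth = depth or len(self.blocks)        # full model by default
        h = self.stem(x)
        for block in self.blocks[:depth]:
            h = block(h)
        return self.heads[depth - 1](h)

model = DepthFlexibleNet()
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 100, (8,))

# Training the single global model: supervise every exit so every prefix stays usable.
loss = sum(nn.functional.cross_entropy(model(x, depth=d), y)
           for d in range(1, len(model.blocks) + 1))
loss.backward()

# Deployment: a weak device keeps only the first block and head, a strong one keeps all.
weak_logits = model(x, depth=1)
strong_logits = model(x, depth=4)
```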
Related papers
- Flextron: Many-in-One Flexible Large Language Model [85.93260172698398]
We introduce Flextron, a network architecture and post-training model optimization framework supporting flexible model deployment.
We present a sample-efficient training method and associated routing algorithms for transforming an existing trained LLM into a Flextron model.
We demonstrate superior performance over multiple end-to-end trained variants and other state-of-the-art elastic networks, all with a single pretraining run that consumes a mere 7.63% of the tokens used in the original pretraining.
arXiv Detail & Related papers (2024-06-11T01:16:10Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting growing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally exploits a parameter server and a large number of edge devices during the whole process of the model training.
We propose TEASQ-Fed, which lets edge devices participate asynchronously in the training process by actively applying for tasks.
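The title points to sparsification and quantization of client updates; the summary gives no details, so the following is only a generic sketch of that idea (top-k selection followed by 8-bit uniform quantization), with every parameter assumed rather than taken from the paper:

```python
# Generic sparsify-then-quantize sketch; not TEASQ-Fed's actual scheme.
import numpy as np

def compress_update(update, k_ratio=0.1, num_bits=8):
    """Keep the top-k entries by magnitude, then uniformly quantize them."""
    flat = update.ravel()
    k = max(1, int(k_ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]          # indices of the k largest entries
    values = flat[idx]
    scale = np.abs(values).max() / (2 ** (num_bits - 1) - 1) or 1.0
    q = np.round(values / scale).astype(np.int8)          # low-bit codes sent to the server
    return idx, q, scale, update.shape

def decompress_update(idx, q, scale, shape):
    flat = np.zeros(np.prod(shape))
    flat[idx] = q.astype(np.float32) * scale              # dequantize and scatter back
    return flat.reshape(shape)

delta = np.random.randn(256, 10)                          # a client's local model update
packed = compress_update(delta)
restored = decompress_update(*packed)                     # server-side approximation of delta
```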
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- Speed Up Federated Learning in Heterogeneous Environment: A Dynamic Tiering Approach [5.504000607257414]
Federated learning (FL) enables collaboratively training a model while keeping the training data decentralized and private.
One significant impediment to training a model using FL, especially large models, is the resource constraints of devices with heterogeneous computation and communication capacities as well as varying task sizes.
We propose the Dynamic Tiering-based Federated Learning (DTFL) system where slower clients dynamically offload part of the model to the server to alleviate resource constraints and speed up training.
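A minimal single-process sketch of the offloading idea in this summary, in the style of split training: the client computes up to an assumed cut layer, the server finishes the forward/backward pass and returns the activation gradient. The layer sizes and cut point below are illustrative assumptions, not DTFL's actual configuration:

```python
# Split-training sketch of model offloading; sizes and cut point are assumptions.
import torch
import torch.nn as nn

client_part = nn.Sequential(nn.Linear(32, 64), nn.ReLU())                      # stays on the device
server_part = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))   # offloaded layers

x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))

# Client-side forward up to the cut layer; the activation is what gets transmitted.
smashed = client_part(x)
smashed_sent = smashed.detach().requires_grad_()          # stand-in for the network hop

# Server-side forward/backward on the remaining layers.
loss = nn.functional.cross_entropy(server_part(smashed_sent), y)
loss.backward()

# The server returns the gradient of the activation; the client finishes backprop locally.
smashed.backward(smashed_sent.grad)
```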
arXiv Detail & Related papers (2023-12-09T19:09:19Z)
- FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement [19.639936387834677]
Mixture-of-Experts (MoEs) are becoming more popular and have demonstrated impressive pretraining scalability in various downstream tasks.
MoEs are becoming a new data analytics paradigm in the data life cycle and face unique challenges at scales, complexities, and granularities never seen before.
In this paper, we propose a novel DNN training framework, FlexMoE, which systematically and transparently addresses the inefficiency caused by dynamic dataflow.
arXiv Detail & Related papers (2023-04-08T07:34:26Z)
- FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training [13.072061144206097]
We introduce FedDCT, a novel distributed learning paradigm that enables the usage of large, high-performance CNNs on resource-limited edge devices.
We conduct experiments on standard datasets, including CIFAR-10, CIFAR-100, and two real-world medical datasets, HAM10000 and VAIPE.
Compared to other existing approaches, FedDCT achieves higher accuracy and substantially reduces the number of communication rounds.
arXiv Detail & Related papers (2022-11-20T11:11:56Z)
- ZeroFL: Efficient On-Device Training for Federated Learning with Local Sparsity [15.908499928588297]
In Federated Learning (FL), nodes are orders of magnitude more constrained than traditional server-grade hardware.
We propose ZeroFL, a framework that relies on highly sparse operations to accelerate on-device training.
arXiv Detail & Related papers (2022-08-04T07:37:07Z)
- Efficient Split-Mix Federated Learning for On-Demand and In-Situ Customization [107.72786199113183]
Federated learning (FL) provides a distributed learning framework for multiple participants to collaborate learning without sharing raw data.
In this paper, we propose a novel Split-Mix FL strategy for heterogeneous participants that, once training is done, provides in-situ customization of model sizes and robustness.
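As a rough, assumption-laden illustration of in-situ size customization: several narrow base models are maintained, and a device combines as many of them as its budget allows. This simplifies away the actual Split-Mix training and mixing procedure:

```python
# Simplified size-customization sketch; not the exact Split-Mix algorithm.
import torch
import torch.nn as nn

def make_base():
    return nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))

bases = nn.ModuleList(make_base() for _ in range(4))    # narrow models trained jointly during FL

def predict(x, budget):
    # A device with capacity for `budget` base models averages their outputs.
    return torch.stack([bases[i](x) for i in range(budget)]).mean(dim=0)

x = torch.randn(8, 32)
small_device_out = predict(x, budget=1)   # cheapest configuration
large_device_out = predict(x, budget=4)   # full-size configuration
```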
arXiv Detail & Related papers (2022-03-18T04:58:34Z)
- Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better [88.28293442298015]
Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.
We develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST).
FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network.
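A hedged sketch of mask-based sparse sub-network training in the spirit of the summary above; the random masks here are placeholders, and FedDST's actual mask selection and periodic readjustment rules are not reproduced:

```python
# Mask-based sparse sub-network training sketch; masks and sparsity level are assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
sparsity = 0.8  # assumed fraction of weights frozen at zero on the device

# One binary mask per weight tensor defines the sparse sub-network.
masks = {n: (torch.rand_like(p) > sparsity).float()
         for n, p in model.named_parameters() if p.dim() > 1}

def enforce_masks():
    with torch.no_grad():
        for n, p in model.named_parameters():
            if n in masks:
                p.mul_(masks[n])                 # keep pruned weights exactly zero

opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))

enforce_masks()
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
for n, p in model.named_parameters():
    if n in masks:
        p.grad.mul_(masks[n])                   # train only the unmasked weights
opt.step()
enforce_masks()
```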
arXiv Detail & Related papers (2021-12-18T02:26:38Z)
- Efficient Device Scheduling with Multi-Job Federated Learning [64.21733164243781]
We propose a novel multi-job Federated Learning framework to enable the parallel training process of multiple jobs.
We propose a reinforcement learning-based method and a Bayesian optimization-based method to schedule devices for multiple jobs while minimizing the cost.
Our proposed approaches significantly outperform baseline approaches in terms of training time (up to 8.67 times faster) and accuracy (up to 44.6% higher).
arXiv Detail & Related papers (2021-12-11T08:05:11Z)
- To Talk or to Work: Flexible Communication Compression for Energy Efficient Federated Learning over Heterogeneous Mobile Edge Devices [78.38046945665538]
Federated learning (FL) over massive mobile edge devices opens new horizons for numerous intelligent mobile applications.
FL imposes huge communication and computation burdens on participating devices due to periodical global synchronization and continuous local training.
We develop a convergence-guaranteed FL algorithm enabling flexible communication compression.
arXiv Detail & Related papers (2020-12-22T02:54:18Z)