Training Mixed-Domain Translation Models via Federated Learning
- URL: http://arxiv.org/abs/2205.01557v1
- Date: Tue, 3 May 2022 15:16:51 GMT
- Title: Training Mixed-Domain Translation Models via Federated Learning
- Authors: Peyman Passban, Tanya Roosta, Rahul Gupta, Ankit Chadha, Clement Chung
- Abstract summary: In this work, we leverage federated learning (FL) in order to tackle the problem of training mixed-domain translation models.
With slight modifications in the training process, neural machine translation (NMT) engines can be easily adapted when an FL-based aggregation is applied to fuse different domains.
We propose a novel technique to dynamically control the communication bandwidth by selecting impactful parameters during FL updates.
- Score: 16.71888086947849
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training mixed-domain translation models is a complex task that demands
tailored architectures and costly data preparation techniques. In this work, we
leverage federated learning (FL) in order to tackle the problem. Our
investigation demonstrates that with slight modifications in the training
process, neural machine translation (NMT) engines can be easily adapted when an
FL-based aggregation is applied to fuse different domains. Experimental results
also show that engines built via FL are able to perform on par with
state-of-the-art baselines that rely on centralized training techniques. We
evaluate our hypothesis in the presence of five datasets with different sizes,
from different domains, to translate from German into English and discuss how
FL and NMT can mutually benefit from each other. In addition to providing
benchmarking results on the union of FL and NMT, we also propose a novel
technique to dynamically control the communication bandwidth by selecting
impactful parameters during FL updates. This is a significant achievement
considering the large size of NMT engines that need to be exchanged between FL
parties.
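The abstract does not include an implementation, but the two technical ideas it names, FL-based aggregation that fuses domain-specific NMT clients and bandwidth control by exchanging only impactful parameters, can be illustrated with a short sketch. The snippet below is a minimal, hypothetical PyTorch-style sketch, not the authors' code; names such as `select_impactful_update` and `top_k_fraction`, and the use of parameter-delta magnitude as the "impact" criterion, are assumptions made purely for illustration.

```python
# Hypothetical sketch of (1) sending only the most-changed parameters and
# (2) FedAvg-style fusion of domain-specific NMT clients. Not the paper's code.
from typing import Dict, List
import torch


def select_impactful_update(old: Dict[str, torch.Tensor],
                            new: Dict[str, torch.Tensor],
                            top_k_fraction: float = 0.1) -> Dict[str, torch.Tensor]:
    """Keep only the fraction of each tensor whose values changed the most
    during local training; all other entries of the delta are zeroed out."""
    sparse_update = {}
    for name, new_param in new.items():
        delta = new_param - old[name]
        k = max(1, int(top_k_fraction * delta.numel()))
        # Magnitude of change used here as a simple proxy for "impact".
        threshold = delta.abs().flatten().topk(k).values.min()
        mask = delta.abs() >= threshold
        sparse_update[name] = delta * mask
    return sparse_update


def aggregate(global_params: Dict[str, torch.Tensor],
              client_updates: List[Dict[str, torch.Tensor]],
              client_sizes: List[int]) -> Dict[str, torch.Tensor]:
    """FedAvg-style fusion: average the clients' (domain-specific) deltas,
    weighted by local training-set size, and apply them to the global model."""
    total = float(sum(client_sizes))
    fused = {}
    for name, g in global_params.items():
        weighted = sum((n / total) * upd[name]
                       for n, upd in zip(client_sizes, client_updates))
        fused[name] = g + weighted
    return fused
```

In a real deployment the selected deltas would be transmitted as sparse (index, value) pairs so the bandwidth saving is actually realized; the dense masked tensors above are kept only to keep the example short.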
Related papers
- Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning? [50.03434441234569]
Federated Learning (FL) has gained significant popularity due to its effectiveness in training machine learning models across diverse sites without requiring direct data sharing.
While various algorithms have shown that FL with local updates is a communication-efficient distributed learning framework, the generalization performance of FL with local updates has received comparatively less attention.
arXiv Detail & Related papers (2024-09-05T19:00:18Z)
- SpaFL: Communication-Efficient Federated Learning with Sparse Models and Low Computational Overhead [75.87007729801304]
SpaFL, a communication-efficient FL framework, is proposed to optimize sparse model structures with low computational overhead.
Experiments show that SpaFL improves accuracy while requiring much less communication and computing resources compared to sparse baselines.
arXiv Detail & Related papers (2024-06-01T13:10:35Z) - Only Send What You Need: Learning to Communicate Efficiently in
Federated Multilingual Machine Translation [19.28500206536013]
Federated learning (FL) is a promising approach for solving multilingual tasks.
We propose a meta-learning-based adaptive parameter selection methodology, MetaSend, that improves the communication efficiency of model transmissions.
We demonstrate that MetaSend obtains substantial improvements over baselines in translation quality in the presence of a limited communication budget.
arXiv Detail & Related papers (2024-01-15T04:04:26Z) - A Survey on Efficient Federated Learning Methods for Foundation Model Training [62.473245910234304]
Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients.
With the advent of Foundation Models (FMs), however, the reality is different for many deep learning applications.
We discuss the benefits and drawbacks of parameter-efficient fine-tuning (PEFT) for FL applications.
arXiv Detail & Related papers (2024-01-09T10:22:23Z) - SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from the distributed, private data of edge clients for fine-tuning.
We propose a method called SLoRA, which overcomes the key limitations of LoRA in high heterogeneous data scenarios.
Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
arXiv Detail & Related papers (2023-08-12T10:33:57Z) - Automated Federated Learning in Mobile Edge Networks -- Fast Adaptation
and Convergence [83.58839320635956]
Federated Learning (FL) can be used in mobile edge networks to train machine learning models in a distributed manner.
Recently, FL has been interpreted within a Model-Agnostic Meta-Learning (MAML) framework, which brings FL significant advantages in fast adaptation and convergence over heterogeneous datasets.
This paper addresses how much benefit MAML brings to FL and how to maximize such benefit over mobile edge networks.
arXiv Detail & Related papers (2023-03-23T02:42:10Z) - When Federated Learning Meets Pre-trained Language Models'
Parameter-Efficient Tuning Methods [22.16636947999123]
We introduce various parameter-efficient tuning (PETuning) methods into federated learning.
Specifically, we provide a holistic empirical study of representative PLM tuning methods in FL.
Overall communication overhead can be significantly reduced by locally tuning and globally aggregating lightweight model parameters.
arXiv Detail & Related papers (2022-12-20T06:44:32Z) - Efficient Split-Mix Federated Learning for On-Demand and In-Situ
Customization [107.72786199113183]
Federated learning (FL) provides a distributed learning framework for multiple participants to collaborate learning without sharing raw data.
In this paper, we propose a novel Split-Mix FL strategy for heterogeneous participants that, once training is done, provides in-situ customization of model sizes and robustness.
arXiv Detail & Related papers (2022-03-18T04:58:34Z) - Communication-Efficient Federated Learning for Neural Machine
Translation [1.5362025549031046]
Training neural machine translation (NMT) models in federated learning (FL) settings could be inefficient both computationally and communication-wise.
In this paper, we explore how to efficiently build NMT models in an FL setup by proposing a novel solution.
In order to reduce the communication overhead, out of all neural layers we only exchange what we term "Controller" layers.
arXiv Detail & Related papers (2021-12-12T03:16:03Z) - Joint Superposition Coding and Training for Federated Learning over
Multi-Width Neural Networks [52.93232352968347]
This paper aims to integrate two synergetic technologies, federated learning (FL) and width-adjustable slimmable neural networks (SNNs).
FL preserves data privacy by exchanging the locally trained models of mobile devices. SNNs, however, are non-trivial to train, particularly under wireless connections with time-varying channel conditions.
We propose a communication and energy-efficient SNN-based FL (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models.
arXiv Detail & Related papers (2021-12-05T11:17:17Z) - FedHe: Heterogeneous Models and Communication-Efficient Federated
Learning [0.0]
Federated learning (FL) is able to manage edge devices to cooperatively train a model while keeping the training data local and private.
We propose a novel FL method, called FedHe, inspired by knowledge distillation, which can train heterogeneous models and support asynchronous training processes.
arXiv Detail & Related papers (2021-10-19T12:18:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.