Achieving Efficient Distributed Machine Learning Using a Novel
Non-Linear Class of Aggregation Functions
- URL: http://arxiv.org/abs/2201.12488v1
- Date: Sat, 29 Jan 2022 03:43:26 GMT
- Title: Achieving Efficient Distributed Machine Learning Using a Novel
Non-Linear Class of Aggregation Functions
- Authors: Haizhou Du, Ryan Yang, Yijian Chen, Qiao Xiang, Andre Wibisono, Wei
Huang
- Abstract summary: Distributed machine learning (DML) over time-varying networks can be an enabler for emerging decentralized ML applications.
We propose a novel non-linear class of model aggregation functions to achieve efficient DML over time-varying networks.
- Score: 9.689867512720083
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Distributed machine learning (DML) over time-varying networks can be an
enabler for emerging decentralized ML applications such as autonomous driving
and drone fleeting. However, the commonly used weighted arithmetic mean model
aggregation function in existing DML systems can result in high model loss, low
model accuracy, and slow convergence speed over time-varying networks. To
address this issue, in this paper, we propose a novel non-linear class of model
aggregation functions to achieve efficient DML over time-varying networks.
Instead of taking a linear aggregation of neighboring models as most existing
studies do, our mechanism uses a nonlinear aggregation, a weighted power-p mean
(WPM) where p is a positive odd integer, as the aggregation function of local
models from neighbors. The subsequent optimizing steps are taken using mirror
descent defined by a Bregman divergence that maintains convergence to
optimality. In this paper, we analyze properties of the WPM and rigorously
prove convergence properties of our aggregation mechanism. Additionally,
through extensive experiments, we show that when p > 1, our design
significantly improves the convergence speed of the model and the scalability
of DML under time-varying networks compared with arithmetic mean aggregation
functions, with little additional 26computation overhead.
Related papers
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - Diffusion-Model-Assisted Supervised Learning of Generative Models for
Density Estimation [10.793646707711442]
We present a framework for training generative models for density estimation.
We use the score-based diffusion model to generate labeled data.
Once the labeled data are generated, we can train a simple fully connected neural network to learn the generative model in the supervised manner.
arXiv Detail & Related papers (2023-10-22T23:56:19Z) - Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data.
Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction.
Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z) - Efficient Interpretable Nonlinear Modeling for Multiple Time Series [5.448070998907116]
This paper proposes an efficient nonlinear modeling approach for multiple time series.
It incorporates nonlinear interactions among different time-series variables.
Experimental results show that the proposed algorithm improves the identification of the support of the VAR coefficients in a parsimonious manner.
arXiv Detail & Related papers (2023-09-29T11:42:59Z) - Generalised Latent Assimilation in Heterogeneous Reduced Spaces with
Machine Learning Surrogate Models [10.410970649045943]
We develop a system which combines reduced-order surrogate models with a novel data assimilation technique.
Generalised Latent Assimilation can benefit both the efficiency provided by the reduced-order modelling and the accuracy of data assimilation.
arXiv Detail & Related papers (2022-04-07T15:13:12Z) - Parallel Successive Learning for Dynamic Distributed Model Training over
Heterogeneous Wireless Networks [50.68446003616802]
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices.
We develop parallel successive learning (PSL), which expands the FedL architecture along three dimensions.
Our analysis sheds light on the notion of cold vs. warmed up models, and model inertia in distributed machine learning.
arXiv Detail & Related papers (2022-02-07T05:11:01Z) - Dynamic Gaussian Mixture based Deep Generative Model For Robust
Forecasting on Sparse Multivariate Time Series [43.86737761236125]
We propose a novel generative model, which tracks the transition of latent clusters, instead of isolated feature representations.
It is characterized by a newly designed dynamic Gaussian mixture distribution, which captures the dynamics of clustering structures.
A structured inference network is also designed for enabling inductive analysis.
arXiv Detail & Related papers (2021-03-03T04:10:07Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces
Assisted Wireless Networks [82.33619654835348]
Intelligent surface (IRS) has been employed to reshape the wireless channels by controlling individual scattering elements' phase shifts.
Due to the large size of scattering elements, the passive beamforming is typically challenged by the high computational complexity.
In this article, we focus on machine learning (ML) approaches for performance in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized Structural equation models (SEMs)
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using a gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network.
Our model requires a much less number of communication rounds and still a number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - A Generative Learning Approach for Spatio-temporal Modeling in Connected
Vehicular Network [55.852401381113786]
This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive-temporal quality framework for wireless access latency of connected vehicles.
LaMI adopts the idea from image inpainting and synthesizing and can reconstruct the missing latency samples by a two-step procedure.
In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Varienational Autocoder (VAE)
arXiv Detail & Related papers (2020-03-16T03:43:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.