Related papers: AET-SGD: Asynchronous Event-triggered Stochastic Gradient Descent

URL: http://arxiv.org/abs/2112.13935v1
Date: Mon, 27 Dec 2021 23:20:04 GMT
Title: AET-SGD: Asynchronous Event-triggered Stochastic Gradient Descent
Authors: Nhuong Nguyen, Song Han
Abstract summary: Communication cost is the main bottleneck for the design of effective distributed learning algorithms. We propose an Asynchronous Event-triggered Stochastic Gradient Descent (SGD) framework, called AET-SGD, to reduce the communication cost among the compute nodes. We show that AET-SGD can resist large delays from straggler nodes while obtaining decent performance and a desired speedup ratio.
Score: 10.029039979947798
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Communication cost is the main bottleneck for the design of effective distributed learning algorithms. Recently, event-triggered techniques have been proposed to reduce the information exchanged among compute nodes and thus alleviate the communication cost. However, most existing event-triggered approaches only consider heuristic event-triggered thresholds. They also ignore the impact of computation and network delay, which play an important role in training performance. In this paper, we propose an Asynchronous Event-triggered Stochastic Gradient Descent (SGD) framework, called AET-SGD, to i) reduce the communication cost among the compute nodes, and ii) mitigate the impact of the delay. Compared with baseline event-triggered methods, AET-SGD employs a linearly increasing sample-size event-triggered threshold, and can significantly reduce the communication cost while keeping good convergence performance. We implement AET-SGD and evaluate its performance on multiple representative data sets, including MNIST, FashionMNIST, KMNIST and CIFAR10. The experimental results validate the correctness of the design and show a significant communication cost reduction of 44x to 120x compared to the state of the art. Our results also show that AET-SGD can resist large delays from straggler nodes while obtaining decent performance and a desired speedup ratio.
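To make the "linearly increasing sample-size threshold" idea concrete, the sketch below counts how many communication rounds a single worker would need to complete a fixed budget of local SGD iterations under two schedules: a constant number of iterations per round, and a number that grows linearly with the round index. This is only one reading of the abstract, not the authors' implementation; the function names, the schedule `10 * (i + 1)`, and the iteration budget are hypothetical choices made for illustration, and the resulting ratio is not meant to reproduce the 44x to 120x figure reported in the paper.

```python
def count_rounds(total_iters, sample_size):
    """Count communication rounds needed to complete total_iters local steps.

    sample_size(i) is the number of local iterations a worker performs in
    round i (0-indexed) before the communication event is triggered.
    """
    done, rounds = 0, 0
    while done < total_iters:
        done += sample_size(rounds)
        rounds += 1
    return rounds


def constant_schedule(i):
    # Hypothetical baseline: communicate after every 10 local iterations.
    return 10


def linear_schedule(i):
    # Hypothetical linearly increasing threshold: 10, 20, 30, ...
    return 10 * (i + 1)


total = 10_000  # assumed iteration budget, for illustration only
const_rounds = count_rounds(total, constant_schedule)
linear_rounds = count_rounds(total, linear_schedule)
print(const_rounds, linear_rounds)
```

Because the iterations completed under a linear schedule grow quadratically with the number of rounds, the round count grows roughly with the square root of the iteration budget rather than linearly, which is where the communication saving comes from in this toy setting.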

Related papers

The Impact Analysis of Delays in Asynchronous Federated Learning with Data Heterogeneity for Edge Intelligence [10.54196990763149]
Federated learning (FL) has provided a new methodology for coordinating a group of clients to train a machine learning model collaboratively. This paper examines the impact of unknown causes of delay on training performance in an Asynchronous Federated Learning (AFL) system with data heterogeneity.
arXiv Detail & Related papers (2025-03-06T03:10:49Z)
Communication-Efficient Federated Learning by Quantized Variance Reduction for Heterogeneous Wireless Edge Networks [55.467288506826755]
Federated learning (FL) has been recognized as a viable solution for local-privacy-aware collaborative model training in wireless edge networks. Most existing communication-efficient FL algorithms fail to reduce the significant inter-device variance. We propose a novel communication-efficient FL algorithm, named FedQVR, which relies on a sophisticated variance-reduced scheme.
arXiv Detail & Related papers (2025-01-20T04:26:21Z)
Towards Effective, Efficient and Unsupervised Social Event Detection in the Hyperbolic Space [54.936897625837474]
This work introduces an unsupervised framework, HyperSED (Hyperbolic SED). Specifically, the framework first models social messages into semantic-based message anchors, and then leverages the structure of the anchor graph. Experiments on public datasets demonstrate HyperSED's competitive performance, along with a substantial improvement in efficiency.
arXiv Detail & Related papers (2024-12-14T06:55:27Z)
Towards Resource-Efficient Federated Learning in Industrial IoT for Multivariate Time Series Analysis [50.18156030818883]
Anomalies and missing data constitute a thorny problem in industrial applications. Deep-learning-enabled anomaly detection has emerged as a critical direction. The data collected on edge devices contain private user information.
arXiv Detail & Related papers (2024-11-06T15:38:31Z)
ORCHID: Streaming Threat Detection over Versioned Provenance Graphs [11.783370157959968]
We present ORCHID, a novel Prov-IDS that performs fine-grained detection of process-level threats over a real-time event stream. ORCHID takes advantage of the unique immutable properties of versioned provenance graphs to iteratively embed the entire graph in a sequential RNN model. We evaluate ORCHID on four public datasets, including DARPA TC, to show that ORCHID can provide competitive classification performance.
arXiv Detail & Related papers (2024-08-23T19:44:40Z)
TCGPN: Temporal-Correlation Graph Pre-trained Network for Stock Forecasting [1.864621482724548]
We propose a novel approach called the Temporal-Correlation Graph Pre-trained Network (TCGPN) to address these limitations. TCGPN utilizes a temporal-correlation fusion encoder to get a mixed representation, and a pre-training method with carefully designed temporal and correlation pre-training tasks. Experiments are conducted on the real stock market data sets CSI300 and CSI500, which exhibit minimal periodicity.
arXiv Detail & Related papers (2024-07-26T05:27:26Z)
Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework [56.82432591933544]
Distributed stochastic gradient descent (SGD) has attracted considerable recent attention due to its potential for scaling computational resources, reducing training time, and helping protect user privacy in machine learning. This paper presents the run time and staleness of distributed SGD based on stochastic delay differential equations (SDDEs) and the approximation of gradient arrivals. Interestingly, it is shown that increasing the number of activated workers does not necessarily accelerate distributed SGD due to staleness.
arXiv Detail & Related papers (2024-06-17T02:56:55Z)
PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce. We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD. Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
Semantic Communication Enabling Robust Edge Intelligence for Time-Critical IoT Applications [87.05763097471487]
This paper aims to design robust Edge Intelligence using semantic communication for time-critical IoT applications. We analyze the effect of image DCT coefficients on inference accuracy and propose the channel-agnostic effectiveness encoding for offloading.
arXiv Detail & Related papers (2022-11-24T20:13:17Z)
OFedQIT: Communication-Efficient Online Federated Learning via Quantization and Intermittent Transmission [7.6058140480517356]
Online federated learning (OFL) is a promising framework to collaboratively learn a sequence of non-linear functions (or models) from distributed streaming data. We propose a communication-efficient OFL algorithm (named OFedQIT) by means of quantization and intermittent transmission. Our analysis reveals that OFedQIT successfully addresses the drawbacks of OFedAvg while maintaining superior learning accuracy.
arXiv Detail & Related papers (2022-05-13T07:46:43Z)
AEGNN: Asynchronous Event-based Graph Neural Networks [54.528926463775946]
Event-based Graph Neural Networks generalize standard GNNs to process events as "evolving" spatio-temporal graphs. AEGNNs are easily trained on synchronous inputs and can be converted to efficient, "asynchronous" networks at test time.
arXiv Detail & Related papers (2022-03-31T16:21:12Z)
Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC). We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them with a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers [22.59875034596411]
We present FedAT, a novel Federated learning method with Asynchronous Tiers under Non-i.i.d. data. FedAT minimizes the straggler effect with improved convergence speed and test accuracy. Results show that FedAT improves the prediction performance by up to 21.09%, and reduces the communication cost by up to 8.5x, compared to state-of-the-art FL methods.
arXiv Detail & Related papers (2020-10-12T18:38:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.