Related papers: Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices

URL: http://arxiv.org/abs/2406.14185v1
Date: Thu, 20 Jun 2024 10:43:53 GMT
Title: Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices
Authors: Li Wang, Liang Li, Lianming Xu, Xian Peng, Aiguo Fei
Abstract summary: We present RoCoIn, a robust cooperative inference mechanism for locally distributed execution of deep neural network-based inference tasks over heterogeneous edge devices. It creates a set of independent and compact student models that are learned from a large model using knowledge distillation for distributed deployment. In particular, the devices are strategically grouped to redundantly deploy and execute the same student model such that the inference process is resilient to any local failures.
Score: 9.423705897088672
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The distributed inference paradigm enables the computation workload to be distributed across multiple devices, facilitating the implementation of deep learning-based intelligent services in extremely resource-constrained Internet of Things (IoT) scenarios. Yet it raises great challenges to perform complicated inference tasks relying on a cluster of IoT devices that are heterogeneous in their computing/communication capacity and prone to crash or timeout failures. In this paper, we present RoCoIn, a robust cooperative inference mechanism for locally distributed execution of deep neural network-based inference tasks over heterogeneous edge devices. It creates a set of independent and compact student models that are learned from a large model using knowledge distillation for distributed deployment. In particular, the devices are strategically grouped to redundantly deploy and execute the same student model such that the inference process is resilient to any local failures, while a joint knowledge partition and student model assignment scheme is designed to minimize the response latency of the distributed inference system in the presence of devices with diverse capacities. Extensive simulations are conducted to corroborate the superior performance of our RoCoIn for distributed inference compared to several baselines, and the results demonstrate its efficacy in timely inference and failure resiliency.

Related papers

Exploiting Edge Features for Transferable Adversarial Attacks in Distributed Machine Learning [54.26807397329468]
This work explores a previously overlooked vulnerability in distributed deep learning systems. An adversary who intercepts the intermediate features transmitted between them can still pose a serious threat. We propose an exploitation strategy specifically designed for distributed settings.
arXiv Detail & Related papers (2025-07-09T20:09:00Z)
Multimodal Online Federated Learning with Modality Missing in Internet of Things [22.814768356671276]
Internet of Things (IoT) ecosystem generates vast amounts of multimodal data from heterogeneous sources such as sensors, cameras, and microphones. As edge intelligence continues to evolve, IoT devices have progressed from simple data collection units to nodes capable of executing complex computational tasks. We introduce the concept of Multimodal Online Federated Learning (MMO-FL), a novel framework designed for dynamic and decentralized multimodal learning in IoT environments.
arXiv Detail & Related papers (2025-05-22T02:31:37Z)
The Larger the Merrier? Efficient Large AI Model Inference in Wireless Edge Networks [56.37880529653111]
The demand for large AI model (LAIM) services is driving a paradigm shift from traditional cloud-based inference to edge-based inference for low-latency, privacy-preserving applications. In this paper, we investigate the LAIM-inference scheme, where a pre-trained LAIM is pruned and partitioned into on-device and on-server sub-models for deployment.
arXiv Detail & Related papers (2025-05-14T08:18:55Z)
Resilient Peer-to-peer Learning based on Adaptive Aggregation [0.5530212768657544]
Collaborative learning in peer-to-peer networks offers the benefits of learning while mitigating single points of failure. Adversarial workers pose potential threats by attempting to inject malicious information into the network. This paper introduces a resilient aggregation technique aimed at fostering similarity learning processes.
arXiv Detail & Related papers (2025-01-08T16:47:45Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents. Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
Effective Intrusion Detection in Heterogeneous Internet-of-Things Networks via Ensemble Knowledge Distillation-based Federated Learning [52.6706505729803]
We introduce Federated Learning (FL) to collaboratively train a decentralized shared model of Intrusion Detection Systems (IDS). FLEKD enables a more flexible aggregation method than conventional model fusion techniques. Experiment results show that the proposed approach outperforms local training and traditional FL in terms of both speed and performance.
arXiv Detail & Related papers (2024-01-22T14:16:37Z)
Robust Collaborative Inference with Vertically Split Data Over Dynamic Device Environments [15.757660512833006]
In safety-critical applications, collaborative inference must be robust to significant network failures caused by environmental disruptions or extreme weather. We first formalize the problem of robust collaborative inference over a dynamic network of devices that could experience significant network faults. Then, we develop a minimalistic yet impactful method called Multiple Aggregation with Gossip Rounds and Simulated Faults (MAGS) that synthesizes simulated faults via dropout, replication, and gossiping to significantly improve robustness over baselines.
arXiv Detail & Related papers (2023-12-27T17:00:09Z)
Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data. Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction. Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z)
Distributionally Robust Model-based Reinforcement Learning with Large State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment deployment. We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets. We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
Fault-Tolerant Collaborative Inference through the Edge-PRUNE Framework [4.984601297028258]
Collaborative inference is a vehicle for distributing computation load, reducing latency, and addressing privacy preservation in communications. This paper presents the Edge-PRUNE distributed computing framework, built on a formally defined model of computation, which provides a flexible infrastructure for fault tolerant collaborative inference.
arXiv Detail & Related papers (2022-06-16T13:16:53Z)
Robust, Deep, and Reinforcement Learning for Management of Communication and Power Networks [6.09170287691728]
The present thesis first develops principled methods to make generic machine learning models robust against distributional uncertainties and adversarial data. We then build on this robust framework to design robust semi-supervised learning over graph methods. The second part of this thesis aspires to fully unleash the potential of next-generation wired and wireless networks.
arXiv Detail & Related papers (2022-02-08T05:49:06Z)
Parallel Successive Learning for Dynamic Distributed Model Training over Heterogeneous Wireless Networks [50.68446003616802]
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices. We develop parallel successive learning (PSL), which expands the FedL architecture along three dimensions. Our analysis sheds light on the notion of cold vs. warmed up models, and model inertia in distributed machine learning.
arXiv Detail & Related papers (2022-02-07T05:11:01Z)
Federated Learning Based on Dynamic Regularization [43.137064459520886]
We propose a novel federated learning method for distributively training neural network models. A server orchestrates cooperation between a subset of randomly chosen devices in each round.
arXiv Detail & Related papers (2021-11-08T03:58:28Z)
Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices. We make a very general assumption on the computational network that covers the settings of fully decentralized calculations. We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.