Serdab: An IoT Framework for Partitioning Neural Networks Computation
across Multiple Enclaves
- URL: http://arxiv.org/abs/2005.06043v1
- Date: Tue, 12 May 2020 20:51:47 GMT
- Title: Serdab: An IoT Framework for Partitioning Neural Networks Computation
across Multiple Enclaves
- Authors: Tarek Elgamal, Klara Nahrstedt
- Abstract summary: Serdab is a distributed orchestration framework for deploying deep neural network computation across multiple secure enclaves.
Our partitioning strategy achieves up to 4.7x speedup compared to executing the entire neural network in one enclave.
- Score: 8.550865312110911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in Deep Neural Networks (DNN) and Edge Computing have made it
possible to automatically analyze streams of videos from home/security cameras
over hierarchical clusters that include edge devices, close to the video
source, as well as remote cloud compute resources. However, preserving the
privacy and confidentiality of users' sensitive data as it passes through
different devices remains a concern to most users. Private user data is subject
to attacks by malicious attackers or misuse by internal administrators who may
use the data in activities that are not explicitly approved by the user. To
address this challenge, we present Serdab, a distributed orchestration
framework for deploying deep neural network computation across multiple secure
enclaves (e.g., Intel SGX). Secure enclaves provide a guarantee on the privacy
of the data/code deployed inside them. However, their limited hardware resources
make them inefficient when solely running an entire deep neural network. To
bridge this gap, Serdab presents a DNN partitioning strategy to distribute the
layers of the neural network across multiple enclave devices or across an
enclave device and other hardware accelerators. Our partitioning strategy
achieves up to 4.7x speedup compared to executing the entire neural network in
one enclave.
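The partitioning idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm; it is a simplified two-enclave version under stated assumptions: per-layer execution times on each enclave and the cost of shipping each intermediate tensor are measured beforehand, frames are processed in a pipeline so the slower stage determines throughput, and privacy constraints on where a layer may run are ignored. The names `best_split`, `enclave_a_ms`, `enclave_b_ms`, and `transfer_ms` are hypothetical.

```python
from typing import List, Tuple


def best_split(
    enclave_a_ms: List[float],
    enclave_b_ms: List[float],
    transfer_ms: List[float],
) -> Tuple[int, float]:
    """Pick the cut k that minimizes the pipeline bottleneck.

    Layers [0, k) run on enclave A and layers [k, n) on enclave B.
    transfer_ms[k - 1] is the assumed cost of sending the output of
    layer k - 1 from A to B. Returning k == n means "do not split".
    """
    n = len(enclave_a_ms)
    if len(enclave_b_ms) != n or len(transfer_ms) != n - 1:
        raise ValueError("expected n, n, and n - 1 timing entries")

    # Baseline: the whole network runs inside enclave A.
    best_k = n
    best_cost = float(sum(enclave_a_ms))

    time_a = 0.0                        # running cost of layers [0, k) on A
    time_b = float(sum(enclave_b_ms))   # running cost of layers [k, n) on B
    for k in range(1, n):
        time_a += enclave_a_ms[k - 1]
        time_b -= enclave_b_ms[k - 1]
        # In a pipeline, the slower of the two stages bounds throughput.
        bottleneck = max(time_a, transfer_ms[k - 1] + time_b)
        if bottleneck < best_cost:
            best_k, best_cost = k, bottleneck
    return best_k, best_cost


if __name__ == "__main__":
    # Made-up timings in milliseconds, for illustration only.
    layer_ms = [10, 20, 30, 40]
    k, cost = best_split(layer_ms, layer_ms, [5, 2, 8])
    print(f"cut before layer {k}, bottleneck {cost} ms")
```

With these illustrative numbers the single-enclave baseline costs 100 ms per frame, while cutting before the last layer leaves a 60 ms bottleneck stage. A realistic placement would also have to account for more than two devices, accelerators outside the enclave, and which layers must stay inside an enclave for privacy.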
Related papers
- MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via
Automating Deep Neural Network Porting for Mobile Deployment [54.77943671991863]
MatchNAS is a novel scheme for porting Deep Neural Networks to mobile devices.
We optimise a large network family using both labelled and unlabelled data.
We then automatically search for tailored networks for different hardware platforms.
arXiv Detail & Related papers (2024-02-21T04:43:12Z)
- Fully Spiking Actor Network with Intra-layer Connections for
Reinforcement Learning [51.386945803485084]
We focus on the task where the agent needs to learn multi-dimensional deterministic policies to control.
Most existing spike-based RL methods take the firing rate as the output of SNNs, and convert it to represent continuous action space (i.e., the deterministic policy) through a fully-connected layer.
To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z)
- $\Lambda$-Split: A Privacy-Preserving Split Computing Framework for
Cloud-Powered Generative AI [3.363904632882723]
We introduce $\Lambda$-Split, a split computing framework to facilitate computational offloading.
In $\Lambda$-Split, a generative model, usually a deep neural network (DNN), is partitioned into three sub-models and distributed across the user's local device and a cloud server.
This architecture ensures that only the hidden layer outputs are transmitted, thereby preventing the external transmission of privacy-sensitive raw input and output data.
arXiv Detail & Related papers (2023-10-23T07:44:04Z)
- RL-DistPrivacy: Privacy-Aware Distributed Deep Inference for low latency
IoT systems [41.1371349978643]
We present an approach that targets the security of collaborative deep inference by rethinking the distribution strategy.
We formulate this methodology as an optimization problem that establishes a trade-off between the latency of co-inference and the privacy level of the data.
arXiv Detail & Related papers (2022-08-27T14:50:00Z)
- Decentralized Low-Latency Collaborative Inference via Ensembles on the
Edge [28.61344039233783]
We propose to facilitate the application of deep neural networks (DNNs) on the edge by allowing multiple users to collaborate during inference to improve their accuracy.
Our mechanism, coined edge ensembles, is based on having diverse predictors at each device, which form an ensemble of models during inference.
We analyze the latency induced by edge ensembles, showing that its performance improvement comes at the cost of a minor additional delay under common assumptions on the communication network.
arXiv Detail & Related papers (2022-06-07T10:24:20Z)
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
arXiv Detail & Related papers (2022-05-23T12:35:18Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical computation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and
Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- NeuraCrypt: Hiding Private Health Data via Random Neural Networks for
Public Training [64.54200987493573]
We propose NeuraCrypt, a private encoding scheme based on random deep neural networks.
NeuraCrypt encodes raw patient data using a randomly constructed neural network known only to the data-owner.
We show that NeuraCrypt achieves competitive accuracy to non-private baselines on a variety of x-ray tasks.
arXiv Detail & Related papers (2021-06-04T13:42:21Z)
- Robust error bounds for quantised and pruned neural networks [1.8083503268672914]
Machine learning algorithms are moving towards decentralisation with the data and algorithms stored, and even trained, locally on devices.
The device hardware becomes the main bottleneck for model capability in this set-up, creating a need for slimmed down, more efficient neural networks.
A semi-definite program is introduced to bound the worst-case error caused by pruning or quantising a neural network.
It is hoped that the computed bounds will provide certainty to the performance of these algorithms when deployed on safety-critical systems.
arXiv Detail & Related papers (2020-11-30T22:19:44Z)
- Multi-stage Jamming Attacks Detection using Deep Learning Combined with
Kernelized Support Vector Machine in 5G Cloud Radio Access Networks [17.2528983535773]
This research focuses on deploying a multi-stage machine learning-based intrusion detection (ML-IDS) in 5G C-RAN.
It can detect and classify four types of jamming attacks: constant jamming, random jamming, jamming, and reactive jamming.
The final classification accuracy of attacks is 94.51% with a 7.84% false negative rate.
arXiv Detail & Related papers (2020-04-13T17:21:45Z)