GECKO: Reconciling Privacy, Accuracy and Efficiency in Embedded Deep
Learning
- URL: http://arxiv.org/abs/2010.00912v3
- Date: Sun, 9 Jan 2022 14:33:25 GMT
- Authors: Vasisht Duddu, Antoine Boutet, Virat Shejwalkar
- Abstract summary: We analyse the three-dimensional privacy-accuracy-efficiency tradeoff in NNs for IoT devices.
We propose the Gecko training methodology, in which resistance to privacy inference attacks is an explicit design objective.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Embedded systems demand on-device processing of data using Neural Networks
(NNs) while conforming to the memory, power and computation constraints,
leading to an efficiency and accuracy tradeoff. To bring NNs to edge devices,
several optimizations such as model compression through pruning, quantization,
and off-the-shelf architectures with efficient design have been extensively
adopted. When deployed in real-world sensitive applications, these algorithms
must resist inference attacks to protect the privacy of users' training data.
However, resistance against inference attacks is not accounted for when
designing NN models for IoT. In this work, we analyse the three-dimensional
privacy-accuracy-efficiency tradeoff in NNs for IoT devices and propose the
Gecko training methodology, in which we explicitly add resistance to privacy
inference attacks as a design objective. We optimize the inference-time memory,
computation, and power constraints of embedded devices as a criterion for
designing the NN
architecture while also preserving privacy. We choose quantization as the
design choice for highly efficient and private models. This choice is driven by
the observation that compressed models leak more information than baseline
models, while off-the-shelf efficient architectures exhibit a poor
efficiency-privacy tradeoff. We show that models trained using the Gecko
methodology are
comparable to prior defences against black-box membership attacks in terms of
accuracy and privacy while providing efficiency.
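The abstract's chosen design point, quantization, can be illustrated with a minimal sketch of post-training affine int8 quantization of a weight tensor. This is not the paper's implementation; the function names and the sample weights are hypothetical, and real deployments would use a framework's quantization toolchain rather than pure Python:

```python
# Illustrative sketch: affine int8 quantization, q = round(w / scale) + zero_point,
# the kind of compression the abstract adopts for efficient and private models.

def quantize_int8(weights):
    """Map float weights onto the int8 range [-128, 127] with an affine scheme."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 255.0 or 1.0  # guard against constant tensors
    zero_point = round(-128 - w_min / scale)
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the quantized representation."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-0.51, 0.0, 0.27, 1.0]
q, scale, zp = quantize_int8(weights)
recovered = dequantize(q, scale, zp)
max_err = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, recovered))
assert max_err <= scale  # per-weight error is bounded by one quantization step
```

The interesting tension the paper studies is that such compression changes not just accuracy and efficiency but also how much membership information the model leaks.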
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Memory-Efficient and Secure DNN Inference on TrustZone-enabled Consumer IoT Devices [9.928745904761358]
Edge intelligence enables resource-demanding Deep Neural Network (DNN) inference without transferring original data.
For privacy-sensitive applications, deploying models in hardware-isolated trusted execution environments (TEEs) becomes essential.
We present a novel approach for advanced model deployment in TrustZone that ensures comprehensive privacy preservation during model inference.
arXiv Detail & Related papers (2024-03-19T09:22:50Z)
- Theoretically Principled Federated Learning for Balancing Privacy and Utility [61.03993520243198]
We propose a general learning framework for the protection mechanisms that protects privacy via distorting model parameters.
It can achieve personalized utility-privacy trade-off for each model parameter, on each client, at each communication round in federated learning.
arXiv Detail & Related papers (2023-05-24T13:44:02Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- RL-DistPrivacy: Privacy-Aware Distributed Deep Inference for low latency IoT systems [41.1371349978643]
We present an approach that targets the security of collaborative deep inference via re-thinking the distribution strategy.
We formulate this methodology, as an optimization, where we establish a trade-off between the latency of co-inference and the privacy-level of data.
arXiv Detail & Related papers (2022-08-27T14:50:00Z)
- Adversarially Robust and Explainable Model Compression with On-Device Personalization for Text Classification [4.805959718658541]
On-device Deep Neural Networks (DNNs) have recently gained more attention due to the increasing computing power of mobile devices and the number of applications in Computer Vision (CV) and Natural Language Processing (NLP)
In NLP applications, although model compression has seen initial success, there are at least three major challenges yet to be addressed: adversarial robustness, explainability, and personalization.
Here we attempt to tackle these challenges by designing a new training scheme for model compression and adversarial robustness, including the optimization of an explainable feature mapping objective.
The resulting compressed model is personalized using on-device private training data via fine-tuning.
arXiv Detail & Related papers (2021-01-10T15:06:55Z)
- SOTERIA: In Search of Efficient Neural Networks for Private Inference [15.731520890265545]
ML-as-a-service is gaining popularity where a cloud server hosts a trained model and offers prediction (inference) service to users.
In this setting, our objective is to protect the confidentiality of both the users' input queries as well as the model parameters at the server.
We propose SOTERIA, a training method to construct model architectures that are by-design efficient for private inference.
arXiv Detail & Related papers (2020-07-25T13:53:02Z)
- A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework [56.57225686288006]
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
Previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data.
We propose a privacy-preserving-oriented pruning and mobile acceleration framework that does not require the private training dataset.
arXiv Detail & Related papers (2020-02-03T14:49:18Z)
- CryptoSPN: Privacy-preserving Sum-Product Network Inference [84.88362774693914]
We present a framework for privacy-preserving inference of sum-product networks (SPNs).
CryptoSPN achieves highly efficient and accurate inference in the order of seconds for medium-sized SPNs.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)
- An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, pattern-based sparsity, which comprises pattern and connectivity sparsity and makes models both highly accurate and hardware-friendly.
Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)
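The black-box membership attacks that the main abstract evaluates against can be illustrated with a minimal confidence-threshold attack: an adversary flags an input as a training-set member when the model's top-class confidence is high. The confidence values below are synthetic, for illustration only:

```python
# Sketch of a black-box confidence-threshold membership inference attack.
# Overfit models tend to be more confident on training (member) points,
# which is the signal this attack exploits.

def mia_accuracy(member_conf, nonmember_conf, threshold):
    """Predict 'member' when top-class confidence exceeds the threshold."""
    tp = sum(c > threshold for c in member_conf)       # members correctly flagged
    tn = sum(c <= threshold for c in nonmember_conf)   # non-members correctly passed
    return (tp + tn) / (len(member_conf) + len(nonmember_conf))

# Synthetic confidences: training points score higher than unseen points.
members = [0.99, 0.97, 0.95, 0.90]
nonmembers = [0.80, 0.70, 0.85, 0.60]
acc = mia_accuracy(members, nonmembers, threshold=0.88)
assert acc == 1.0  # 0.5 would mean the attack does no better than chance
```

An attack accuracy near 0.5 indicates a private model; the gap above 0.5 is the leakage that defences such as Gecko aim to reduce.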
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.