RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation
Based Private Inference
- URL: http://arxiv.org/abs/2302.02292v1
- Date: Sun, 5 Feb 2023 04:02:13 GMT
- Title: RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation
Based Private Inference
- Authors: Hongwu Peng, Shanglin Zhou, Yukui Luo, Nuo Xu, Shijin Duan, Ran Ran,
Jiahui Zhao, Shaoyi Huang, Xi Xie, Chenghong Wang, Tong Geng, Wujie Wen,
Xiaolin Xu, and Caiwen Ding
- Abstract summary: We introduce RRNet, a framework that aims to jointly reduce the overhead of MPC comparison protocols and accelerate computation through hardware acceleration.
Our approach integrates the hardware latency of cryptographic building blocks into the DNN loss function, resulting in improved energy efficiency, accuracy, and security guarantees.
- Score: 17.299835585861747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The proliferation of deep learning (DL) has led to the emergence of privacy
and security concerns. To address these issues, secure Two-party computation
(2PC) has been proposed as a means of enabling privacy-preserving DL
computation. However, in practice, 2PC methods often incur high computation and
communication overhead, which can impede their use in large-scale systems. To
address this challenge, we introduce RRNet, a systematic framework that aims to
jointly reduce the overhead of MPC comparison protocols and accelerate
computation through hardware acceleration. Our approach integrates the hardware
latency of cryptographic building blocks into the DNN loss function, resulting
in improved energy efficiency, accuracy, and security guarantees. Furthermore,
we propose a cryptographic hardware scheduler and corresponding performance
model for Field Programmable Gate Arrays (FPGAs) to further enhance the
efficiency of our framework. Experiments show RRNet achieved a much higher ReLU
reduction performance than all SOTA works on CIFAR-10 dataset.
Related papers
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs)
We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z) - Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks [60.54852710216738]
We introduce a novel digital twin-assisted optimization framework, called D-REC, to ensure reliable caching in nextG wireless networks.
By incorporating reliability modules into a constrained decision process, D-REC can adaptively adjust actions, rewards, and states to comply with advantageous constraints.
arXiv Detail & Related papers (2024-06-29T02:40:28Z) - SSNet: A Lightweight Multi-Party Computation Scheme for Practical Privacy-Preserving Machine Learning Service in the Cloud [17.961150835215587]
We propose SSNet, which for the first time employs Shamir's secret sharing (SSS) as the backbone of MPC-based ML framework.
SSNet demonstrates the ability to scale up party numbers straightforwardly and embeds strategies to authenticate the computation correctness.
We conduct comprehensive experimental evaluations on commercial cloud computing infrastructure from Amazon AWS.
arXiv Detail & Related papers (2024-06-04T00:55:06Z) - Trustworthy DNN Partition for Blockchain-enabled Digital Twin in Wireless IIoT Networks [32.42557641803365]
Digital twin (DT) has emerged as a promising solution to enhance manufacturing efficiency in industrial Internet of Things (IIoT) networks.
We propose a blockchain-enabled DT (B-DT) framework that employs deep neural network (DNN) partitioning technique and reputation-based consensus mechanism.
arXiv Detail & Related papers (2024-05-28T07:34:12Z) - Full-Stack Optimization for CAM-Only DNN Inference [2.0837295518447934]
This paper explores the combination of algorithmic optimizations for ternary weight neural networks and associative processors.
We propose a novel compilation flow to optimize convolutions on APs by reducing their arithmetic intensity.
Our solution improves the energy efficiency of ResNet-18 inference on ImageNet by 7.5x compared to crossbar in-memory accelerators.
arXiv Detail & Related papers (2024-01-23T10:27:38Z) - Practical Conformer: Optimizing size, speed and flops of Conformer for
on-Device and cloud ASR [67.63332492134332]
We design an optimized conformer that is small enough to meet on-device restrictions and has fast inference on TPUs.
Our proposed encoder can double as a strong standalone encoder in on device, and as the first part of a high-performance ASR pipeline.
arXiv Detail & Related papers (2023-03-31T23:30:48Z) - Signed Binary Weight Networks [17.07866119979333]
Two important algorithmic techniques have shown promise for enabling efficient inference - sparsity and binarization.
We propose a new method called signed-binary networks to improve efficiency further.
Our method achieves comparable accuracy on ImageNet and CIFAR10 datasets with binary and can lead to 69% sparsity.
arXiv Detail & Related papers (2022-11-25T00:19:21Z) - PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party
Computation Based Private Inference [23.795457990555878]
Secure multi-party computation (MPC) has been discussed, to enable the privacy-preserving deep learning (DL) computation.
MPCs often come at very high computation overhead, and potentially prohibit their popularity in large scale systems.
In this work, we develop a systematic framework, PolyMPCNet, of joint overhead reduction of MPC comparison protocol and hardware acceleration.
arXiv Detail & Related papers (2022-09-20T02:47:37Z) - Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks.
specially trained CNNs that employ parametrised early exits along their depth to save during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z) - Dynamic Slimmable Network [105.74546828182834]
We develop a dynamic network slimming regime named Dynamic Slimmable Network (DS-Net)
Our DS-Net is empowered with the ability of dynamic inference by the proposed double-headed dynamic gate.
It consistently outperforms its static counterparts as well as state-of-the-art static and dynamic model compression methods.
arXiv Detail & Related papers (2021-03-24T15:25:20Z) - Deep Learning-based Resource Allocation For Device-to-Device
Communication [66.74874646973593]
We propose a framework for the optimization of the resource allocation in multi-channel cellular systems with device-to-device (D2D) communication.
A deep learning (DL) framework is proposed, where the optimal resource allocation strategy for arbitrary channel conditions is approximated by deep neural network (DNN) models.
Our simulation results confirm that near-optimal performance can be attained with low time, which underlines the real-time capability of the proposed scheme.
arXiv Detail & Related papers (2020-11-25T14:19:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.