Toward Practical Privacy-Preserving Convolutional Neural Networks Exploiting Fully Homomorphic Encryption
- URL: http://arxiv.org/abs/2310.16530v1
- Date: Wed, 25 Oct 2023 10:24:35 GMT
- Title: Toward Practical Privacy-Preserving Convolutional Neural Networks Exploiting Fully Homomorphic Encryption
- Authors: Jaiyoung Park, Donghwan Kim, Jongmin Kim, Sangpyo Kim, Wonkyung Jung, Jung Hee Cheon, Jung Ho Ahn
- Abstract summary: Fully homomorphic encryption (FHE) is a viable approach for achieving private inference (PI).
Practical FHE implementation of a CNN faces significant hurdles, primarily due to FHE's substantial computational and memory overhead.
We propose a set of optimizations, including GPU/ASIC acceleration, an efficient activation function, and an optimized packing scheme.
- Score: 11.706881389387242
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incorporating fully homomorphic encryption (FHE) into the inference process of a convolutional neural network (CNN) draws enormous attention as a viable approach for achieving private inference (PI). FHE allows delegating the entire computation process to the server while ensuring the confidentiality of sensitive client-side data. However, practical FHE implementation of a CNN faces significant hurdles, primarily due to FHE's substantial computational and memory overhead. To address these challenges, we propose a set of optimizations, which includes GPU/ASIC acceleration, an efficient activation function, and an optimized packing scheme. We evaluate our method using the ResNet models on the CIFAR-10 and ImageNet datasets, achieving several orders of magnitude improvement compared to prior work and reducing the latency of the encrypted CNN inference to 1.4 seconds on an NVIDIA A100 GPU. We also show that the latency drops to a mere 0.03 seconds with a custom hardware design.
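The following is a minimal, illustrative sketch of the private-inference flow the abstract describes: the client encrypts its data under CKKS, the server evaluates a layer directly on ciphertexts, and only the client can decrypt the result. It uses the open-source TenSEAL library rather than the authors' GPU/ASIC pipeline, and the layer sizes, square activation, and encryption parameters are toy assumptions for the example, not the paper's ResNet construction.

```python
# Illustrative sketch only: one encrypted linear layer plus a square activation,
# standing in for the much deeper FHE ResNet pipeline described in the abstract.
import tenseal as ts

# --- Client: key generation and encryption -------------------------------
ctx = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

x = [0.1, -0.3, 0.5, 0.7]            # toy input features
enc_x = ts.ckks_vector(ctx, x)       # ciphertext handed to the server

# --- Server: homomorphic evaluation (no access to the secret key) --------
weights = [
    [0.2, -0.1, 0.4, 0.3],           # toy 2x4 linear layer, kept in plaintext
    [0.5, 0.6, -0.2, 0.1],
]
biases = [0.05, -0.02]

enc_neurons = []
for w, b in zip(weights, biases):
    z = enc_x.dot(w) + [b]           # encrypted dot product plus bias
    enc_neurons.append(z * z)        # square activation: an FHE-friendly stand-in
                                     # for ReLU (the paper uses a more accurate
                                     # low-degree activation approximation)

# --- Client: decryption ---------------------------------------------------
print([n.decrypt()[0] for n in enc_neurons])
```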
Related papers
- Optimizing DNN Inference on Multi-Accelerator SoCs at Training-time [5.05866540830123]
We present ODiMO, a hardware-aware tool that efficiently explores fine-grain mapping of Deep Neural Networks (DNNs) among various on-chip computing units (CUs).
We show that ODiMO reduces the latency of a DNN executed on the Darkside SoC by up to 8x at iso-accuracy, compared to manual mappings.
When targeting energy, ODiMO produced up to 50.8x more efficient mappings, with minimal accuracy drop.
arXiv Detail & Related papers (2024-09-27T09:10:44Z)
- Embedded Graph Convolutional Networks for Real-Time Event Data Processing on SoC FPGAs [0.815557531820863]
Event cameras are increasingly relevant for integration into embedded real-time systems.
One effective approach to achieving the necessary throughput and latency for event processing systems is the utilisation of graph convolutional networks (GCNs).
We introduce a series of hardware-aware optimisations tailored for PointNet++, a GCN architecture designed for point cloud processing.
arXiv Detail & Related papers (2024-06-11T14:47:36Z)
- Efficient Privacy-Preserving Convolutional Spiking Neural Networks with FHE [1.437446768735628]
Fully Homomorphic Encryption (FHE) is a key technology for privacy-preserving computation.
FHE has limitations in processing continuous non-polynomial functions (a polynomial-approximation sketch follows this entry).
We present a framework called FHE-DiCSNN for homomorphic SNNs.
FHE-DiCSNN achieves an accuracy of 97.94% on ciphertexts, with a loss of only 0.53% compared to the original network's accuracy of 98.47%.
arXiv Detail & Related papers (2023-09-16T15:37:18Z)
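The summary above notes that FHE struggles with continuous non-polynomial functions. The sketch below illustrates the generic workaround, replacing such a function with a low-degree polynomial that an FHE circuit can evaluate; it is not the FHE-DiCSNN construction, and the sigmoid target, degree, and interval are assumptions for the example.

```python
# Illustrative only: FHE schemes evaluate additions and multiplications, so a
# continuous non-polynomial function (here, the sigmoid) must be replaced by a
# low-degree polynomial before it can be computed on ciphertexts.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Fit a degree-3 polynomial over the interval the inputs are expected to lie in
# (the interval is an assumption of this example).
xs = np.linspace(-4.0, 4.0, 1000)
coeffs = np.polyfit(xs, sigmoid(xs), deg=3)      # highest degree first
poly = np.poly1d(coeffs)

# The polynomial is what the homomorphic circuit would actually evaluate.
max_err = np.max(np.abs(poly(xs) - sigmoid(xs)))
print("degree-3 coefficients:", np.round(coeffs, 4))
print("max approximation error on [-4, 4]:", round(float(max_err), 4))
```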
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping (a layer-skipping sketch follows this entry).
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
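The sketch below illustrates one of the dynamic paradigms named above, dynamic layer skipping: a residual block is evaluated only when a cheap gating branch deems it worthwhile. It is a generic gated-residual pattern, not the LAUDNet implementation; the module sizes and threshold are made up, and a real method would train the gate with a differentiable relaxation.

```python
import torch
import torch.nn as nn

class SkippableBlock(nn.Module):
    """Residual block that can be skipped at inference time by a tiny gate."""

    def __init__(self, channels: int, threshold: float = 0.5):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Cheap gate: global average pool -> linear -> sigmoid score in [0, 1].
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, 1),
            nn.Sigmoid(),
        )
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        score = self.gate(x).mean()      # one decision per batch, for simplicity
        if score < self.threshold:       # skip the block entirely -> saves latency
            return x
        return x + self.body(x)

block = SkippableBlock(channels=16)
out = block(torch.randn(2, 16, 32, 32))
print(out.shape)
```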
- HyPHEN: A Hybrid Packing Method and Optimizations for Homomorphic Encryption-Based Neural Networks [7.642103082787977]
Convolutional neural network (CNN) inference using fully homomorphic encryption (FHE) is a promising private inference (PI) solution.
We present HyPHEN, a deep HCNN construction that incorporates novel convolution algorithms and data packing methods (a toy packing sketch follows this entry).
As a result, HyPHEN brings the latency of HCNN CIFAR-10 inference down to a practical level at 1.4 seconds (ResNet-20) and demonstrates HCNN ImageNet inference for the first time at 14.7 seconds (ResNet-18).
arXiv Detail & Related papers (2023-02-05T15:36:51Z)
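The sketch below is a toy, in-the-clear illustration of what ciphertext "packing" bookkeeping looks like: many feature-map values are laid out in the fixed slots of a single ciphertext, and cyclic rotations stand in for data movement. It only shows the slot layout idea; it is not HyPHEN's actual hybrid packing scheme, and the slot count and feature-map sizes are assumptions.

```python
import numpy as np

num_slots = 64                                  # assumed slot count for the example
fmap = np.arange(2 * 4 * 4).reshape(2, 4, 4)    # 2 channels of a 4x4 feature map

# Pack channel-by-channel into one flat slot vector, padding unused slots.
packed = np.zeros(num_slots)
packed[: fmap.size] = fmap.reshape(-1)

def rotate(slots: np.ndarray, k: int) -> np.ndarray:
    """Cyclic slot rotation: the cheap primitive FHE offers instead of indexing."""
    return np.roll(slots, -k)

# Example: align channel 1 on top of channel 0 with a single rotation, the kind
# of step a packed convolution or channel summation is built from.
aligned = rotate(packed, fmap[0].size)
channel_sum = packed[:16] + aligned[:16]        # equals fmap[0] + fmap[1], flattened
print(np.array_equal(channel_sum, (fmap[0] + fmap[1]).reshape(-1)))
```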
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device (a minimal bottleneck sketch follows this entry).
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
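The sketch below illustrates the general idea of autoencoder-based feature compression for device-edge co-inference: the device shrinks an intermediate feature map before transmission and the edge server reconstructs it before finishing inference. The 1x1-conv bottleneck and channel counts are assumptions for the example, not the AECNN architecture or its reported 256x compression.

```python
import torch
import torch.nn as nn

in_channels, code_channels = 64, 4            # ~16x channel reduction (toy numbers)

encoder = nn.Conv2d(in_channels, code_channels, kernel_size=1)   # runs on the device
decoder = nn.Conv2d(code_channels, in_channels, kernel_size=1)   # runs on the edge

features = torch.randn(1, in_channels, 28, 28)   # intermediate CNN activations
code = encoder(features)                          # compressed tensor to transmit
reconstructed = decoder(code)                     # edge-side reconstruction

ratio = features.numel() / code.numel()
print(f"compression ratio from channel reduction alone: {ratio:.0f}x")
```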
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) approach, Soft Actor-Critic for discrete (SAC-d), which generates the exit point, partition point, and compressing bits via soft policy iterations (a reward-design sketch follows this entry).
Based on the latency- and accuracy-aware reward design, such a framework can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
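The sketch below shows one simple shape a latency- and accuracy-aware reward can take: reward accuracy, penalize deadline violations. The functional form, deadline, and weights are assumptions for illustration, not the SAC-d paper's reward design.

```python
def reward(accuracy: float, latency_ms: float,
           deadline_ms: float = 100.0, penalty_weight: float = 2.0) -> float:
    """Toy latency/accuracy-aware reward for a co-inference configuration."""
    latency_penalty = max(0.0, latency_ms - deadline_ms) / deadline_ms
    return accuracy - penalty_weight * latency_penalty

# A configuration that meets the deadline beats a slightly more accurate one
# that misses it.
print(reward(accuracy=0.91, latency_ms=80.0))    # 0.91
print(reward(accuracy=0.94, latency_ms=150.0))   # 0.94 - 2.0 * 0.5 = -0.06
```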
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks (a worked decomposition example follows this entry).
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
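The example below works through one common form such a decomposition can take: an odd-integer quantized weight q in {-(2^K - 1), ..., -1, +1, ..., 2^K - 1} can be written exactly as q = sum_{i=0}^{K-1} b_i * 2^i with every b_i in {-1, +1}, i.e. as K binary branches with fixed power-of-two scales. It illustrates the general idea; it is not necessarily the paper's exact encoding.

```python
import numpy as np

def decompose(q: np.ndarray, bits: int) -> list:
    """Return [b_0, ..., b_{K-1}] with entries in {-1, +1} such that
    sum_i b_i * 2**i reconstructs q exactly."""
    branches = []
    residual = q.astype(np.int64)
    for i in reversed(range(bits)):
        b = np.where(residual > 0, 1, -1)      # sign of what is left to explain
        residual = residual - b * (1 << i)
        branches.append(b)
    return branches[::-1]                      # reorder as b_0 ... b_{K-1}

K = 3
weights = np.array([-7, -5, -1, 1, 3, 7])      # odd levels representable with K = 3
branches = decompose(weights, K)
recon = sum(b * (1 << i) for i, b in enumerate(branches))
print(np.array_equal(recon, weights))          # True: exact reconstruction
```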
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples (an early-exit sketch follows this entry).
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
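The sketch below illustrates the early-exit mechanism behind MESS-style networks: intermediate heads produce predictions along the backbone's depth, and inference stops as soon as one head is confident enough. The toy backbone, classifier heads, and threshold are stand-ins for the example, not the paper's co-optimised segmentation heads or exit policy.

```python
import torch
import torch.nn as nn

backbone = nn.ModuleList(
    [nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(3)]
)
exit_heads = nn.ModuleList([nn.Linear(32, 10) for _ in range(3)])  # one head per stage

def early_exit_forward(x: torch.Tensor, threshold: float = 0.9):
    """Run stages in order and stop at the first sufficiently confident exit."""
    for stage, (block, head) in enumerate(zip(backbone, exit_heads)):
        x = block(x)
        probs = torch.softmax(head(x), dim=-1)
        confidence, prediction = probs.max(dim=-1)
        if confidence.item() >= threshold:        # confident: exit early
            return prediction, stage
    return prediction, stage                      # fell through to the final exit

pred, used_stage = early_exit_forward(torch.randn(1, 32))
print(f"exited after stage {used_stage} with prediction {pred.item()}")
```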
- FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often have a large number of parameters and incur heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.