NeuJeans: Private Neural Network Inference with Joint Optimization of Convolution and FHE Bootstrapping
- URL: http://arxiv.org/abs/2312.04356v3
- Date: Sun, 12 Jan 2025 23:49:20 GMT
- Title: NeuJeans: Private Neural Network Inference with Joint Optimization of Convolution and FHE Bootstrapping
- Authors: Jae Hyung Ju, Jaiyoung Park, Jongmin Kim, Minsik Kang, Donghwan Kim, Jung Hee Cheon, Jung Ho Ahn,
- Abstract summary: NeuJeans is an FHE-based solution for the PI of deep convolutional neural networks (CNNs)
We introduce a novel encoding method called Coefficients-in-activation (CinS) encoding.
NeuJeans accelerates the performance of conv2d-Slot sequences by up to 5.68 times compared to state-of-the-art FHE-based PI work.
- Score: 10.82887308632024
- License:
- Abstract: Fully homomorphic encryption (FHE) is a promising cryptographic primitive for realizing private neural network inference (PI) services by allowing a client to fully offload the inference task to a cloud server while keeping the client data oblivious to the server. This work proposes NeuJeans, an FHE-based solution for the PI of deep convolutional neural networks (CNNs). NeuJeans tackles the critical problem of the enormous computational cost for the FHE evaluation of CNNs. We introduce a novel encoding method called Coefficients-in-Slot (CinS) encoding, which enables multiple convolutions in one HE multiplication without costly slot permutations. We further observe that CinS encoding is obtained by conducting the first several steps of the Discrete Fourier Transform (DFT) on a ciphertext in conventional Slot encoding. This property enables us to save the conversion between CinS and Slot encodings as bootstrapping a ciphertext starts with DFT. Exploiting this, we devise optimized execution flows for various two-dimensional convolution (conv2d) operations and apply them to end-to-end CNN implementations. NeuJeans accelerates the performance of conv2d-activation sequences by up to 5.68 times compared to state-of-the-art FHE-based PI work and performs the PI of a CNN at the scale of ImageNet within a mere few seconds.
Related papers
- Efficient Homomorphically Encrypted Convolutional Neural Network Without Rotation [6.03124479597323]
This paper proposes a novel reformulated joint procedure and a new filter coefficient packing scheme to eliminate ciphertext rotations without affecting the security of the HE scheme.
For various plain-20s over the CIFAR-10/100 datasets, our design reduces the running time of the Conv and FC layers by 15.5% and the communication cost between client and server by more than 50%, compared to the best prior design.
arXiv Detail & Related papers (2024-09-08T19:46:25Z) - Toward Practical Privacy-Preserving Convolutional Neural Networks Exploiting Fully Homomorphic Encryption [11.706881389387242]
Homomorphic encryption (FHE) is a viable approach for achieving private inference (PI)
FHE implementation of a CNN faces significant hurdles, primarily due to FHE's substantial computational and memory overhead.
We propose a set of optimizations, which includes GPU/ASIC acceleration, an efficient activation function, and an optimized packing scheme.
arXiv Detail & Related papers (2023-10-25T10:24:35Z) - Efficient Privacy-Preserving Convolutional Spiking Neural Networks with
FHE [1.437446768735628]
Homomorphic Encryption (FHE) is a key technology for privacy-preserving computation.
FHE has limitations in processing continuous non-polynomial functions.
We present a framework called FHE-DiCSNN for homomorphic SNNs.
FHE-DiCSNN achieves an accuracy of 97.94% on ciphertexts, with a loss of only 0.53% compared to the original network's accuracy of 98.47%.
arXiv Detail & Related papers (2023-09-16T15:37:18Z) - Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
arXiv Detail & Related papers (2023-03-25T13:53:02Z) - Compressing CNN Kernels for Videos Using Tucker Decompositions: Towards
Lightweight CNN Applications [2.191505742658975]
Convolutional Neural Networks (CNN) are the state-of-theart in the field of visual computing.
A major problem with CNNs is the large number of floating point operations (FLOPs) required to perform convolutions for large inputs.
We propose a Tuckerdecomposition to compress the convolutional kernel of a pre-trained network for images.
arXiv Detail & Related papers (2022-03-10T11:53:53Z) - COIN++: Data Agnostic Neural Compression [55.27113889737545]
COIN++ is a neural compression framework that seamlessly handles a wide range of data modalities.
We demonstrate the effectiveness of our method by compressing various data modalities.
arXiv Detail & Related papers (2022-01-30T20:12:04Z) - Efficient Representations for Privacy-Preserving Inference [3.330229314824913]
We construct and evaluate private CNNs on the MNIST and CIFAR-10 datasets.
We achieve over a two-fold reduction in the number of operations used for inferences of the CryptoNets architecture.
arXiv Detail & Related papers (2021-10-15T19:03:35Z) - Learning from Images: Proactive Caching with Parallel Convolutional
Neural Networks [94.85780721466816]
A novel framework for proactive caching is proposed in this paper.
It combines model-based optimization with data-driven techniques by transforming an optimization problem into a grayscale image.
Numerical results show that the proposed scheme can reduce 71.6% computation time with only 0.8% additional performance cost.
arXiv Detail & Related papers (2021-08-15T21:32:47Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and
Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - Towards Efficient Graph Convolutional Networks for Point Cloud Handling [181.59146413326056]
We aim at improving the computational efficiency of graph convolutional networks (GCNs) for learning on point clouds.
A series of experiments show that optimized networks have reduced computational complexity, decreased memory consumption, and accelerated inference speed.
arXiv Detail & Related papers (2021-04-12T17:59:16Z) - Computational optimization of convolutional neural networks using
separated filters architecture [69.73393478582027]
We consider a convolutional neural network transformation that reduces computation complexity and thus speedups neural network processing.
Use of convolutional neural networks (CNN) is the standard approach to image recognition despite the fact they can be too computationally demanding.
arXiv Detail & Related papers (2020-02-18T17:42:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.