Learning to Linearize Deep Neural Networks for Secure and Efficient
Private Inference
- URL: http://arxiv.org/abs/2301.09254v1
- Date: Mon, 23 Jan 2023 03:33:38 GMT
- Title: Learning to Linearize Deep Neural Networks for Secure and Efficient
Private Inference
- Authors: Souvik Kundu, Shunlin Lu, Yuke Zhang, Jacqueline Liu, Peter A. Beerel
- Abstract summary: Existing techniques to reduce ReLU operations often involve manual effort and sacrifice accuracy.
We first present a novel measure of a non-linearity layer's ReLU sensitivity, which removes the need for time-consuming manual effort in identifying sensitive layers.
We then present SENet, a three-stage training method that automatically assigns per-layer ReLU counts, decides the ReLU locations for each layer's activation map, and trains a model with significantly fewer ReLUs.
- Score: 5.293553970082942
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The large number of ReLU non-linearity operations in existing deep neural
networks makes them ill-suited for latency-efficient private inference (PI).
Existing techniques to reduce ReLU operations often involve manual effort and
sacrifice significant accuracy. In this paper, we first present a novel measure
of a non-linearity layer's ReLU sensitivity, which removes the need for the
time-consuming manual effort of identifying sensitive layers. Based on this
sensitivity, we then present SENet, a three-stage training method that, for a
given ReLU budget, automatically assigns per-layer ReLU counts, decides the
ReLU locations for each layer's activation map, and trains a model with
significantly fewer ReLUs to potentially yield latency- and communication-efficient
PI. Experimental evaluations with multiple models on various datasets
show SENet's superior performance both in terms of reduced ReLUs and improved
classification accuracy compared to existing alternatives. In particular, SENet
can yield models that require up to ~2x fewer ReLUs while maintaining similar
accuracy. For a similar ReLU budget, SENet can yield models with ~2.32% improved
classification accuracy, evaluated on CIFAR-100.
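To make the budgeted-ReLU idea concrete, here is a minimal sketch (not the authors' released code; all names, shapes, and numbers are illustrative): a partial-ReLU layer whose fixed binary mask decides, per activation-map location, whether to apply ReLU or pass the value through linearly, plus a toy allocation of a global ReLU budget proportional to hypothetical per-layer sensitivity scores.

```python
# Hypothetical sketch, not SENet's actual implementation.
import torch
import torch.nn as nn

class PartialReLU(nn.Module):
    """Applies ReLU only where `mask` is True; elsewhere the layer acts as identity."""
    def __init__(self, mask: torch.Tensor):
        super().__init__()
        # mask: boolean tensor broadcastable to the activation map
        self.register_buffer("mask", mask.bool())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.where(self.mask, torch.relu(x), x)

def allocate_relu_budget(sensitivities, total_budget):
    """Toy per-layer ReLU-count allocation proportional to sensitivity scores."""
    total = sum(sensitivities)
    return [int(total_budget * s / total) for s in sensitivities]

if __name__ == "__main__":
    x = torch.randn(1, 4, 8, 8)
    mask = torch.rand(4, 8, 8) < 0.5           # keep ReLU at roughly half the locations
    print(PartialReLU(mask)(x).shape)          # torch.Size([1, 4, 8, 8])
    print(allocate_relu_budget([0.5, 0.3, 0.2], total_budget=1000))
```

In an actual pipeline the mask and the per-layer counts would be assigned by the three-stage procedure described above rather than drawn at random.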
Related papers
- Linearizing Models for Efficient yet Robust Private Inference [8.547509873799994]
This paper presents RLNet, a class of robust linearized networks that can yield latency improvement via reduction of high-latency ReLU operations.
We show that RLNet can yield models with up to 11.14x fewer ReLUs, with accuracy close to the all-ReLU models, on clean, naturally perturbed, and gradient-based perturbed images.
arXiv Detail & Related papers (2024-02-08T10:01:29Z) - Fixing the NTK: From Neural Network Linearizations to Exact Convex
Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, the induced masking kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
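For reference, a gated ReLU network in the usual formulation (our notation, assumed rather than taken from the paper) keeps the gate vectors $h_j$ fixed and trains only the weights $w_j$:

$f(x) = \sum_{j=1}^{m} \mathbb{1}\{h_j^{\top} x > 0\}\, x^{\top} w_j,$

which is linear in the trainable parameters; the masking kernel discussed above is the kernel induced by these fixed gates.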
arXiv Detail & Related papers (2023-09-26T17:42:52Z) - Accelerating Deep Neural Networks via Semi-Structured Activation
Sparsity [0.0]
Exploiting sparsity in the network's feature maps is one of the ways to reduce its inference latency.
We propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications.
Our approach yields a speed improvement of $1.25\times$ with a minimal accuracy drop of $1.1\%$ for the ResNet18 model on the ImageNet dataset.
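As an illustration of what semi-structured (N:M) activation sparsity looks like, here is a small sketch under our own assumptions (not the paper's method or code): the N largest-magnitude values are kept in every group of M consecutive channels, and the rest are zeroed.

```python
# Illustrative N:M activation-sparsity mask; function name and the 2:4 pattern are assumptions.
import torch

def nm_sparsify(x: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Keep the n largest-magnitude activations in each group of m consecutive channels."""
    b, c, h, w = x.shape
    assert c % m == 0, "channel count must be divisible by m"
    groups = x.permute(0, 2, 3, 1).reshape(-1, m)          # rows of m channel values
    keep = groups.abs().topk(n, dim=1).indices
    mask = torch.zeros_like(groups).scatter_(1, keep, 1.0)
    return (groups * mask).reshape(b, h, w, c).permute(0, 3, 1, 2)

if __name__ == "__main__":
    x = torch.randn(1, 8, 4, 4)
    y = nm_sparsify(x)                  # 2:4 pattern
    print((y != 0).float().mean())      # ~0.5 of the activations remain nonzero
```

The regular group structure is what makes this kind of sparsity exploitable with minor runtime modifications, as opposed to fully unstructured sparsity.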
arXiv Detail & Related papers (2023-09-12T22:28:53Z) - Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity
and Depth for Latency-Efficient Private Inference [6.141267142478346]
We present a model optimization method that allows a model to learn to be shallow.
We leverage the ReLU sensitivity of a convolutional block to remove a ReLU layer and merge its succeeding and preceding convolution layers to a shallow block.
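The identity that makes the merge possible is simple: with the intermediate ReLU removed, two stacked convolutions compose into a single linear map. A minimal sketch for the 1x1 case (our simplification, not the paper's procedure):

```python
# Merging two 1x1 convolutions once the ReLU between them is dropped (illustrative only).
import torch
import torch.nn.functional as F

def merge_1x1_convs(w1, b1, w2, b2):
    # w1: (C_mid, C_in, 1, 1), w2: (C_out, C_mid, 1, 1)
    m1 = w1.squeeze(-1).squeeze(-1)               # (C_mid, C_in)
    m2 = w2.squeeze(-1).squeeze(-1)               # (C_out, C_mid)
    w = (m2 @ m1).unsqueeze(-1).unsqueeze(-1)     # (C_out, C_in, 1, 1)
    b = m2 @ b1 + b2
    return w, b

if __name__ == "__main__":
    x = torch.randn(1, 3, 5, 5)
    w1, b1 = torch.randn(4, 3, 1, 1), torch.randn(4)
    w2, b2 = torch.randn(2, 4, 1, 1), torch.randn(2)
    y_two = F.conv2d(F.conv2d(x, w1, b1), w2, b2)   # no ReLU in between
    wm, bm = merge_1x1_convs(w1, b1, w2, b2)
    print(torch.allclose(F.conv2d(x, wm, bm), y_two, atol=1e-5))   # True
```

For general kernel sizes the merged kernel is the convolution of the two kernels, but the 1x1 case already shows why removing a low-sensitivity ReLU lets a block be made shallower.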
arXiv Detail & Related papers (2023-04-26T04:23:34Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to the magnitude scale.
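As we read the summary, the key difference from hard pruning is that low-magnitude weights are shrunk gradually rather than zeroed outright, so they can recover later. A rough sketch of one such shrinkage step (our interpretation, not the authors' code):

```python
# Soft-shrinkage step: shrink, rather than zero, the smallest-magnitude weights (illustrative).
import torch

def soft_shrink_step(weight: torch.Tensor, prune_ratio: float, shrink: float = 0.02):
    flat = weight.abs().flatten()
    k = int(prune_ratio * flat.numel())
    if k == 0:
        return weight
    threshold = flat.kthvalue(k).values
    unimportant = weight.abs() <= threshold
    # shrink the unimportant weights by an amount proportional to their own magnitude
    return torch.where(unimportant, weight * (1.0 - shrink), weight)

if __name__ == "__main__":
    w = torch.randn(64, 64)
    for _ in range(100):                       # e.g. applied once per training iteration
        w = soft_shrink_step(w, prune_ratio=0.5, shrink=0.02)
    print(w.abs().median())                    # shrunken but still nonzero magnitudes
```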
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - A Hybrid Deep Learning Model-based Remaining Useful Life Estimation for
Reed Relay with Degradation Pattern Clustering [12.631122036403864]
The reed relay serves as a fundamental component of functional testing, which is closely tied to successful quality inspection of electronics.
To provide accurate remaining useful life (RUL) estimation for reed relay, a hybrid deep learning network with degradation pattern clustering is proposed.
arXiv Detail & Related papers (2022-09-14T05:45:46Z) - Can pruning improve certified robustness of neural networks? [106.03070538582222]
We show that neural network pruning can improve the empirical robustness of deep neural networks (NNs).
Our experiments show that by appropriately pruning an NN, its certified accuracy can be boosted up to 8.2% under standard training.
We additionally observe the existence of certified lottery tickets that can match both standard and certified robust accuracies of the original dense models.
arXiv Detail & Related papers (2022-06-15T05:48:51Z) - FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose a design paradigm for cost-effective network with LR representation for efficient pose estimation, named FasterPose.
We study the training behavior of FasterPose, and formulate a novel regressive cross-entropy (RCE) loss function for accelerating the convergence.
Compared with the previously dominant pose-estimation network, our method reduces FLOPs by 58% while gaining a 1.3% improvement in accuracy.
arXiv Detail & Related papers (2021-07-07T13:39:08Z) - Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains.
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
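To illustrate the growing-regularization idea in the simplest possible form (a toy version under our own assumptions, not the paper's algorithm), the L2 penalty coefficient on the weights scheduled for pruning rises over training, driving them toward zero before removal:

```python
# Toy growing-L2 penalty on weights marked for pruning; schedule and names are assumptions.
import torch

def growing_l2_penalty(weight, prune_mask, step, delta=1e-4, every=100):
    coeff = delta * (step // every + 1)          # penalty factor grows as training proceeds
    return coeff * (weight[prune_mask] ** 2).sum()

if __name__ == "__main__":
    w = torch.randn(32, 32, requires_grad=True)
    mask = w.abs() < w.abs().median()            # mark the smaller half for pruning
    opt = torch.optim.SGD([w], lr=0.1)
    for step in range(500):
        loss = (w.sum() - 1.0) ** 2 + growing_l2_penalty(w, mask, step)
        opt.zero_grad(); loss.backward(); opt.step()
    print(w[mask].abs().mean().item(), w[~mask].abs().mean().item())  # masked weights shrink
```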
arXiv Detail & Related papers (2020-12-16T20:16:28Z) - Parametric Flatten-T Swish: An Adaptive Non-linear Activation Function
For Deep Learning [0.0]
Rectified Linear Unit (ReLU) has been the most popular activation function across the deep learning community.
This paper introduces Parametric Flatten-T Swish (PFTS) as an alternative to ReLU.
PFTS manifested higher non-linear approximation power during training and thereby improved the predictive performance of the networks.
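A hedged sketch of what a parametric Flatten-T Swish can look like, assuming the commonly cited FTS form f(x) = x*sigmoid(x) + T for x >= 0 and f(x) = T otherwise, with the threshold T made a learnable per-layer parameter (the initial value below is illustrative):

```python
# Illustrative parametric Flatten-T Swish; the exact PFTS parameterization may differ.
import torch
import torch.nn as nn

class ParametricFTS(nn.Module):
    def __init__(self, init_t: float = -0.2):
        super().__init__()
        self.t = nn.Parameter(torch.tensor(init_t))   # learnable threshold T

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.where(x >= 0, x * torch.sigmoid(x) + self.t, self.t.expand_as(x))

if __name__ == "__main__":
    act = ParametricFTS()
    print(act(torch.linspace(-2.0, 2.0, 5)))
```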
arXiv Detail & Related papers (2020-11-06T01:50:46Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
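For orientation, one common extrapolation scheme of this kind (an extragradient-style step, sketched under our own assumptions rather than as the paper's exact algorithm) first takes a lookahead step and then updates the parameters with the gradient evaluated at the lookahead point:

```python
# Extragradient-style extrapolation step on a toy 1-D objective (illustrative only).
import torch

def extrapolated_sgd_step(w, grad_fn, lr=0.1, extrap_lr=0.05):
    w_lookahead = w - extrap_lr * grad_fn(w)     # extrapolation (lookahead) step
    return w - lr * grad_fn(w_lookahead)         # update with the lookahead gradient

if __name__ == "__main__":
    grad_fn = lambda w: 2 * (w - 3.0)            # gradient of (w - 3)^2
    w = torch.tensor(0.0)
    for _ in range(50):
        w = extrapolated_sgd_step(w, grad_fn)
    print(w)                                     # converges toward 3.0
```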
arXiv Detail & Related papers (2020-06-10T08:22:41Z)