Learning to Linearize Deep Neural Networks for Secure and Efficient
Private Inference
- URL: http://arxiv.org/abs/2301.09254v1
- Date: Mon, 23 Jan 2023 03:33:38 GMT
- Title: Learning to Linearize Deep Neural Networks for Secure and Efficient
Private Inference
- Authors: Souvik Kundu, Shunlin Lu, Yuke Zhang, Jacqueline Liu, Peter A. Beerel
- Abstract summary: Existing techniques to reduce ReLU operations often involve manual effort and sacrifice accuracy.
We first present a novel measure of a non-linearity layer's ReLU sensitivity, which removes the need for time-consuming manual effort in identifying sensitive layers.
We then present SENet, a three-stage training method that automatically assigns per-layer ReLU counts, decides the ReLU locations for each layer's activation map, and trains a model with significantly fewer ReLUs.
- Score: 5.293553970082942
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The large number of ReLU non-linearity operations in existing deep neural
networks makes them ill-suited for latency-efficient private inference (PI).
Existing techniques to reduce ReLU operations often involve manual effort and
sacrifice significant accuracy. In this paper, we first present a novel measure
of a non-linearity layer's ReLU sensitivity, which removes the need for the
time-consuming manual effort of identifying sensitive layers. Based on this
sensitivity, we then present SENet, a three-stage training method that, for a
given ReLU budget, automatically assigns per-layer ReLU counts, decides the
ReLU locations for each layer's activation map, and trains a model with
significantly fewer ReLUs to potentially yield latency- and communication-efficient
PI. Experimental evaluations with multiple models on various datasets
show SENet's superior performance both in terms of reduced ReLUs and improved
classification accuracy compared to existing alternatives. In particular, SENet
can yield models that require up to ~2x fewer ReLUs while maintaining similar
accuracy. For a similar ReLU budget, SENet can yield models with ~2.32% improved
classification accuracy, evaluated on CIFAR-100.
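To make the budgeted-ReLU idea concrete, here is a minimal sketch (not the authors' released code; all names, shapes, and numbers are illustrative): a partial-ReLU layer whose fixed binary mask decides, per activation-map location, whether to apply ReLU or pass the value through linearly, plus a toy allocation of a global ReLU budget proportional to hypothetical per-layer sensitivity scores.

```python
# Hypothetical sketch, not SENet's actual implementation.
import torch
import torch.nn as nn

class PartialReLU(nn.Module):
    """Applies ReLU only where `mask` is True; elsewhere the layer acts as identity."""
    def __init__(self, mask: torch.Tensor):
        super().__init__()
        # mask: boolean tensor broadcastable to the activation map
        self.register_buffer("mask", mask.bool())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.where(self.mask, torch.relu(x), x)

def allocate_relu_budget(sensitivities, total_budget):
    """Toy per-layer ReLU-count allocation proportional to sensitivity scores."""
    total = sum(sensitivities)
    return [int(total_budget * s / total) for s in sensitivities]

if __name__ == "__main__":
    x = torch.randn(1, 4, 8, 8)
    mask = torch.rand(4, 8, 8) < 0.5           # keep ReLU at roughly half the locations
    print(PartialReLU(mask)(x).shape)          # torch.Size([1, 4, 8, 8])
    print(allocate_relu_budget([0.5, 0.3, 0.2], total_budget=1000))
```

In an actual pipeline the mask and the per-layer counts would be assigned by the three-stage procedure described above rather than drawn at random.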
Related papers
- Linearizing Models for Efficient yet Robust Private Inference [8.547509873799994]
This paper presents RLNet, a class of robust linearized networks that can yield latency improvement via reduction of high-latency ReLU operations.
We show that RLNet can yield models with up to 11.14x fewer ReLUs, with accuracy close to the all-ReLU models, on clean, naturally perturbed, and gradient-based perturbed images.
arXiv Detail & Related papers (2024-02-08T10:01:29Z) - Fixing the NTK: From Neural Network Linearizations to Exact Convex
Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, the induced masking kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
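For reference, a gated ReLU network in the usual formulation (our notation, assumed rather than taken from the paper) keeps the gate vectors $h_j$ fixed and trains only the weights $w_j$:

$f(x) = \sum_{j=1}^{m} \mathbb{1}\{h_j^{\top} x > 0\}\, x^{\top} w_j,$

which is linear in the trainable parameters; the masking kernel discussed above is the kernel induced by these fixed gates.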
arXiv Detail & Related papers (2023-09-26T17:42:52Z) - Accelerating Deep Neural Networks via Semi-Structured Activation
Sparsity [0.0]
Exploiting sparsity in the network's feature maps is one of the ways to reduce its inference latency.
We propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications.
Our approach yields a speed improvement of $1.25\times$ with a minimal accuracy drop of $1.1\%$ for the ResNet18 model on the ImageNet dataset.
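As an illustration of what semi-structured (N:M) activation sparsity looks like, here is a small sketch under our own assumptions (not the paper's method or code): the N largest-magnitude values are kept in every group of M consecutive channels, and the rest are zeroed.

```python
# Illustrative N:M activation-sparsity mask; function name and the 2:4 pattern are assumptions.
import torch

def nm_sparsify(x: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Keep the n largest-magnitude activations in each group of m consecutive channels."""
    b, c, h, w = x.shape
    assert c % m == 0, "channel count must be divisible by m"
    groups = x.permute(0, 2, 3, 1).reshape(-1, m)          # rows of m channel values
    keep = groups.abs().topk(n, dim=1).indices
    mask = torch.zeros_like(groups).scatter_(1, keep, 1.0)
    return (groups * mask).reshape(b, h, w, c).permute(0, 3, 1, 2)

if __name__ == "__main__":
    x = torch.randn(1, 8, 4, 4)
    y = nm_sparsify(x)                  # 2:4 pattern
    print((y != 0).float().mean())      # ~0.5 of the activations remain nonzero
```

The regular group structure is what makes this kind of sparsity exploitable with minor runtime modifications, as opposed to fully unstructured sparsity.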
arXiv Detail & Related papers (2023-09-12T22:28:53Z) - Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity
and Depth for Latency-Efficient Private Inference [6.141267142478346]
We present a model optimization method that allows a model to learn to be shallow.
We leverage the ReLU sensitivity of a convolutional block to remove a ReLU layer and merge its succeeding and preceding convolution layers to a shallow block.
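The identity that makes the merge possible is simple: with the intermediate ReLU removed, two stacked convolutions compose into a single linear map. A minimal sketch for the 1x1 case (our simplification, not the paper's procedure):

```python
# Merging two 1x1 convolutions once the ReLU between them is dropped (illustrative only).
import torch
import torch.nn.functional as F

def merge_1x1_convs(w1, b1, w2, b2):
    # w1: (C_mid, C_in, 1, 1), w2: (C_out, C_mid, 1, 1)
    m1 = w1.squeeze(-1).squeeze(-1)               # (C_mid, C_in)
    m2 = w2.squeeze(-1).squeeze(-1)               # (C_out, C_mid)
    w = (m2 @ m1).unsqueeze(-1).unsqueeze(-1)     # (C_out, C_in, 1, 1)
    b = m2 @ b1 + b2
    return w, b

if __name__ == "__main__":
    x = torch.randn(1, 3, 5, 5)
    w1, b1 = torch.randn(4, 3, 1, 1), torch.randn(4)
    w2, b2 = torch.randn(2, 4, 1, 1), torch.randn(2)
    y_two = F.conv2d(F.conv2d(x, w1, b1), w2, b2)   # no ReLU in between
    wm, bm = merge_1x1_convs(w1, b1, w2, b2)
    print(torch.allclose(F.conv2d(x, wm, bm), y_two, atol=1e-5))   # True
```

For general kernel sizes the merged kernel is the convolution of the two kernels, but the 1x1 case already shows why removing a low-sensitivity ReLU lets a block be made shallower.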
arXiv Detail & Related papers (2023-04-26T04:23:34Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to the magnitude scale.
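As we read the summary, the key difference from hard pruning is that low-magnitude weights are shrunk gradually rather than zeroed outright, so they can recover later. A rough sketch of one such shrinkage step (our interpretation, not the authors' code):

```python
# Soft-shrinkage step: shrink, rather than zero, the smallest-magnitude weights (illustrative).
import torch

def soft_shrink_step(weight: torch.Tensor, prune_ratio: float, shrink: float = 0.02):
    flat = weight.abs().flatten()
    k = int(prune_ratio * flat.numel())
    if k == 0:
        return weight
    threshold = flat.kthvalue(k).values
    unimportant = weight.abs() <= threshold
    # shrink the unimportant weights by an amount proportional to their own magnitude
    return torch.where(unimportant, weight * (1.0 - shrink), weight)

if __name__ == "__main__":
    w = torch.randn(64, 64)
    for _ in range(100):                       # e.g. applied once per training iteration
        w = soft_shrink_step(w, prune_ratio=0.5, shrink=0.02)
    print(w.abs().median())                    # shrunken but still nonzero magnitudes
```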
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - A Hybrid Deep Learning Model-based Remaining Useful Life Estimation for
Reed Relay with Degradation Pattern Clustering [12.631122036403864]
The reed relay serves as a fundamental component of functional testing, which is closely tied to successful quality inspection of electronics.
To provide accurate remaining useful life (RUL) estimation for reed relay, a hybrid deep learning network with degradation pattern clustering is proposed.
arXiv Detail & Related papers (2022-09-14T05:45:46Z) - Can pruning improve certified robustness of neural networks? [106.03070538582222]
We show that neural network pruning can improve the empirical robustness of deep neural networks (NNs).
Our experiments show that by appropriately pruning an NN, its certified accuracy can be boosted up to 8.2% under standard training.
We additionally observe the existence of certified lottery tickets that can match both standard and certified robust accuracies of the original dense models.
arXiv Detail & Related papers (2022-06-15T05:48:51Z) - FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose a design paradigm for cost-effective network with LR representation for efficient pose estimation, named FasterPose.
We study the training behavior of FasterPose, and formulate a novel regressive cross-entropy (RCE) loss function for accelerating the convergence.
Compared with the previously dominant pose-estimation network, our method reduces FLOPs by 58% while gaining a 1.3% improvement in accuracy.
arXiv Detail & Related papers (2021-07-07T13:39:08Z) - Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains.
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
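To illustrate the growing-regularization idea in the simplest possible form (a toy version under our own assumptions, not the paper's algorithm), the L2 penalty coefficient on the weights scheduled for pruning rises over training, driving them toward zero before removal:

```python
# Toy growing-L2 penalty on weights marked for pruning; schedule and names are assumptions.
import torch

def growing_l2_penalty(weight, prune_mask, step, delta=1e-4, every=100):
    coeff = delta * (step // every + 1)          # penalty factor grows as training proceeds
    return coeff * (weight[prune_mask] ** 2).sum()

if __name__ == "__main__":
    w = torch.randn(32, 32, requires_grad=True)
    mask = w.abs() < w.abs().median()            # mark the smaller half for pruning
    opt = torch.optim.SGD([w], lr=0.1)
    for step in range(500):
        loss = (w.sum() - 1.0) ** 2 + growing_l2_penalty(w, mask, step)
        opt.zero_grad(); loss.backward(); opt.step()
    print(w[mask].abs().mean().item(), w[~mask].abs().mean().item())  # masked weights shrink
```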
arXiv Detail & Related papers (2020-12-16T20:16:28Z) - Parametric Flatten-T Swish: An Adaptive Non-linear Activation Function
For Deep Learning [0.0]
Rectified Linear Unit (ReLU) has been the most popular activation function across the deep learning community.
This paper introduces Parametric Flatten-T Swish (PFTS) as an alternative to ReLU.
PFTS manifested higher non-linear approximation power during training and thereby improved the predictive performance of the networks.
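A hedged sketch of what a parametric Flatten-T Swish can look like, assuming the commonly cited FTS form f(x) = x*sigmoid(x) + T for x >= 0 and f(x) = T otherwise, with the threshold T made a learnable per-layer parameter (the initial value below is illustrative):

```python
# Illustrative parametric Flatten-T Swish; the exact PFTS parameterization may differ.
import torch
import torch.nn as nn

class ParametricFTS(nn.Module):
    def __init__(self, init_t: float = -0.2):
        super().__init__()
        self.t = nn.Parameter(torch.tensor(init_t))   # learnable threshold T

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.where(x >= 0, x * torch.sigmoid(x) + self.t, self.t.expand_as(x))

if __name__ == "__main__":
    act = ParametricFTS()
    print(act(torch.linspace(-2.0, 2.0, 5)))
```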
arXiv Detail & Related papers (2020-11-06T01:50:46Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
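For orientation, one common extrapolation scheme of this kind (an extragradient-style step, sketched under our own assumptions rather than as the paper's exact algorithm) first takes a lookahead step and then updates the parameters with the gradient evaluated at the lookahead point:

```python
# Extragradient-style extrapolation step on a toy 1-D objective (illustrative only).
import torch

def extrapolated_sgd_step(w, grad_fn, lr=0.1, extrap_lr=0.05):
    w_lookahead = w - extrap_lr * grad_fn(w)     # extrapolation (lookahead) step
    return w - lr * grad_fn(w_lookahead)         # update with the lookahead gradient

if __name__ == "__main__":
    grad_fn = lambda w: 2 * (w - 3.0)            # gradient of (w - 3)^2
    w = torch.tensor(0.0)
    for _ in range(50):
        w = extrapolated_sgd_step(w, grad_fn)
    print(w)                                     # converges toward 3.0
```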
arXiv Detail & Related papers (2020-06-10T08:22:41Z)