DeepShare: Sharing ReLU Across Channels and Layers for Efficient Private Inference
- URL: http://arxiv.org/abs/2512.17398v1
- Date: Fri, 19 Dec 2025 09:50:23 GMT
- Title: DeepShare: Sharing ReLU Across Channels and Layers for Efficient Private Inference
- Authors: Yonathan Bornfeld, Shai Avidan
- Abstract summary: Private Inference (PI) uses cryptographic primitives to perform privacy-preserving machine learning. A major computational bottleneck of PI is the calculation of the gate (i.e., ReLU). We propose a new activation module where the DReLU operation is only performed on a subset of the channels. We show that this formulation can drastically reduce the number of DReLU operations in ResNet-type networks.
- Score: 16.43629306994231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Private Inference (PI) uses cryptographic primitives to perform privacy-preserving machine learning. In this setting, the owner of the network runs inference on the client's data without learning anything about the data and without revealing any information about the model. It has been observed that a major computational bottleneck of PI is the calculation of the gate (i.e., ReLU), so considerable effort has been devoted to reducing the number of ReLUs in a given network. We focus on the DReLU, which is the non-linear step function of the ReLU, and show that one DReLU can serve many ReLU operations. We suggest a new activation module where the DReLU operation is only performed on a subset of the channels (prototype channels), while the rest of the channels (replicate channels) replicate the DReLU of each of their neurons from the corresponding neurons in one of the prototype channels. We then extend this idea to work across different layers. We show that this formulation can drastically reduce the number of DReLU operations in ResNet-type networks. Furthermore, our theoretical analysis shows that this new formulation can solve an extended version of the XOR problem using just one non-linearity and two neurons, something that traditional formulations and some PI-specific methods cannot achieve. We achieve new SOTA results on several classification setups and SOTA results on image segmentation.
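The channel-sharing idea in the abstract can be illustrated with a minimal plaintext sketch. It is not the authors' implementation: the function names, the `proto_of` assignment array, and the toy tensor are all hypothetical, and how the paper actually picks prototype channels is not stated in the abstract. The sketch only shows that ReLU(x) = x * DReLU(x), and that replicate channels can reuse the DReLU mask of a prototype channel so that fewer DReLU evaluations are needed.

```python
import numpy as np

def drelu(x):
    """Step function of the ReLU: 1 where x > 0, 0 elsewhere."""
    return (x > 0).astype(x.dtype)

def shared_relu(x, proto_of):
    """Apply a ReLU-like gate to x (shape C x H x W) while sharing DReLU masks.

    proto_of[c] is the (hypothetical) index of the prototype channel whose
    DReLU channel c reuses; a prototype channel points to itself.
    Returns the gated tensor and the number of DReLU evaluations.
    """
    proto_of = np.asarray(proto_of)
    prototypes = np.unique(proto_of)             # sorted prototype indices
    proto_mask = drelu(x[prototypes])            # DReLU on prototypes only
    row = np.searchsorted(prototypes, proto_of)  # channel -> row of proto_mask
    out = x * proto_mask[row]                    # x * DReLU(prototype channel)
    return out, proto_mask.size

# Toy input: 4 channels, 1 x 2 spatial size (assumed values).
x = np.array([[[1.0, -1.0]], [[-2.0, 3.0]], [[5.0, 5.0]], [[-4.0, -4.0]]])

# Channels 0 and 2 are prototypes; channels 1 and 3 replicate them.
out, n_drelu = shared_relu(x, [0, 0, 2, 2])

# Every channel as its own prototype reduces to the ordinary ReLU.
ref, n_full = shared_relu(x, [0, 1, 2, 3])
```

In this toy case the shared version evaluates 4 DReLUs instead of 8. Note that a replicate channel can pass negative values (channel 3 keeps -4 because its prototype is positive there), so the module is a different non-linearity from the standard ReLU rather than an approximation that is exact everywhere.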
Related papers
- PReLU: Yet Another Single-Layer Solution to the XOR Problem [0.0]
We show that a single-layer neural network using Parametric Rectified Linear Unit (PReLU) activation can solve the XOR problem.
Our results show that the single-layer PReLU network can achieve a 100% success rate in a wider range of learning rates.
arXiv Detail & Related papers (2024-09-17T01:28:40Z)
- ReLUs Are Sufficient for Learning Implicit Neural Representations [17.786058035763254]
We revisit the use of ReLU activation functions for learning implicit neural representations.
Inspired by second-order B-spline wavelets, we incorporate a set of simple constraints to the ReLU neurons in each layer of a deep neural network (DNN).
We demonstrate that, contrary to popular belief, one can learn state-of-the-art INRs based on a DNN composed of only ReLU neurons.
arXiv Detail & Related papers (2024-06-04T17:51:08Z)
- Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z)
- GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization [84.57695474130273]
Gate-based or importance-based pruning methods aim to remove channels whose importance is smallest.
GDP can be plugged before convolutional layers without bells and whistles, to control the on-and-off of each channel.
Experiments conducted on the CIFAR-10 and ImageNet datasets show that the proposed GDP achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-09-06T03:17:10Z)
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
arXiv Detail & Related papers (2021-08-02T08:21:44Z)
- Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z)
- Reducing ReLU Count for Privacy-Preserving CNN Speedup [25.86435513157795]
Privacy-Preserving Machine Learning algorithms must balance classification accuracy with data privacy.
CNNs typically consist of two types of operations: a convolutional or linear layer, followed by a non-linear function such as ReLU.
Recent research suggests that ReLU is responsible for most of the communication bandwidth.
We propose to share ReLU operations. Specifically, the ReLU decision of one activation can be used by others, and we explore different ways to determine the ReLU for such a group of activations.
arXiv Detail & Related papers (2021-01-28T06:49:31Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- Pooling Methods in Deep Neural Networks, a Review [6.1678491628787455]
The pooling layer is an important layer that performs down-sampling on the feature maps coming from the previous layer.
In this paper, we reviewed some of the famous and useful pooling methods.
arXiv Detail & Related papers (2020-09-16T06:11:40Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace conventional ReLU with Bounded ReLU and find that the decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding FPN networks, but have only 1/4 of the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)