AESPA: Accuracy Preserving Low-degree Polynomial Activation for Fast Private Inference
- URL: http://arxiv.org/abs/2201.06699v1
- Date: Tue, 18 Jan 2022 02:02:02 GMT
- Title: AESPA: Accuracy Preserving Low-degree Polynomial Activation for Fast Private Inference
- Authors: Jaiyoung Park and Michael Jaemin Kim and Wonkyung Jung and Jung Ho Ahn
- Abstract summary: We propose an accuracy preserving low-degree activation function (AESPA) that exploits the Hermite expansion of the ReLU and basis-wise normalization.
When applied to the all-ReLU baseline on the state-of-the-art Delphi PI protocol, AESPA shows up to 42.1x and 28.3x lower online latency and communication cost, respectively.
- Score: 1.4878320574640147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A hybrid private inference (PI) protocol, which synergistically utilizes both
multi-party computation (MPC) and homomorphic encryption, is one of the most
prominent techniques for PI. However, even the state-of-the-art PI protocols
are bottlenecked by the non-linear layers, especially the activation functions.
Although a standard non-linear activation function can generate higher model
accuracy, it must be processed via a costly garbled-circuit MPC primitive. A
polynomial activation can instead be processed via Beaver's multiplication-triple MPC primitive, but has so far incurred severe accuracy drops.
In this paper, we propose an accuracy preserving low-degree polynomial
activation function (AESPA) that exploits the Hermite expansion of the ReLU and
basis-wise normalization. We apply AESPA to popular ML models, such as VGGNet,
ResNet, and pre-activation ResNet, to show inference accuracy comparable to that of the standard models with ReLU activation, achieving superior accuracy
over prior low-degree polynomial studies. When applied to the all-ReLU baseline on the state-of-the-art Delphi PI protocol, AESPA shows up to 42.1x and 28.3x lower online latency and communication cost, respectively.
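To make the proposed activation concrete, below is a minimal PyTorch sketch of a degree-2 Hermite approximation of ReLU with basis-wise normalization, in the spirit of the abstract. The coefficients are the standard orthonormal Hermite coefficients of ReLU under the Gaussian measure; the class name, the degree-2 truncation, and the use of BatchNorm2d as the per-basis normalizer are illustrative assumptions, not the paper's exact construction.

```python
import math
import torch
import torch.nn as nn

class HermiteReLU2(nn.Module):
    """Sketch of a degree-2 Hermite approximation of ReLU with
    basis-wise normalization (illustrative, not the authors' code)."""

    def __init__(self, num_channels: int):
        super().__init__()
        # Orthonormal Hermite coefficients of ReLU under the Gaussian
        # measure: c0 = 1/sqrt(2*pi), c1 = 1/2, c2 = 1/sqrt(4*pi).
        self.c0 = 1.0 / math.sqrt(2.0 * math.pi)
        self.c1 = 0.5
        self.c2 = 1.0 / math.sqrt(4.0 * math.pi)
        # One normalizer per non-constant Hermite basis ("basis-wise
        # normalization"); BatchNorm here is an assumption. The h0
        # basis is constant, so it only contributes a fixed offset.
        self.bn1 = nn.BatchNorm2d(num_channels)
        self.bn2 = nn.BatchNorm2d(num_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Orthonormal probabilists' Hermite bases:
        # h0(x) = 1, h1(x) = x, h2(x) = (x^2 - 1) / sqrt(2)
        h1 = x
        h2 = (x * x - 1.0) / math.sqrt(2.0)
        return self.c0 + self.c1 * self.bn1(h1) + self.c2 * self.bn2(h2)
```

Since the highest-degree basis is quadratic, each activation needs only a single secret-shared multiplication per element, which is what makes it compatible with the triple-based MPC primitive mentioned in the abstract.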
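For reference, here is a minimal, self-contained sketch of the Beaver-triple multiplication primitive that the abstract contrasts with garbled circuits, written over two-party additive secret shares. The modulus, the function names, and the two-party setting are illustrative assumptions, not Delphi's actual implementation.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus for the share ring

def share(v):
    """Split v into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return (r, (v - r) % P)

def beaver_mul(x_sh, y_sh, a_sh, b_sh, c_sh):
    """Multiply secret-shared x and y using a triple (a, b, c = a*b).

    Each *_sh argument is a pair (party0's share, party1's share).
    """
    # Each party locally masks its shares; d = x - a and e = y - b
    # are then reconstructed (revealed) between the parties.
    d = (x_sh[0] - a_sh[0] + x_sh[1] - a_sh[1]) % P
    e = (y_sh[0] - b_sh[0] + y_sh[1] - b_sh[1]) % P
    # z_i = c_i + d*b_i + e*a_i; exactly one party adds the public d*e,
    # so that z0 + z1 = c + d*b + e*a + d*e = x*y (mod P).
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % P
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % P
    return (z0, z1)

# Quick check that reconstruction gives x * y = 6 * 7 = 42.
a, b = secrets.randbelow(P), secrets.randbelow(P)
x_sh, y_sh = share(6), share(7)
a_sh, b_sh, c_sh = share(a), share(b), share(a * b % P)
z0, z1 = beaver_mul(x_sh, y_sh, a_sh, b_sh, c_sh)
assert (z0 + z1) % P == 42
```

Because a and b are uniformly random masks, revealing d and e leaks nothing about x and y; evaluating a degree-2 polynomial activation online therefore costs one triple per element instead of a garbled-circuit evaluation of ReLU.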
Related papers
- SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning [49.83621156017321]
SimBa is an architecture designed to scale up parameters in deep RL by injecting a simplicity bias.
By scaling up parameters with SimBa, the sample efficiency of various deep RL algorithms, including off-policy, on-policy, and unsupervised methods, is consistently improved.
arXiv Detail & Related papers (2024-10-13T07:20:53Z)
- AdaPI: Facilitating DNN Model Adaptivity for Efficient Private Inference in Edge Computing [20.11448308239082]
AdaPI is a novel approach that achieves adaptive PI by allowing a model to perform well across edge devices with diverse energy budgets.
AdaPI attains optimal accuracy for each energy budget, outperforming the state-of-the-art PI methods by 7.3% in test accuracy on CIFAR-100.
arXiv Detail & Related papers (2024-07-08T05:58:49Z)
- REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models.
In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL.
We find that REBEL provides a unified approach to language modeling and image generation with stronger or similar performance as PPO and DPO.
arXiv Detail & Related papers (2024-04-25T17:20:45Z)
- xMLP: Revolutionizing Private Inference with Exclusive Square Activation [27.092753578066294]
Private Inference (PI) enables deep neural networks (DNNs) to work on private data without leaking sensitive information.
The use of non-linear activations such as ReLU in DNNs can lead to impractically high PI latency.
We propose xMLP, a novel DNN architecture that uses square activations exclusively while maintaining parity in both accuracy and efficiency.
arXiv Detail & Related papers (2024-03-12T18:46:56Z)
- Optimized Layerwise Approximation for Efficient Private Inference on Fully Homomorphic Encryption [17.010625600442584]
This study proposes an optimized layerwise approximation (OLA) framework for privacy-preserving deep neural networks.
For efficient approximation, we reflect the layerwise accuracy by considering the actual input distribution of each activation function.
As a result, the OLA method reduces inference times for the ResNet-20 model and the ResNet-32 model by 3.02 times and 2.82 times, respectively.
arXiv Detail & Related papers (2023-10-16T12:34:47Z)
- Reliable Prediction Intervals with Directly Optimized Inductive Conformal Regression for Deep Learning [3.42658286826597]
Prediction intervals (PIs) are used to quantify the uncertainty of each prediction in deep learning regression.
Many approaches to improve the quality of PIs can effectively reduce the width of PIs, but they do not ensure that enough real labels are captured.
In this study, we use Directly Optimized Inductive Conformal Regression (DOICR) that takes only the average width of PIs as the loss function.
Benchmark experiments show that DOICR outperforms current state-of-the-art algorithms for regression problems.
arXiv Detail & Related papers (2023-02-02T04:46:14Z)
- Selective Network Linearization for Efficient Private Inference [49.937470642033155]
We propose a gradient-based algorithm that selectively linearizes ReLUs while maintaining prediction accuracy.
The results demonstrate up to $4.25\%$ more accuracy (iso-ReLU count at 50K) or $2.2\times$ less latency (iso-accuracy at 70%) than the current state of the art.
arXiv Detail & Related papers (2022-02-04T19:00:24Z)
- Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity [50.38337893712897]
We introduce the Effective Planning Window (EPW) condition, a structural condition on MDPs that makes no linearity assumptions.
We demonstrate that the EPW condition permits sample efficient RL, by providing an algorithm which provably solves MDPs satisfying this condition.
We additionally show the necessity of conditions like EPW, by demonstrating that simple MDPs with slight nonlinearities cannot be solved sample efficiently.
arXiv Detail & Related papers (2021-06-15T00:06:59Z)
- The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games [67.47961797770249]
Multi-Agent PPO (MAPPO) is a multi-agent PPO variant which adopts a centralized value function.
We show that MAPPO achieves performance comparable to the state-of-the-art in three popular multi-agent testbeds.
arXiv Detail & Related papers (2021-03-02T18:59:56Z)
- HiPPO: Recurrent Memory with Optimal Polynomial Projections [93.3537706398653]
We introduce a general framework (HiPPO) for the online compression of continuous signals and discrete time series by projection onto bases.
Given a measure that specifies the importance of each time step in the past, HiPPO produces an optimal solution to a natural online function approximation problem.
This formal framework yields a new memory update mechanism (HiPPO-LegS) that scales through time to remember all history, avoiding priors on the timescale.
arXiv Detail & Related papers (2020-08-17T23:39:33Z)