Efficient ML Models for Practical Secure Inference
- URL: http://arxiv.org/abs/2209.00411v1
- Date: Fri, 26 Aug 2022 09:42:21 GMT
- Title: Efficient ML Models for Practical Secure Inference
- Authors: Vinod Ganesan, Anwesh Bhattacharya, Pratyush Kumar, Divya Gupta, Rahul
Sharma, Nishanth Chandran
- Abstract summary: Cryptographic primitives allow inference without revealing users' inputs to a model provider or the model's weights to a user.
Secure inference is in principle feasible for this setting, but no existing techniques make it practical at scale.
We show that the primary bottlenecks in secure inference are large linear layers, which can be optimized with the choice of network backbone.
- Score: 9.536081713410812
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: ML-as-a-service continues to grow, and so does the need for very strong
privacy guarantees. Secure inference has emerged as a potential solution,
wherein cryptographic primitives allow inference without revealing users'
inputs to a model provider or the model's weights to a user. For instance, the
model provider could be a diagnostics company that has trained a
state-of-the-art DenseNet-121 model for interpreting a chest X-ray and the user
could be a patient at a hospital. While secure inference is in principle
feasible for this setting, there are no existing techniques that make it
practical at scale. The CrypTFlow2 framework provides a potential solution with
its ability to automatically and correctly translate clear-text inference to
secure inference for arbitrary models. However, the resultant secure inference
from CrypTFlow2 is impractically expensive: Almost 3TB of communication is
required to interpret a single X-ray on DenseNet-121. In this paper, we address
this outstanding challenge of inefficiency of secure inference with three
contributions. First, we show that the primary bottlenecks in secure inference
are large linear layers which can be optimized with the choice of network
backbone and the use of operators developed for efficient clear-text inference.
This finding and emphasis deviate from many recent works, which focus on
optimizing non-linear activation layers when performing secure inference of
smaller networks. Second, based on an analysis of a bottle-necked convolution
layer, we design an X-operator which is a more efficient drop-in replacement.
Third, we show that the fast Winograd convolution algorithm further improves
efficiency of secure inference. In combination, these three optimizations prove
to be highly effective for the problem of X-ray interpretation trained on the
CheXpert dataset.
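As background for why each multiplication is costly in this setting: in two-party secure inference, secret-shared multiplication can be realized with Beaver triples, where every multiplication requires opening two masked values online (CrypTFlow2 itself uses OT/HE-based protocols, but the communication-per-multiplication pattern is analogous). The toy sketch below is a single-process simulation of that pattern, not the paper's protocol; the modulus choice is illustrative.

```python
import secrets

P = 2**61 - 1  # field modulus (illustrative choice)

def share(x):
    """Split x into two additive shares mod P."""
    r = secrets.randbelow(P)
    return r, (x - r) % P

def reconstruct(s0, s1):
    return (s0 + s1) % P

def beaver_mul(x_sh, y_sh):
    """Multiply secret-shared values with a Beaver triple.
    Opening d and e is the per-multiplication communication that
    makes large linear layers the bottleneck in secure inference."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    c = (a * b) % P
    a_sh, b_sh, c_sh = share(a), share(b), share(c)  # offline phase

    # Online: the parties open d = x - a and e = y - b (masked values).
    d = reconstruct((x_sh[0] - a_sh[0]) % P, (x_sh[1] - a_sh[1]) % P)
    e = reconstruct((y_sh[0] - b_sh[0]) % P, (y_sh[1] - b_sh[1]) % P)

    # z = xy = c + d*b + e*a + d*e, split across the two parties.
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % P
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % P
    return z0, z1

# The user's input value and the provider's weight never meet in the clear.
x_sh, y_sh = share(123), share(456)
assert reconstruct(*beaver_mul(x_sh, y_sh)) == (123 * 456) % P
```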
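To make the first contribution concrete: communication for secure evaluation of linear layers grows roughly with the number of multiplications those layers perform, so a back-of-envelope multiplication count already shows why backbone and operator choice matter. The sketch below is illustrative only; the layer shapes are hypothetical stand-ins for a DenseNet-121-style stage, not the paper's measured cost model.

```python
def conv_mults(h, w, c_in, c_out, k, stride=1):
    """Multiplications in a k x k convolution over an h x w feature map
    (same padding). Secure-inference communication for a linear layer
    scales roughly with this count, so it is a first-order proxy."""
    return (h // stride) * (w // stride) * c_in * c_out * k * k

# Hypothetical 56x56, 128-channel stage (illustrative numbers only).
standard = conv_mults(56, 56, 128, 128, 3)          # plain 3x3 conv
bottleneck = (conv_mults(56, 56, 128, 32, 1)        # 1x1 channel reduce
              + conv_mults(56, 56, 32, 32, 3)       # 3x3 on narrow channels
              + conv_mults(56, 56, 32, 128, 1))     # 1x1 channel expand

print(f"standard 3x3 : {standard / 1e6:.0f}M multiplications")
print(f"bottlenecked : {bottleneck / 1e6:.0f}M multiplications")
# ~462M vs ~55M: roughly an 8x reduction in the dominant linear-layer cost.
```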
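The Winograd saving from the third contribution is easiest to see in the smallest case, F(2,3): two outputs of a 3-tap 1D convolution are computed with 4 element-wise multiplications instead of the naive 6, using the standard transforms from Lavin and Gray (2016). A minimal NumPy sketch (not the paper's implementation) is:

```python
import numpy as np

# Winograd F(2,3) transforms (Lavin & Gray, 2016).
B_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two outputs of a valid 1D correlation of a length-4 input tile d
    with a length-3 filter g, using 4 multiplications."""
    m = (G @ g) * (B_T @ d)   # 4 element-wise multiplications
    return A_T @ m            # inverse transform -> 2 outputs

# Sanity check against the direct sliding-window computation.
d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, -1.0, 2.0])
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
assert np.allclose(winograd_f23(d, g), direct)
```

In secret-sharing-based secure inference, additions and multiplications by public constants are essentially free, so the transforms cost little while each saved element-wise multiplication saves communication; that is why fewer multiplications translates into cheaper secure inference.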
Related papers
- TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference [0.0]
Large language models (LLMs) have proven to be very capable, but access to the best models currently relies on inference providers, which introduces trust challenges.
We propose TOPLOC, a novel method for verifiable inference that addresses this problem.
arXiv Detail & Related papers (2025-01-27T12:46:45Z) - OTAD: An Optimal Transport-Induced Robust Model for Agnostic Adversarial Attack [7.824226954174748]
Deep neural networks (DNNs) are vulnerable to small adversarial perturbations of the inputs.
We present OTAD, a novel two-step Optimal Transport-induced adversarial defense model.
OTAD can fit the training data accurately while preserving the local Lipschitz continuity.
arXiv Detail & Related papers (2024-08-01T07:04:18Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode for neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - VeriFlow: Modeling Distributions for Neural Network Verification [4.3012765978447565]
Formal verification has emerged as a promising method to ensure the safety and reliability of neural networks.
We propose the VeriFlow architecture as a flow-based density model tailored to allow any verification approach to restrict its search to some data distribution of interest.
arXiv Detail & Related papers (2024-06-20T12:41:39Z) - SVNet: Where SO(3) Equivariance Meets Binarization on Point Cloud
Representation [65.4396959244269]
The paper tackles this challenge by designing a general framework to construct 3D learning architectures.
The proposed approach can be applied to general backbones like PointNet and DGCNN.
Experiments on ModelNet40, ShapeNet, and the real-world dataset ScanObjectNN demonstrate that the method achieves a great trade-off between efficiency, rotation robustness, and accuracy.
arXiv Detail & Related papers (2022-09-13T12:12:19Z) - Robust Training and Verification of Implicit Neural Networks: A
Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z) - Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness [172.61581010141978]
Certifiable robustness is a desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios.
We propose a novel solution that strategically manipulates neurons by "grafting" appropriate levels of linearity.
arXiv Detail & Related papers (2022-06-15T22:42:29Z) - Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
arXiv Detail & Related papers (2022-05-02T16:09:17Z) - Scalable Uncertainty Quantification for Deep Operator Networks using
Randomized Priors [14.169588600819546]
We present a simple and effective approach for posterior uncertainty quantification in deep operator networks (DeepONets).
We adopt a frequentist approach based on randomized prior ensembles, and put forth an efficient vectorized implementation for fast parallel inference on accelerated hardware.
arXiv Detail & Related papers (2022-03-06T20:48:16Z) - SOTERIA: In Search of Efficient Neural Networks for Private Inference [15.731520890265545]
ML-as-a-service is gaining popularity, wherein a cloud server hosts a trained model and offers prediction (inference) services to users.
In this setting, our objective is to protect the confidentiality of both the users' input queries and the model parameters at the server.
We propose SOTERIA, a training method to construct model architectures that are by-design efficient for private inference.
arXiv Detail & Related papers (2020-07-25T13:53:02Z) - Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by
Enabling Input-Adaptive Inference [119.19779637025444]
Deep networks have recently been suggested to face a trade-off between accuracy (on clean natural images) and robustness (on adversarially perturbed images).
This paper studies multi-exit networks with input-adaptive inference, showing their strong promise in achieving a "sweet point" in co-optimizing model accuracy, robustness, and efficiency.
arXiv Detail & Related papers (2020-02-24T00:40:22Z)