CompactTag: Minimizing Computation Overheads in Actively-Secure MPC for Deep Neural Networks
- URL: http://arxiv.org/abs/2311.04406v1
- Date: Wed, 8 Nov 2023 00:18:08 GMT
- Title: CompactTag: Minimizing Computation Overheads in Actively-Secure MPC for Deep Neural Networks
- Authors: Yongqin Wang, Pratik Sarkar, Nishat Koti, Arpita Patra, Murali Annavaram,
- Abstract summary: We introduce CompactTag, a lightweight algorithm for generating MAC tags specifically tailored for linear layers in machine learning (ML) applications.
CompactTag speeds up this tag computation bottleneck by up to 23x, resulting in up to 1.47x total online phase runtime speedups for various ML workloads.
- Score: 16.39761637882153
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Secure Multiparty Computation (MPC) protocols enable secure evaluation of a circuit by several parties, even in the presence of an adversary who maliciously corrupts all but one of the parties. These MPC protocols are constructed using the well-known secret-sharing-based paradigm (SPDZ and SPDZ2k), where the protocols ensure security against a malicious adversary by computing Message Authentication Code (MAC) tags on the input shares and then evaluating the circuit with these input shares and tags. However, this tag computation adds a significant runtime overhead, particularly for machine learning (ML) applications with numerous linear computation layers such as convolutions and fully connected layers. To alleviate the tag computation overhead, we introduce CompactTag, a lightweight algorithm for generating MAC tags specifically tailored for linear layers in ML. Linear layer operations in ML, including convolutions, can be transformed into Toeplitz matrix multiplications. For the multiplication of two matrices with dimensions T1 x T2 and T2 x T3 respectively, SPDZ2k required O(T1 x T2 x T3) local multiplications for the tag computation. In contrast, CompactTag only requires O(T1 x T2 + T1 x T3 + T2 x T3) local multiplications, resulting in a substantial performance boost for various ML models. We empirically compared our protocol to the SPDZ2k protocol for various ML circuits, including ResNet Training-Inference, Transformer Training-Inference, and VGG16 Training-Inference. SPDZ2k dedicated around 30% of its online runtime for tag computation. CompactTag speeds up this tag computation bottleneck by up to 23x, resulting in up to 1.47x total online phase runtime speedups for various ML workloads.
Related papers
- Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask [73.5088336210358]
AMD performs parallel NAR inference within contiguous blocks of output labels concealed using attention masks.
A beam search algorithm is designed to leverage a dynamic fusion of CTC, AR Decoder, and AMD probabilities.
Experiments on the LibriSpeech-100hr corpus suggest the tripartite Decoder incorporating the AMD module produces a maximum decoding speed-up ratio of 1.73x.
arXiv Detail & Related papers (2024-06-14T13:42:38Z) - Efficient Transformer Encoders for Mask2Former-style models [57.54752243522298]
ECO-M2F is a strategy to self-select the number of hidden layers in the encoder conditioned on the input image.
The proposed approach reduces expected encoder computational cost while maintaining performance.
It is flexible in architecture configurations, and can be extended beyond the segmentation task to object detection.
arXiv Detail & Related papers (2024-04-23T17:26:34Z) - Extreme Compression of Large Language Models via Additive Quantization [59.3122859349777]
AQLM is first scheme that is optimal in terms of accuracy-vs-model-size when compressing to less than 3 bits per parameter.
We provide fast GPU and CPU implementations of AQLM for token generation.
arXiv Detail & Related papers (2024-01-11T18:54:44Z) - Secure and Efficient Two-party Quantum Scalar Product Protocol With
Application to Privacy-preserving Matrix Multiplication [2.770988618353868]
Two-party quantum scalar product (S2SP) is a promising research area within secure multiparty computation (SMC)
Existing quantum S2SP protocols are not efficient enough, and the complexity is usually close to exponential level.
In this paper, a novel secure two-party quantum scalar product (S2QSP) protocol based on Fourier states is proposed to achieve higher efficiency.
arXiv Detail & Related papers (2023-09-23T14:33:46Z) - AdaMTL: Adaptive Input-dependent Inference for Efficient Multi-Task
Learning [1.4963011898406864]
We introduce AdaMTL, an adaptive framework that learns task-aware inference policies for multi-task learning models.
AdaMTL reduces the computational complexity by 43% while improving the accuracy by 1.32% compared to single-task models.
When deployed on Vuzix M4000 smart glasses, AdaMTL reduces the inference latency and the energy consumption by up to 21.8% and 37.5%, respectively.
arXiv Detail & Related papers (2023-04-17T20:17:44Z) - MPC-Pipe: an Efficient Pipeline Scheme for Secure Multi-party Machine
Learning Inference [3.1853566662905943]
Multi-party computing (MPC) has been gaining popularity over the past years as a secure computing model.
MPC has fewer overheads than homomorphic encryption (HE) and has a more robust threat model than hardware-based trusted execution environments.
MPC protocols still pay substantial performance penalties compared to plaintext when applied to machine learning algorithms.
arXiv Detail & Related papers (2022-09-27T19:16:26Z) - Lightweight and Progressively-Scalable Networks for Semantic
Segmentation [100.63114424262234]
Multi-scale learning frameworks have been regarded as a capable class of models to boost semantic segmentation.
In this paper, we thoroughly analyze the design of convolutional blocks and the ways of interactions across multiple scales.
We devise Lightweight and Progressively-Scalable Networks (LPS-Net) that novelly expands the network complexity in a greedy manner.
arXiv Detail & Related papers (2022-07-27T16:00:28Z) - Block-Recurrent Transformers [49.07682696216708]
We introduce the Block-Recurrent Transformer, which applies a transformer layer in a recurrent fashion along a sequence.
Our recurrent cell operates on blocks of tokens rather than single tokens, and leverages parallel computation within a block in order to make efficient use of accelerator hardware.
arXiv Detail & Related papers (2022-03-11T23:44:33Z) - HD-cos Networks: Efficient Neural Architectures for Secure Multi-Party
Computation [26.67099154998755]
Multi-party computation (MPC) is a branch of cryptography where multiple non-colluding parties execute a protocol to securely compute a function.
We study training and inference of neural networks under the MPC setup.
We show that both of the approaches enjoy strong theoretical motivations and efficient computation under the MPC setup.
arXiv Detail & Related papers (2021-10-28T21:15:11Z) - Adam in Private: Secure and Fast Training of Deep Neural Networks with
Adaptive Moment Estimation [6.342794803074475]
We propose a framework that allows efficient evaluation of full-fledged state-of-the-art machine learning algorithms.
This is in contrast to most prior works, which substitute ML algorithms with approximated "MPC-friendly" variants.
We obtain secure training that outperforms state-of-the-art three-party systems.
arXiv Detail & Related papers (2021-06-04T01:40:09Z) - Taurus: A Data Plane Architecture for Per-Packet ML [59.1343317736213]
We present the design and implementation of Taurus, a data plane for line-rate inference.
Our evaluation of a Taurus switch ASIC shows that Taurus operates orders of magnitude faster than a server-based control plane.
arXiv Detail & Related papers (2020-02-12T09:18:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.