Power-Based Attacks on Spatial DNN Accelerators
- URL: http://arxiv.org/abs/2108.12579v1
- Date: Sat, 28 Aug 2021 05:25:03 GMT
- Title: Power-Based Attacks on Spatial DNN Accelerators
- Authors: Ge Li, Mohit Tiwari, and Michael Orshansky
- Abstract summary: This paper investigates the vulnerability of realistic spatial accelerators using a general 8-bit number representation.
A novel template-based DPA with multiple profiling phases is able to fully break the 2D array with only 40K traces.
- Score: 11.536650557854324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the proliferation of DNN-based applications, the confidentiality of DNN models is an important commercial goal. Spatial accelerators, which parallelize matrix/vector operations, are used to improve the energy efficiency of DNN computation. Recently, model extraction attacks on simple accelerators, either with a single processing element or running a binarized network, were demonstrated using a methodology derived from the differential power analysis (DPA) attack on cryptographic devices. This paper investigates the vulnerability of realistic spatial accelerators using a general 8-bit number representation.
We investigate two systolic-array architectures with weight-stationary dataflow: (1) a 3 $\times$ 1 array for a dot-product operation, and (2) a 3 $\times$ 3 array for matrix-vector multiplication. Both are implemented on the SAKURA-G FPGA board. We show that both architectures are ultimately vulnerable. A conventional DPA succeeds fully on the 1D array, requiring 20K power measurements. In contrast, the 2D array exhibits higher security, resisting the attack even with 460K traces. We show that this is because the 2D array intrinsically entails multiple MACs that simultaneously depend on the same input. However, we find that a novel template-based DPA with multiple profiling phases is able to fully break the 2D array with only 40K traces. Corresponding countermeasures need to be investigated for spatial DNN accelerators.
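To make the attack methodology concrete, below is a minimal, hypothetical sketch of a correlation-style power attack against a single 8-bit weight-stationary MAC, assuming a Hamming-weight leakage model and simulated noisy traces. All names, constants, and the leakage model are illustrative assumptions, not the authors' code or measurement setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def hamming_weight(v):
    # Bit count of each element (leakage model: power ~ HW of the MAC result).
    return np.vectorize(lambda x: bin(x & 0xFFFF).count("1"))(v)

# --- Simulate a leaky 8-bit weight-stationary MAC (illustrative only) ---
secret_weight = 173                       # unknown 8-bit weight to recover
n_traces = 2000
inputs = rng.integers(0, 256, n_traces)   # known 8-bit activations
leakage = hamming_weight(inputs * secret_weight).astype(float)
traces = leakage + rng.normal(0, 2.0, n_traces)  # measured power + noise

# --- CPA: correlate hypothetical leakage for every weight guess ---
best_guess, best_corr = None, -1.0
for guess in range(256):
    hyp = hamming_weight(inputs * guess).astype(float)
    if hyp.std() == 0:
        continue  # constant hypothesis (e.g. guess 0) carries no signal
    corr = abs(np.corrcoef(hyp, traces)[0, 1])
    if corr > best_corr:
        best_guess, best_corr = guess, corr

print(best_guess == secret_weight)  # True with high probability
```

The paper's template-based attack instead profiles the real device in multiple phases rather than assuming an analytic leakage model, but the guess-and-correlate loop is the shared core; it also hints at why a 2D array, where many MACs leak simultaneously on the same input, dilutes the correlation for any single weight.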
Related papers
- TDPP: Two-Dimensional Permutation-Based Protection of Memristive Deep Neural Networks [17.126478919408132]
The non-volatility of memristive devices may expose the DNN weights stored in memristive crossbars to theft attacks.
This paper proposes a two-dimensional permutation-based protection (TDPP) method that thwarts such attacks (see the sketch below).
arXiv Detail & Related papers (2023-10-10T20:22:17Z)
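A minimal sketch of the general idea behind permutation-based protection: store the crossbar weights with secret row/column permutations so that raw stolen contents are useless without the keys. This is purely illustrative and not the paper's exact TDPP construction.

```python
import numpy as np

rng = np.random.default_rng(42)

def protect(w, row_key, col_key):
    # Store the weight matrix with secretly permuted rows and columns.
    return w[np.ix_(row_key, col_key)]

def recover(w_perm, row_key, col_key):
    # Invert both permutations to restore the original matrix.
    return w_perm[np.ix_(np.argsort(row_key), np.argsort(col_key))]

w = rng.normal(size=(4, 4))
row_key = rng.permutation(4)   # secret row permutation
col_key = rng.permutation(4)   # secret column permutation

stored = protect(w, row_key, col_key)   # what the crossbar actually holds
assert np.allclose(recover(stored, row_key, col_key), w)
```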
- One-Shot Online Testing of Deep Neural Networks Based on Distribution Shift Detection [0.6091702876917281]
We propose a one-shot testing approach that can test NNs accelerated on memristive crossbars with only one test vector.
Our approach can consistently achieve 100% fault coverage across several large topologies (see the sketch below).
arXiv Detail & Related papers (2023-05-16T11:06:09Z)
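A hypothetical sketch of what one-shot, distribution-shift-based testing can look like: the response to a single test vector is compared against a golden (fault-free) reference, and a shift beyond a threshold flags the device. The statistic and threshold below are assumptions, not the paper's method.

```python
import numpy as np

def one_shot_test(device_output, golden_output, tol=1e-2):
    # Flag a fault if the device's response to the single test vector
    # deviates from the golden (fault-free) response distribution.
    shift = np.abs(device_output.mean() - golden_output.mean())
    return shift > tol  # True => distribution shift => faulty device

rng = np.random.default_rng(1)
golden = rng.normal(0.0, 1.0, 256)      # reference layer activations
faulty = golden + 0.5                   # stuck-at faults shift the mean
print(one_shot_test(golden, golden))    # False (healthy device)
print(one_shot_test(faulty, golden))    # True  (fault detected)
```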
- UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks and efficiency in terms of parameters, compute cost, and inference speed.
The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features (see the sketch below).
Our evaluations on five benchmarks, Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
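A rough numpy sketch of paired spatial and channel attention over shared query/key projections, which is the general shape of an EPA-style block. The dimensions, the weight sharing, and the fusion by addition are assumptions for illustration, not the exact UNETR++ design.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def paired_attention(x, wq, wk, wv_sp, wv_ch):
    # x: (N tokens, C channels); Q/K projections are shared by both branches.
    q, k = x @ wq, x @ wk
    # Spatial branch: attention over the N token positions.
    sp = softmax(q @ k.T / np.sqrt(k.shape[1])) @ (x @ wv_sp)
    # Channel branch: attention over the C feature channels.
    ch = (x @ wv_ch) @ softmax(q.T @ k / np.sqrt(q.shape[0]))
    return sp + ch  # fuse the two discriminative feature maps

rng = np.random.default_rng(7)
N, C = 16, 32
x = rng.normal(size=(N, C))
wq, wk, wv_sp, wv_ch = (rng.normal(size=(C, C)) * 0.1 for _ in range(4))
print(paired_attention(x, wq, wk, wv_sp, wv_ch).shape)  # (16, 32)
```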
- An efficient and flexible inference system for serving heterogeneous ensembles of deep neural networks [0.0]
Ensembles of Deep Neural Networks (DNNs) achieve high-quality predictions, but they are compute- and memory-intensive.
We propose a new software layer to serve ensembles of DNNs with flexibility and efficiency (see the sketch below).
arXiv Detail & Related papers (2022-08-30T08:05:43Z)
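A minimal sketch of the generic pattern such a serving layer implements: fan one request out to heterogeneous models and aggregate their predictions. The EnsembleServer interface below is invented for illustration and is not the paper's API.

```python
import numpy as np

class EnsembleServer:
    """Toy serving layer: fan a request out to N models, average the outputs."""

    def __init__(self, models):
        self.models = models  # heterogeneous callables: input -> class scores

    def predict(self, x):
        # A real system would batch and schedule these calls per device.
        outputs = np.stack([m(x) for m in self.models])
        return outputs.mean(axis=0)  # simple averaging; voting also works

# Two "heterogeneous" stand-in models with different behavior.
model_a = lambda x: np.array([0.7, 0.2, 0.1])
model_b = lambda x: np.array([0.5, 0.4, 0.1])
server = EnsembleServer([model_a, model_b])
print(server.predict(None))  # [0.6 0.3 0.1]
```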
- Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition [62.83832841523525]
We propose a fast and accurate parallel transformer, termed Paraformer.
It accurately predicts the number of output tokens and extracts hidden variables.
It can attain performance comparable to the state-of-the-art AR transformer, with more than 10x speedup (see the sketch below).
arXiv Detail & Related papers (2022-06-16T17:24:14Z)
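A heavily simplified sketch of the non-autoregressive idea: predict the output length first, then decode every position in one parallel pass instead of token by token. The toy length predictor and pooling below stand in for Paraformer's actual predictor and hidden-variable extraction.

```python
import numpy as np

def predict_length(encoder_out):
    # Toy length predictor: a learned head would regress this in practice.
    return int(round(encoder_out.sum(axis=0).mean()))

def parallel_decode(encoder_out, n_tokens, vocab_proj):
    # Decode all n_tokens positions at once (no autoregressive loop):
    # pool encoder states into n_tokens "hidden variables", then project.
    idx = np.linspace(0, len(encoder_out) - 1, n_tokens).astype(int)
    hidden = encoder_out[idx]          # (n_tokens, d_model)
    logits = hidden @ vocab_proj       # (n_tokens, vocab)
    return logits.argmax(axis=1)       # one pass over all positions

rng = np.random.default_rng(3)
enc = rng.normal(size=(40, 8))         # 40 encoder frames, d_model = 8
proj = rng.normal(size=(8, 100))       # vocabulary of 100 tokens
n = max(1, abs(predict_length(enc)))   # keep the toy length positive
print(parallel_decode(enc, n, proj))
```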
- Density Matrix Renormalization Group with Tensor Processing Units [0.0]
Google's Tensor Processing Units (TPUs) are integrated circuits specifically built to accelerate and scale up machine learning workloads.
In this work we demonstrate the use of TPUs for accelerating and scaling up the density matrix renormalization group (DMRG), a powerful numerical approach to compute the ground state of a local quantum many-body Hamiltonian.
arXiv Detail & Related papers (2022-04-12T10:40:14Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks (see the sketch below).
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
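A small sketch of one way a {-1, +1} decomposition can work: an odd integer weight is written as a sum of power-of-two-scaled {-1, +1} matrices, so each branch is a binary network. The identity used below is standard and not necessarily the paper's exact encoding.

```python
import numpy as np

def decompose(w, m):
    # Any odd integer v in [-(2^m - 1), 2^m - 1] satisfies
    #   v = sum_i 2^i * e_i  with  e_i in {-1, +1},
    # via u = (v + 2^m - 1) / 2 and e_i = 2 * bit_i(u) - 1.
    u = (w + (1 << m) - 1) // 2
    return [2 * ((u >> i) & 1) - 1 for i in range(m)]  # m binary branches

def reconstruct(branches):
    return sum((1 << i) * e for i, e in enumerate(branches))

rng = np.random.default_rng(5)
m = 4                                           # 4 binary branches
w = 2 * rng.integers(-(1 << (m - 1)), 1 << (m - 1), size=(3, 3)) + 1
branches = decompose(w, m)                      # each entry is -1 or +1
assert (reconstruct(branches) == w).all()

# Inference: x @ w collapses to a weighted sum of binary matmuls, each
# of which maps to cheap xnor/popcount hardware.
x = rng.integers(0, 4, size=(2, 3))
y = sum((1 << i) * (x @ e) for i, e in enumerate(branches))
assert (y == x @ w).all()
```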
- FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator [33.19099033687952]
FORMS is a fine-grained ReRAM-based DNN accelerator with polarized weights (see the sketch below).
It achieves significant throughput improvement and speedup in frames per second over ISAAC at similar area cost.
arXiv Detail & Related papers (2021-06-16T21:42:08Z)
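A minimal sketch of why weight polarity matters on ReRAM: conductances are non-negative, so a signed matrix is conventionally split into positive and negative arrays whose currents are subtracted. The split below is that textbook mapping; FORMS's actual contribution, enforcing polarization at fine granularity, is not reproduced here.

```python
import numpy as np

def polarize(w):
    # ReRAM conductances cannot be negative: split signed weights into
    # two non-negative arrays mapped onto separate crossbar columns.
    w_pos = np.maximum(w, 0.0)
    w_neg = np.maximum(-w, 0.0)
    return w_pos, w_neg

def crossbar_matvec(x, w_pos, w_neg):
    # Analog MAC: output current = positive-array current minus
    # negative-array current (modeled here in floating point).
    return x @ w_pos - x @ w_neg

rng = np.random.default_rng(9)
w = rng.normal(size=(4, 3))
x = rng.normal(size=(2, 4))
w_pos, w_neg = polarize(w)
assert np.allclose(crossbar_matvec(x, w_pos, w_neg), x @ w)
```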
- VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average a 3712$\times$ speedup with a 1301.25$\times$ energy reduction over CPU, and a 35.4$\times$ speedup with a 17.66$\times$ energy reduction over GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z)
- Binary DAD-Net: Binarized Driveable Area Detection Network for Autonomous Driving [94.40107679615618]
This paper proposes a novel binarized driveable area detection network (binary DAD-Net).
It uses only binary weights and activations in the encoder, the bottleneck, and the decoder.
It outperforms state-of-the-art semantic segmentation networks on public datasets (see the sketch below).
arXiv Detail & Related papers (2020-06-15T07:09:01Z)
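A minimal sketch of the standard sign binarization with a per-tensor scale that binary networks of this kind build on (XNOR-Net-style); DAD-Net's training procedure and architecture are not reproduced here.

```python
import numpy as np

def binarize(t):
    # Sign binarization with a per-tensor scale: t ~ alpha * sign(t).
    alpha = np.abs(t).mean()
    return alpha, np.where(t >= 0, 1.0, -1.0)

rng = np.random.default_rng(11)
w = rng.normal(size=(8, 8))      # real-valued conv/fc weights
a = rng.normal(size=(4, 8))      # real-valued activations

aw, bw = binarize(w)
aa, ba = binarize(a)
# Binary forward pass: two scalars rescale the +/-1 matmul, which
# hardware can realize with xnor + popcount instead of multiplies.
y_bin = (aa * aw) * (ba @ bw)
y_ref = a @ w
print(np.corrcoef(y_bin.ravel(), y_ref.ravel())[0, 1])  # crude fidelity check
```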
- SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation [97.78417228445883]
We present SmartExchange, an algorithm-hardware co-design framework for energy-efficient inference of deep neural networks (DNNs).
We develop a novel algorithm to enforce a specially favorable DNN weight structure, where each layerwise weight matrix can be stored as the product of a small basis matrix and a large sparse coefficient matrix whose non-zero elements are all powers of two (see the sketch below).
We further design a dedicated accelerator that fully utilizes the SmartExchange-enforced weights to improve both energy efficiency and latency.
arXiv Detail & Related papers (2020-05-07T12:12:49Z)
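A small sketch of the storage/compute trade described above: keep a tiny dense basis B and a large sparse coefficient matrix C whose non-zeros are powers of two, and rebuild the layer weights W = C @ B on the fly, replacing multiplies by shifts. The shapes and the factorization direction are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(13)

# Large sparse coefficient matrix: non-zeros are powers of two (or zero),
# so multiplying by an entry is a bit shift rather than a full multiply.
shifts = rng.integers(-2, 3, size=(64, 4))     # exponents in [-2, 2]
mask = rng.random((64, 4)) < 0.25              # ~25% non-zero entries
C = np.where(mask, 2.0 ** shifts, 0.0)         # (64, 4), cheap to store
B = rng.normal(size=(4, 16))                   # small dense basis

W = C @ B                                      # rebuilt (64, 16) layer weights

# Storage trade: sparse C codes plus small B versus the dense W.
dense_params = W.size
smart_params = int(mask.sum()) + B.size
print(f"dense {dense_params} vs smartexchange ~{smart_params} stored values")

# Compute trade: x @ C needs only shifts/adds, since C's entries are
# powers of two or zero; the dense result is recovered exactly.
x = rng.normal(size=(1, 64))
assert np.allclose(x @ W, (x @ C) @ B)
```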
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and accepts no responsibility for any consequences of its use.