Power-Based Attacks on Spatial DNN Accelerators
- URL: http://arxiv.org/abs/2108.12579v1
- Date: Sat, 28 Aug 2021 05:25:03 GMT
- Title: Power-Based Attacks on Spatial DNN Accelerators
- Authors: Ge Li, Mohit Tiwari, and Michael Orshansky
- Abstract summary: This paper investigates the vulnerability of realistic spatial accelerators using a general 8-bit number representation.
A novel template-based DPA with multiple profiling phases is able to fully break the 2D array with only 40K traces.
- Score: 11.536650557854324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the proliferation of DNN-based applications, the confidentiality of DNN models is an important commercial goal. Spatial accelerators, which parallelize matrix/vector operations, are used to improve the energy efficiency of DNN computation. Recently, model extraction attacks on simple accelerators, either with a single processing element or running a binarized network, were demonstrated using a methodology derived from the differential power analysis (DPA) attack on cryptographic devices. This paper investigates the vulnerability of realistic spatial accelerators using a general 8-bit number representation.
We investigate two systolic-array architectures with weight-stationary dataflow: (1) a 3 $\times$ 1 array for a dot-product operation, and (2) a 3 $\times$ 3 array for matrix-vector multiplication. Both are implemented on the SAKURA-G FPGA board. We show that both architectures are ultimately vulnerable. A conventional DPA succeeds fully on the 1D array, requiring 20K power measurements. In contrast, the 2D array exhibits higher security, resisting the attack even with 460K traces. We show that this is because the 2D array intrinsically entails multiple MACs that simultaneously depend on the same input. However, we find that a novel template-based DPA with multiple profiling phases is able to fully break the 2D array with only 40K traces. Corresponding countermeasures need to be investigated for spatial DNN accelerators.
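To make the attack methodology concrete, below is a minimal, hypothetical sketch of a correlation-style power attack against a single 8-bit weight-stationary MAC, assuming a Hamming-weight leakage model and simulated noisy traces. All names, constants, and the leakage model are illustrative assumptions, not the authors' code or measurement setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def hamming_weight(v):
    # Bit count of each element (leakage model: power ~ HW of the MAC result).
    return np.vectorize(lambda x: bin(x & 0xFFFF).count("1"))(v)

# --- Simulate a leaky 8-bit weight-stationary MAC (illustrative only) ---
secret_weight = 173                       # unknown 8-bit weight to recover
n_traces = 2000
inputs = rng.integers(0, 256, n_traces)   # known 8-bit activations
leakage = hamming_weight(inputs * secret_weight).astype(float)
traces = leakage + rng.normal(0, 2.0, n_traces)  # measured power + noise

# --- CPA: correlate hypothetical leakage for every weight guess ---
best_guess, best_corr = None, -1.0
for guess in range(256):
    hyp = hamming_weight(inputs * guess).astype(float)
    if hyp.std() == 0:
        continue  # constant hypothesis (e.g. guess 0) carries no signal
    corr = abs(np.corrcoef(hyp, traces)[0, 1])
    if corr > best_corr:
        best_guess, best_corr = guess, corr

print(best_guess == secret_weight)  # True with high probability
```

The paper's template-based attack instead profiles the real device in multiple phases rather than assuming an analytic leakage model, but the guess-and-correlate loop is the shared core; it also hints at why a 2D array, where many MACs leak simultaneously on the same input, dilutes the correlation for any single weight.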
Related papers
- TDPP: Two-Dimensional Permutation-Based Protection of Memristive Deep Neural Networks [17.126478919408132]
The non-volatility of memristive devices may expose the DNN weights stored in memristive crossbars to theft attacks.
This paper proposes a two-dimensional permutation-based protection (TDPP) method that thwarts such attacks (see the sketch below).
arXiv Detail & Related papers (2023-10-10T20:22:17Z)
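A minimal sketch of the general idea behind permutation-based protection: store the crossbar weights with secret row/column permutations so that raw stolen contents are useless without the keys. This is purely illustrative and not the paper's exact TDPP construction.

```python
import numpy as np

rng = np.random.default_rng(42)

def protect(w, row_key, col_key):
    # Store the weight matrix with secretly permuted rows and columns.
    return w[np.ix_(row_key, col_key)]

def recover(w_perm, row_key, col_key):
    # Invert both permutations to restore the original matrix.
    return w_perm[np.ix_(np.argsort(row_key), np.argsort(col_key))]

w = rng.normal(size=(4, 4))
row_key = rng.permutation(4)   # secret row permutation
col_key = rng.permutation(4)   # secret column permutation

stored = protect(w, row_key, col_key)   # what the crossbar actually holds
assert np.allclose(recover(stored, row_key, col_key), w)
```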
- One-Shot Online Testing of Deep Neural Networks Based on Distribution Shift Detection [0.6091702876917281]
We propose a one-shot testing approach that can test NNs accelerated on memristive crossbars with only one test vector.
Our approach can consistently achieve 100% fault coverage across several large topologies (see the sketch below).
arXiv Detail & Related papers (2023-05-16T11:06:09Z)
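A hypothetical sketch of what one-shot, distribution-shift-based testing can look like: the response to a single test vector is compared against a golden (fault-free) reference, and a shift beyond a threshold flags the device. The statistic and threshold below are assumptions, not the paper's method.

```python
import numpy as np

def one_shot_test(device_output, golden_output, tol=1e-2):
    # Flag a fault if the device's response to the single test vector
    # deviates from the golden (fault-free) response distribution.
    shift = np.abs(device_output.mean() - golden_output.mean())
    return shift > tol  # True => distribution shift => faulty device

rng = np.random.default_rng(1)
golden = rng.normal(0.0, 1.0, 256)      # reference layer activations
faulty = golden + 0.5                   # stuck-at faults shift the mean
print(one_shot_test(golden, golden))    # False (healthy device)
print(one_shot_test(faulty, golden))    # True  (fault detected)
```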
- UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks and efficiency in terms of parameters, compute cost, and inference speed.
The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features (see the sketch below).
Our evaluations on five benchmarks, Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
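A rough numpy sketch of paired spatial and channel attention over shared query/key projections, which is the general shape of an EPA-style block. The dimensions, the weight sharing, and the fusion by addition are assumptions for illustration, not the exact UNETR++ design.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def paired_attention(x, wq, wk, wv_sp, wv_ch):
    # x: (N tokens, C channels); Q/K projections are shared by both branches.
    q, k = x @ wq, x @ wk
    # Spatial branch: attention over the N token positions.
    sp = softmax(q @ k.T / np.sqrt(k.shape[1])) @ (x @ wv_sp)
    # Channel branch: attention over the C feature channels.
    ch = (x @ wv_ch) @ softmax(q.T @ k / np.sqrt(q.shape[0]))
    return sp + ch  # fuse the two discriminative feature maps

rng = np.random.default_rng(7)
N, C = 16, 32
x = rng.normal(size=(N, C))
wq, wk, wv_sp, wv_ch = (rng.normal(size=(C, C)) * 0.1 for _ in range(4))
print(paired_attention(x, wq, wk, wv_sp, wv_ch).shape)  # (16, 32)
```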
- An efficient and flexible inference system for serving heterogeneous ensembles of deep neural networks [0.0]
Ensembles of Deep Neural Networks (DNNs) achieve high-quality predictions, but they are compute- and memory-intensive.
We propose a new software layer to serve ensembles of DNNs with flexibility and efficiency (see the sketch below).
arXiv Detail & Related papers (2022-08-30T08:05:43Z)
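A minimal sketch of the generic pattern such a serving layer implements: fan one request out to heterogeneous models and aggregate their predictions. The EnsembleServer interface below is invented for illustration and is not the paper's API.

```python
import numpy as np

class EnsembleServer:
    """Toy serving layer: fan a request out to N models, average the outputs."""

    def __init__(self, models):
        self.models = models  # heterogeneous callables: input -> class scores

    def predict(self, x):
        # A real system would batch and schedule these calls per device.
        outputs = np.stack([m(x) for m in self.models])
        return outputs.mean(axis=0)  # simple averaging; voting also works

# Two "heterogeneous" stand-in models with different behavior.
model_a = lambda x: np.array([0.7, 0.2, 0.1])
model_b = lambda x: np.array([0.5, 0.4, 0.1])
server = EnsembleServer([model_a, model_b])
print(server.predict(None))  # [0.6 0.3 0.1]
```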
- Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition [62.83832841523525]
We propose a fast and accurate parallel transformer, termed Paraformer.
It accurately predicts the number of output tokens and extracts hidden variables.
It can attain performance comparable to the state-of-the-art AR transformer, with more than 10x speedup (see the sketch below).
arXiv Detail & Related papers (2022-06-16T17:24:14Z)
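A heavily simplified sketch of the non-autoregressive idea: predict the output length first, then decode every position in one parallel pass instead of token by token. The toy length predictor and pooling below stand in for Paraformer's actual predictor and hidden-variable extraction.

```python
import numpy as np

def predict_length(encoder_out):
    # Toy length predictor: a learned head would regress this in practice.
    return int(round(encoder_out.sum(axis=0).mean()))

def parallel_decode(encoder_out, n_tokens, vocab_proj):
    # Decode all n_tokens positions at once (no autoregressive loop):
    # pool encoder states into n_tokens "hidden variables", then project.
    idx = np.linspace(0, len(encoder_out) - 1, n_tokens).astype(int)
    hidden = encoder_out[idx]          # (n_tokens, d_model)
    logits = hidden @ vocab_proj       # (n_tokens, vocab)
    return logits.argmax(axis=1)       # one pass over all positions

rng = np.random.default_rng(3)
enc = rng.normal(size=(40, 8))         # 40 encoder frames, d_model = 8
proj = rng.normal(size=(8, 100))       # vocabulary of 100 tokens
n = max(1, abs(predict_length(enc)))   # keep the toy length positive
print(parallel_decode(enc, n, proj))
```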
- Density Matrix Renormalization Group with Tensor Processing Units [0.0]
Google's Tensor Processing Units (TPUs) are integrated circuits specifically built to accelerate and scale up machine learning workloads.
In this work we demonstrate the use of TPUs for accelerating and scaling up the density matrix renormalization group (DMRG), a powerful numerical approach to compute the ground state of a local quantum many-body Hamiltonian.
arXiv Detail & Related papers (2022-04-12T10:40:14Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks (see the sketch below).
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
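A small sketch of one way a {-1, +1} decomposition can work: an odd integer weight is written as a sum of power-of-two-scaled {-1, +1} matrices, so each branch is a binary network. The identity used below is standard and not necessarily the paper's exact encoding.

```python
import numpy as np

def decompose(w, m):
    # Any odd integer v in [-(2^m - 1), 2^m - 1] satisfies
    #   v = sum_i 2^i * e_i  with  e_i in {-1, +1},
    # via u = (v + 2^m - 1) / 2 and e_i = 2 * bit_i(u) - 1.
    u = (w + (1 << m) - 1) // 2
    return [2 * ((u >> i) & 1) - 1 for i in range(m)]  # m binary branches

def reconstruct(branches):
    return sum((1 << i) * e for i, e in enumerate(branches))

rng = np.random.default_rng(5)
m = 4                                           # 4 binary branches
w = 2 * rng.integers(-(1 << (m - 1)), 1 << (m - 1), size=(3, 3)) + 1
branches = decompose(w, m)                      # each entry is -1 or +1
assert (reconstruct(branches) == w).all()

# Inference: x @ w collapses to a weighted sum of binary matmuls, each
# of which maps to cheap xnor/popcount hardware.
x = rng.integers(0, 4, size=(2, 3))
y = sum((1 << i) * (x @ e) for i, e in enumerate(branches))
assert (y == x @ w).all()
```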
- FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator [33.19099033687952]
FORMS is a fine-grained ReRAM-based DNN accelerator with polarized weights (see the sketch below).
It achieves significant throughput improvement and speedup in frames per second over ISAAC at similar area cost.
arXiv Detail & Related papers (2021-06-16T21:42:08Z)
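A minimal sketch of why weight polarity matters on ReRAM: conductances are non-negative, so a signed matrix is conventionally split into positive and negative arrays whose currents are subtracted. The split below is that textbook mapping; FORMS's actual contribution, enforcing polarization at fine granularity, is not reproduced here.

```python
import numpy as np

def polarize(w):
    # ReRAM conductances cannot be negative: split signed weights into
    # two non-negative arrays mapped onto separate crossbar columns.
    w_pos = np.maximum(w, 0.0)
    w_neg = np.maximum(-w, 0.0)
    return w_pos, w_neg

def crossbar_matvec(x, w_pos, w_neg):
    # Analog MAC: output current = positive-array current minus
    # negative-array current (modeled here in floating point).
    return x @ w_pos - x @ w_neg

rng = np.random.default_rng(9)
w = rng.normal(size=(4, 3))
x = rng.normal(size=(2, 4))
w_pos, w_neg = polarize(w)
assert np.allclose(crossbar_matvec(x, w_pos, w_neg), x @ w)
```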
- VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average a 3712$\times$ speedup with a 1301.25$\times$ energy reduction over CPU, and a 35.4$\times$ speedup with a 17.66$\times$ energy reduction over GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z)
- Binary DAD-Net: Binarized Driveable Area Detection Network for Autonomous Driving [94.40107679615618]
This paper proposes a novel binarized driveable area detection network (binary DAD-Net).
It uses only binary weights and activations in the encoder, the bottleneck, and the decoder.
It outperforms state-of-the-art semantic segmentation networks on public datasets (see the sketch below).
arXiv Detail & Related papers (2020-06-15T07:09:01Z)
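A minimal sketch of the standard sign binarization with a per-tensor scale that binary networks of this kind build on (XNOR-Net-style); DAD-Net's training procedure and architecture are not reproduced here.

```python
import numpy as np

def binarize(t):
    # Sign binarization with a per-tensor scale: t ~ alpha * sign(t).
    alpha = np.abs(t).mean()
    return alpha, np.where(t >= 0, 1.0, -1.0)

rng = np.random.default_rng(11)
w = rng.normal(size=(8, 8))      # real-valued conv/fc weights
a = rng.normal(size=(4, 8))      # real-valued activations

aw, bw = binarize(w)
aa, ba = binarize(a)
# Binary forward pass: two scalars rescale the +/-1 matmul, which
# hardware can realize with xnor + popcount instead of multiplies.
y_bin = (aa * aw) * (ba @ bw)
y_ref = a @ w
print(np.corrcoef(y_bin.ravel(), y_ref.ravel())[0, 1])  # crude fidelity check
```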
- SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation [97.78417228445883]
We present SmartExchange, an algorithm-hardware co-design framework for energy-efficient inference of deep neural networks (DNNs).
We develop a novel algorithm to enforce a specially favorable DNN weight structure, where each layerwise weight matrix can be stored as the product of a small basis matrix and a large sparse coefficient matrix whose non-zero elements are all powers of two (see the sketch below).
We further design a dedicated accelerator that fully utilizes the SmartExchange-enforced weights to improve both energy efficiency and latency.
arXiv Detail & Related papers (2020-05-07T12:12:49Z)
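A small sketch of the storage/compute trade described above: keep a tiny dense basis B and a large sparse coefficient matrix C whose non-zeros are powers of two, and rebuild the layer weights W = C @ B on the fly, replacing multiplies by shifts. The shapes and the factorization direction are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(13)

# Large sparse coefficient matrix: non-zeros are powers of two (or zero),
# so multiplying by an entry is a bit shift rather than a full multiply.
shifts = rng.integers(-2, 3, size=(64, 4))     # exponents in [-2, 2]
mask = rng.random((64, 4)) < 0.25              # ~25% non-zero entries
C = np.where(mask, 2.0 ** shifts, 0.0)         # (64, 4), cheap to store
B = rng.normal(size=(4, 16))                   # small dense basis

W = C @ B                                      # rebuilt (64, 16) layer weights

# Storage trade: sparse C codes plus small B versus the dense W.
dense_params = W.size
smart_params = int(mask.sum()) + B.size
print(f"dense {dense_params} vs smartexchange ~{smart_params} stored values")

# Compute trade: x @ C needs only shifts/adds, since C's entries are
# powers of two or zero; the dense result is recovered exactly.
x = rng.normal(size=(1, 64))
assert np.allclose(x @ W, (x @ C) @ B)
```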
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and accepts no responsibility for any consequences of its use.