Related papers: Side-Channel Extraction of Dataflow AI Accelerator Hardware Parameters

URL: http://arxiv.org/abs/2506.15432v1
Date: Wed, 18 Jun 2025 13:06:09 GMT
Title: Side-Channel Extraction of Dataflow AI Accelerator Hardware Parameters
Authors: Guillaume Lomet, Ruben Salvador, Brice Colombier, Vincent Grosso, Olivier Sentieys, Cedric Killian
Abstract summary: This paper proposes a methodology to recover the hardware configuration of dataflow accelerators generated with the FINN framework. We demonstrate an attack phase requiring only 337 ms to recover the hardware parameters with an accuracy of more than 95% and 421 ms to fully recover these parameters. This approach offers a more realistic attack scenario than existing methods, and compared to SoA attacks based on tsfresh, our method requires 940x and 110x less time for preparation and attack phases, respectively.
Score: 2.5118823309854323
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dataflow neural network accelerators efficiently process AI tasks on FPGAs, with deployment simplified by ready-to-use frameworks and pre-trained models. However, this convenience makes them vulnerable to malicious actors seeking to reverse engineer valuable Intellectual Property (IP) through Side-Channel Attacks (SCA). This paper proposes a methodology to recover the hardware configuration of dataflow accelerators generated with the FINN framework. Through unsupervised dimensionality reduction, we reduce the computational overhead compared to the state-of-the-art, enabling lightweight classifiers to recover both folding and quantization parameters. We demonstrate an attack phase requiring only 337 ms to recover the hardware parameters with an accuracy of more than 95% and 421 ms to fully recover these parameters with an averaging of 4 traces for a FINN-based accelerator running a CNN, both using a random forest classifier on side-channel traces, even with the accelerator dataflow fully loaded. This approach offers a more realistic attack scenario than existing methods, and compared to SoA attacks based on tsfresh, our method requires 940x and 110x less time for preparation and attack phases, respectively, and gives better results even without averaging traces.
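
The abstract mentions averaging 4 traces before classification. The sketch below illustrates only that generic preprocessing step on synthetic data, using the Python standard library; the helper names (`average_traces`, `noise_std`), the synthetic signal, and the noise level are assumptions for illustration and are not taken from the paper, whose actual pipeline (dimensionality reduction followed by a random forest) is not reproduced here.

```python
import random
import statistics


def average_traces(traces, group_size=4):
    """Average consecutive groups of `group_size` traces, sample by sample."""
    if group_size < 1:
        raise ValueError("group_size must be >= 1")
    averaged = []
    # A trailing incomplete group is dropped so every output uses the same count.
    for start in range(0, len(traces) - group_size + 1, group_size):
        group = traces[start:start + group_size]
        averaged.append([sum(samples) / group_size for samples in zip(*group)])
    return averaged


def noise_std(traces, signal):
    """Standard deviation of the residual after subtracting the known signal."""
    residuals = [x - s for trace in traces for x, s in zip(trace, signal)]
    return statistics.pstdev(residuals)


# Synthetic stand-in for side-channel traces (an assumption, not measured data):
# a fixed repeating pattern plus independent Gaussian noise on every sample.
rng = random.Random(0)
signal = [float(i % 8) for i in range(64)]
raw = [[s + rng.gauss(0.0, 1.0) for s in signal] for _ in range(400)]

avg4 = average_traces(raw, group_size=4)
print("traces before/after averaging:", len(raw), len(avg4))
print("noise std before/after averaging:",
      round(noise_std(raw, signal), 2), round(noise_std(avg4, signal), 2))
```

With independent noise, averaging N traces is expected to reduce the noise standard deviation by roughly a factor of sqrt(N), so groups of 4 should roughly halve it, at the cost of needing four times as many captures per classified sample.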

Related papers

Blockchain-Enabled Routing for Zero-Trust Low-Altitude Intelligent Networks [77.17664010626726]
We focus on the routing with multiple UAV clusters in low-altitude intelligent networks (LAINs). To minimize the damage caused by potential threats, we present the zero-trust architecture with the software-defined perimeter and blockchain techniques. We show that the proposed framework reduces the average E2E delay by 59% and improves the TSR by 29% on average compared to benchmarks.
arXiv Detail & Related papers (2026-02-27T04:30:35Z)
Efficient Real-Time Adaptation of ROMs for Unsteady Flows Using Data Assimilation [7.958594167693376]
We propose an efficient retraining strategy for a parameterized Reduced Order Model (ROM). The strategy attains accuracy comparable to full retraining while requiring only a fraction of the computational time. We show that, for the dynamical system considered, the dominant source of error in out-of-sample forecasts stems from distortions of the latent manifold.
arXiv Detail & Related papers (2026-02-26T16:43:28Z)
Efficient Jailbreak Mitigation Using Semantic Linear Classification in a Multi-Staged Pipeline [1.2802720336459552]
Prompt injection and jailbreaking attacks pose persistent security challenges to large language model (LLM)-based systems. We present an efficient and systematically evaluated defense architecture that mitigates these threats through a lightweight, multi-stage pipeline.
arXiv Detail & Related papers (2025-12-22T04:00:35Z)
ZeroLM: Data-Free Transformer Architecture Search for Language Models [54.83882149157548]
Current automated proxy discovery approaches suffer from extended search times, susceptibility to data overfitting, and structural complexity. This paper introduces a novel zero-cost proxy methodology that quantifies model capacity through efficient weight statistics. Our evaluation demonstrates the superiority of this approach, achieving a Spearman's rho of 0.76 and Kendall's tau of 0.53 on the FlexiBERT benchmark.
arXiv Detail & Related papers (2025-03-24T13:11:22Z)
AI-Accelerated Flow Simulation: A Robust Auto-Regressive Framework for Long-Term CFD Forecasting [2.3964255330849356]
We introduce the first implementation of the two-step derivative Adams-Bashforth method specifically tailored for data-driven AR prediction. We develop three novel adaptive weighting strategies that dynamically adjust the importance of different future time steps. Our framework accurately predicts 350 future steps, reducing mean squared error from 0.125 to 0.002.
arXiv Detail & Related papers (2024-12-07T14:02:57Z)
Automatic Structured Pruning for Efficient Architecture in Federated Learning [5.300811350105823]
In Federated Learning (FL), training is conducted on client devices, typically with limited computational resources and storage capacity. We propose an automatic pruning scheme tailored for FL systems. Our solution improves efficiency on client devices, while minimizing communication costs.
arXiv Detail & Related papers (2024-11-04T02:52:02Z)
Automatic Generation of Fast and Accurate Performance Models for Deep Neural Network Accelerators [33.18173790144853]
We present an automated generation approach for fast performance models to accurately estimate the latency of Deep Neural Networks (DNNs). We modeled representative DNN accelerators such as Gemmini, UltraTrail, Plasticine-derived, and a parameterizable systolic array. We evaluate only 154 loop kernel iterations to estimate the performance for 4.19 billion instructions, achieving a significant speedup.
arXiv Detail & Related papers (2024-09-13T07:27:55Z)
Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking. DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget. Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z)
Fast-NTK: Parameter-Efficient Unlearning for Large-Scale Models [17.34908967455907]
"Machine unlearning" proposes the selective removal of unwanted data without the need for retraining from scratch. Fast-NTK is a novel NTK-based unlearning algorithm that significantly reduces the computational complexity.
arXiv Detail & Related papers (2023-12-22T18:55:45Z)
Data-Free Dynamic Compression of CNNs for Tractable Efficiency [46.498278084317704]
Structured pruning approaches have shown promise in lowering floating-point operations without substantial drops in accuracy. We propose HASTE (Hashing for Tractable Efficiency), a data-free, plug-and-play convolution module that instantly reduces a network's test-time inference cost without training or fine-tuning. We demonstrate our approach on the popular vision benchmarks CIFAR-10 and ImageNet, where we achieve a 46.72% reduction in FLOPs with only a 1.25% loss in accuracy.
arXiv Detail & Related papers (2023-09-29T13:09:40Z)
Online Convolutional Re-parameterization [51.97831675242173]
We present online convolutional re-parameterization (OREPA), a two-stage pipeline, aiming to reduce the huge training overhead by squeezing the complex training-time block into a single convolution. Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x. We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
arXiv Detail & Related papers (2022-04-02T09:50:19Z)
Parameter Efficient Deep Probabilistic Forecasting [0.0]
We introduce a novel Bidirectional Temporal Convolutional Network (BiTCN), which requires an order of magnitude fewer parameters than a common Transformer-based approach. Our method performs on par with four state-of-the-art probabilistic forecasting methods, including a Transformer-based approach and WaveNet. We demonstrate that our method requires significantly fewer parameters than Transformer-based methods, which means the model can be trained faster with significantly lower memory requirements.
arXiv Detail & Related papers (2021-12-06T10:09:39Z)
FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks. Current networks often have a large number of parameters and require heavy computation costs. Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z)
Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes. Our goal is to misclassify a specific sample into a target class without any sample modification. By utilizing the latest technique in integer programming, we equivalently reformulate this BIP problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
Non-Parametric Adaptive Network Pruning [125.4414216272874]
We introduce non-parametric modeling to simplify the algorithm design. Inspired by the face recognition community, we use a message passing algorithm to obtain an adaptive number of exemplars. EPruner breaks the dependency on the training data in determining the "important" filters.
arXiv Detail & Related papers (2021-01-20T06:18:38Z)
