Towards Secure and Efficient DNN Accelerators via Hardware-Software Co-Design
- URL: http://arxiv.org/abs/2602.20521v1
- Date: Tue, 24 Feb 2026 03:49:12 GMT
- Title: Towards Secure and Efficient DNN Accelerators via Hardware-Software Co-Design
- Authors: Wei Xuan, Zihao Xuan, Rongliang Fu, Ning Lin, Kwunhang Wong, Zikang Yuan, Lang Feng, Zhongrui Wang, Tsung-Yi Ho, Yuzhong Jiao, Luhong Liang
- Abstract summary: This paper presents a secure and efficient memory protection framework for deep neural network (DNN) accelerators with minimal overhead. First, we propose a bandwidth-aware cryptographic scheme that adapts encryption granularity based on memory traffic patterns. Second, we observe that both the overlapping regions in the intra-layer tiling's sliding-window pattern and those resulting from inter-layer tiling strategy discrepancies introduce substantial redundant memory accesses. Third, we introduce a multi-level authentication mechanism that effectively eliminates unnecessary off-chip memory accesses, enhancing performance and energy efficiency.
- Score: 18.80243704372307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid deployment of deep neural network (DNN) accelerators in safety-critical domains such as autonomous vehicles, healthcare systems, and financial infrastructure necessitates robust mechanisms to safeguard data confidentiality and computational integrity. Existing security solutions for DNN accelerators, however, suffer from excessive hardware resource demands and frequent off-chip memory access overheads, which degrade performance and scalability. To address these challenges, this paper presents a secure and efficient memory protection framework for DNN accelerators with minimal overhead. First, we propose a bandwidth-aware cryptographic scheme that adapts encryption granularity based on memory traffic patterns, striking a balance between security and resource efficiency. Second, we observe that both the overlapping regions in the intra-layer tiling's sliding-window pattern and those resulting from inter-layer tiling strategy discrepancies introduce substantial redundant memory accesses and repeated computational overhead in cryptography. Third, we introduce a multi-level authentication mechanism that effectively eliminates unnecessary off-chip memory accesses, enhancing performance and energy efficiency. Experimental results show that this work decreases performance overhead by over 12% and achieves an 87% energy-efficiency improvement for both server and edge neural processing units (NPUs), while ensuring robust scalability.
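The bandwidth-aware scheme can be pictured as a policy that picks an encryption block size from observed memory traffic. Below is a minimal sketch of such a policy, assuming a simple two-threshold design; the function name, thresholds, and granularities are illustrative assumptions, not the paper's actual parameters.

```python
def select_granularity(bytes_per_window: int,
                       window_s: float,
                       low_gbps: float = 2.0,
                       high_gbps: float = 8.0) -> int:
    """Pick an encryption block size (bytes) from observed memory traffic.

    Under light traffic, a fine granularity (64 B) avoids decrypting data
    that is never used; under heavy traffic, a coarse granularity (512 B)
    amortizes the per-block cipher and MAC overhead.
    """
    gbps = bytes_per_window * 8 / window_s / 1e9
    if gbps < low_gbps:
        return 64       # fine-grained: cheap random access
    if gbps < high_gbps:
        return 256      # middle ground
    return 512          # coarse-grained: amortize crypto overhead
```

For example, 1 GiB transferred in a one-second window is about 8.6 Gb/s, which this toy policy maps to the coarse 512-byte granularity.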
Related papers
- When Small Variations Become Big Failures: Reliability Challenges in Compute-in-Memory Neural Accelerators [10.520875052654771]
We show that even small device variations can lead to disproportionately high accuracy degradation and catastrophic failures in safety-critical inference workloads. We introduce SWIM, a selective write-verify mechanism that strategically applies verification only where it is most impactful. Finally, we explore a learning-centric solution that improves realistic worst-case performance by training networks with right-censored noise.
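The selective write-verify idea above can be sketched as re-programming only the most sensitive weights until they land within tolerance. This is an illustrative toy model, not SWIM's implementation: the write model, sensitivity ranking, and parameters are all assumptions.

```python
import random

def selective_write_verify(weights, sensitivity, write, top_frac=0.1,
                           tol=0.02, max_retries=5):
    """Program all weights once, then write-verify only the top `top_frac`
    most sensitive ones, retrying until within `tol` of the target."""
    k = max(1, int(len(weights) * top_frac))
    critical = sorted(range(len(weights)),
                      key=lambda i: -sensitivity[i])[:k]
    stored = [write(w) for w in weights]
    for i in critical:
        retries = 0
        while abs(stored[i] - weights[i]) > tol and retries < max_retries:
            stored[i] = write(weights[i])  # re-program the deviating cell
            retries += 1
    return stored, critical

def noisy_write(w, sigma=0.05):
    """Toy model of an analog write whose result deviates from the target."""
    return w + random.gauss(0.0, sigma)
```

With a perfect write function the verify loop never triggers; with `noisy_write`, only the critical fraction pays the retry cost, which is the point of applying verification selectively.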
arXiv Detail & Related papers (2026-03-03T20:08:25Z) - Blockchain-Enabled Routing for Zero-Trust Low-Altitude Intelligent Networks [77.17664010626726]
We focus on routing with multiple UAV clusters in low-altitude intelligent networks (LAINs). To minimize the damage caused by potential threats, we present a zero-trust architecture with software-defined perimeter and blockchain techniques. We show that the proposed framework reduces the average E2E delay by 59% and improves the TSR by 29% on average compared to benchmarks.
arXiv Detail & Related papers (2026-02-27T04:30:35Z) - Designing a Layered Framework to Secure Data via Improved Multi Stage Lightweight Cryptography in IoT Cloud Systems [1.5803208833562954]
This paper presents a novel multi-layered hybrid security approach aimed at enhancing lightweight encryption for IoT-Cloud systems. The proposed framework consists of three core layers: (1) the H.E.EZ Layer, which integrates improved versions of Hyperledger Fabric, Enc-Block, and a hybrid ECDSA-ZSS scheme to improve encryption speed and scalability and to reduce computational cost; (2) the Credential Management Layer, which independently verifies data authenticity and integrity; and (3) the Time and Auditing Layer, designed to reduce traffic overhead and optimize performance across dynamic workloads.
arXiv Detail & Related papers (2025-09-01T18:53:20Z) - Secure Resource Allocation via Constrained Deep Reinforcement Learning [49.15061461220109]
We present SARMTO, a framework that balances resource allocation, task offloading, security, and performance. SARMTO consistently outperforms five baseline approaches, achieving up to a 40% reduction in system costs. These enhancements highlight SARMTO's potential to revolutionize resource management in intricate distributed computing environments.
arXiv Detail & Related papers (2025-01-20T15:52:43Z) - Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks [60.54852710216738]
We introduce a novel digital twin-assisted optimization framework, called D-REC, to ensure reliable caching in nextG wireless networks.
By incorporating reliability modules into a constrained decision process, D-REC can adaptively adjust actions, rewards, and states to comply with advantageous constraints.
arXiv Detail & Related papers (2024-06-29T02:40:28Z) - Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs.
At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads.
At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
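The multi-exit dropout-based BayesNN described above can be sketched as a backbone whose intermediate exit heads average a few Monte-Carlo dropout passes and return early once the prediction is confident. This is a minimal sketch under assumed interfaces (the stage/head callables, threshold, and sample count are illustrative, not the paper's design).

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def mc_dropout(h, p):
    """Inverted dropout kept active at inference time for Bayesian sampling."""
    return [0.0 if random.random() < p else v / (1.0 - p) for v in h]

def multi_exit_predict(stages, exit_heads, x, threshold=0.9,
                       samples=4, p=0.3):
    """Run backbone stages in order; at each exit head, average `samples`
    Monte-Carlo dropout passes and return early once confident."""
    h = x
    for stage, head in zip(stages, exit_heads):
        h = stage(h)
        preds = [softmax(head(mc_dropout(h, p))) for _ in range(samples)]
        mean = [sum(col) / samples for col in zip(*preds)]
        if max(mean) >= threshold:
            return mean  # early exit: the remaining stages are skipped
    return mean
```

Early exits cut both the compute and memory overhead of the repeated dropout passes, which is what makes MC-dropout uncertainty estimation tractable on an FPGA-class budget.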
arXiv Detail & Related papers (2024-06-20T17:08:42Z) - DNNShield: Embedding Identifiers for Deep Neural Network Ownership Verification [46.47446944218544]
This paper introduces DNNShield, a novel approach for protection of Deep Neural Networks (DNNs)
DNNShield embeds unique identifiers within the model architecture using specialized protection layers.
We validate the effectiveness and efficiency of DNNShield through extensive evaluations across three datasets and four model architectures.
arXiv Detail & Related papers (2024-03-11T10:27:36Z) - Exploration of Activation Fault Reliability in Quantized Systolic Array-Based DNN Accelerators [0.8796261172196743]
This paper presents a comprehensive methodology for exploring and enabling a holistic assessment of the impact of quantization on model accuracy, activation fault reliability, and hardware efficiency.
A fully automated framework is introduced that is capable of applying various quantization-aware techniques, fault injection, and hardware implementation.
The experiments on established benchmarks demonstrate the analysis flow and the profound implications of quantization on reliability, hardware performance, and network accuracy.
arXiv Detail & Related papers (2024-01-17T12:55:17Z) - RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference [17.299835585861747]
We introduce RRNet, a framework that aims to jointly reduce the overhead of MPC comparison protocols and accelerate computation through hardware acceleration.
Our approach integrates the hardware latency of cryptographic building blocks into the DNN loss function, resulting in improved energy efficiency, accuracy, and security guarantees.
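Folding hardware latency into the training objective can be sketched as adding a weighted latency term to the task loss. The per-operator latency table, weighting factor, and function names below are illustrative assumptions, not RRNet's actual formulation.

```python
# Assumed per-operator latency costs (arbitrary units) for MPC building
# blocks: comparisons (ReLU) are far costlier than polynomial activations.
LATENCY = {"relu": 5.0, "poly_act": 1.0, "matmul": 2.0}

def latency_penalty(op_counts: dict) -> float:
    """Estimated total latency of a network configuration."""
    return sum(LATENCY[op] * n for op, n in op_counts.items())

def total_loss(task_loss: float, op_counts: dict, lam: float = 0.01) -> float:
    """Joint objective: task loss plus a weighted hardware-latency term."""
    return task_loss + lam * latency_penalty(op_counts)
```

Under this toy cost model, replacing ten ReLU comparisons (50 units) with ten polynomial activations (10 units) shrinks the penalty term, so gradient-based architecture search is steered toward MPC-friendly operators.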
arXiv Detail & Related papers (2023-02-05T04:02:13Z) - PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference [23.795457990555878]
Secure multi-party computation (MPC) has been proposed to enable privacy-preserving deep learning (DL) computation.
MPC, however, incurs very high computation overhead, which can prohibit its adoption in large-scale systems.
In this work, we develop PolyMPCNet, a systematic framework for jointly reducing the overhead of the MPC comparison protocol and enabling hardware acceleration.
arXiv Detail & Related papers (2022-09-20T02:47:37Z) - Artificial Intelligence Empowered Multiple Access for Ultra Reliable and Low Latency THz Wireless Networks [76.89730672544216]
Terahertz (THz) wireless networks are expected to catalyze the beyond fifth generation (B5G) era.
To satisfy the ultra-reliability and low-latency demands of several B5G applications, novel mobility management approaches are required.
This article presents a holistic MAC layer approach that enables intelligent user association and resource allocation, as well as flexible and adaptive mobility management.
arXiv Detail & Related papers (2022-08-17T03:00:24Z) - Distributed Reinforcement Learning for Privacy-Preserving Dynamic Edge Caching [91.50631418179331]
A privacy-preserving distributed deep policy gradient (P2D3PG) is proposed to maximize the cache hit rates of devices in the MEC networks.
We convert the distributed optimizations into model-free Markov decision process problems and then introduce a privacy-preserving federated learning method for popularity prediction.
arXiv Detail & Related papers (2021-10-20T02:48:27Z) - Towards Energy-Efficient and Secure Edge AI: A Cross-Layer Framework [13.573645522781712]
Deep neural networks (DNNs) and spiking neural networks (SNNs) offer state-of-the-art results on resource-constrained edge devices.
These systems are required to maintain correct functionality under diverse security and reliability threats.
This paper first discusses existing approaches to address energy efficiency, reliability, and security issues at different system layers.
arXiv Detail & Related papers (2021-09-20T20:22:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.