Related papers: Efficient and Encrypted Inference using Binarized Neural Networks within In-Memory Computing Architectures

URL: http://arxiv.org/abs/2510.23034v1
Date: Mon, 27 Oct 2025 05:59:02 GMT
Title: Efficient and Encrypted Inference using Binarized Neural Networks within In-Memory Computing Architectures
Authors: Gokulnath Rajendran, Suman Deb, Anupam Chattopadhyay
Abstract summary: Binarized Neural Networks (BNNs) are a class of deep neural networks designed to utilize minimal computational resources. Recent studies highlight the potential of mapping BNN model parameters onto emerging non-volatile memory technologies. However, protecting model parameters from theft attacks by storing them in an encrypted format and decrypting them at runtime introduces significant computational overhead.
Score: 2.5756681494057045
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Binarized Neural Networks (BNNs) are a class of deep neural networks designed to utilize minimal computational resources, which drives their popularity across various applications. Recent studies highlight the potential of mapping BNN model parameters onto emerging non-volatile memory technologies, specifically using crossbar architectures, resulting in improved inference performance compared to traditional CMOS implementations. However, the common practice of protecting model parameters from theft attacks by storing them in an encrypted format and decrypting them at runtime introduces significant computational overhead, thus undermining the core principles of in-memory computing, which aim to integrate computation and storage. This paper presents a robust strategy for protecting BNN model parameters, particularly within in-memory computing frameworks. Our method utilizes a secret key derived from a physical unclonable function to transform model parameters prior to storage in the crossbar. Subsequently, the inference operations are performed on the encrypted weights, achieving a very special case of Fully Homomorphic Encryption (FHE) with minimal runtime overhead. Our analysis reveals that inference conducted without the secret key results in drastically diminished performance, with accuracy falling below 15%. These results validate the effectiveness of our protection strategy in securing BNNs within in-memory computing architectures while preserving computational efficiency.
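As a rough illustration of how inference could run directly on key-transformed binary weights, the sketch below uses a toy sign-flip mask. This is an assumption made for illustration and is not taken from the paper: the helper names `mask` and `binary_dot`, the random key standing in for a PUF response, and the choice to apply the key to the inputs are all hypothetical.

```python
import random


def binary_dot(x, w):
    """Dot product of two vectors with entries in {-1, +1}."""
    return sum(xi * wi for xi, wi in zip(x, w))


def mask(vec, key):
    """Flip the sign of each entry where the key is -1 (the operation is self-inverse)."""
    return [v * k for v, k in zip(vec, key)]


rng = random.Random(0)
n = 64
weights = [rng.choice((-1, 1)) for _ in range(n)]
inputs = [rng.choice((-1, 1)) for _ in range(n)]
# Hypothetical stand-in for a key derived from a physical unclonable function.
key = [rng.choice((-1, 1)) for _ in range(n)]

# Only the masked weights would be stored in the crossbar.
stored = mask(weights, key)

plain = binary_dot(inputs, weights)                # reference result on plaintext weights
with_key = binary_dot(mask(inputs, key), stored)   # key applied to inputs, weights stay masked
without_key = binary_dot(inputs, stored)           # what an attacker without the key computes

print(plain, with_key, without_key)
```

Because every key entry squares to 1, the masked product equals the plaintext one, so the stored weights never need to be decrypted. Without the key, the computed value is in general unrelated to the true pre-activation, which is the kind of effect the abstract reports as an accuracy collapse.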

Related papers

Privacy-Preserving Spiking Neural Networks: A Deep Dive into Encryption Parameter Optimisation [0.0]
Spiking Neural Networks (SNNs) mimic the brain's event-driven behaviour, offering improved performance and reduced power use. BioEncryptSNN is a spiking neural network based encryption-decryption framework for secure and noise-resilient data protection.
arXiv Detail & Related papers (2025-10-22T12:43:46Z)
Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems [54.045712360156024]
Racetrack memory is a non-volatile technology that allows high data density fabrication. Integrating in-memory arithmetic circuits with memory cells affects both the memory density and power efficiency. We present an efficient in-memory convolutional neural network (CNN) accelerator optimized for use with racetrack memory.
arXiv Detail & Related papers (2025-07-02T07:29:53Z)
Quantifying Memory Utilization with Effective State-Size [73.52115209375343]
We develop a measure of "memory utilization". This metric is tailored to the fundamental class of systems with input-invariant and input-varying linear operators.
arXiv Detail & Related papers (2025-04-28T08:12:30Z)
Encrypted Large Model Inference: The Equivariant Encryption Paradigm [18.547945807599543]
We introduce Equivariant Encryption (EE), a novel paradigm designed to enable secure, "blind" inference on encrypted data with near zero performance overhead. Unlike fully homomorphic approaches that encrypt the entire computational graph, EE selectively obfuscates critical internal representations within neural network layers. EE maintains high fidelity and throughput, effectively bridging the gap between robust data confidentiality and the stringent efficiency requirements of modern, large scale model inference.
arXiv Detail & Related papers (2025-02-03T03:05:20Z)
Pruning random resistive memory for optimizing analogue AI [54.21621702814583]
AI models present unprecedented challenges to energy consumption and environmental sustainability. One promising solution is to revisit analogue computing, a technique that predates digital computing. Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning.
arXiv Detail & Related papers (2023-11-13T08:59:01Z)
Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks. By introducing learnable memory tokens with attention mechanism, we can effectively boost performance without huge computational overhead. We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
Efficient Privacy-Preserving Convolutional Spiking Neural Networks with FHE [1.437446768735628]
Fully Homomorphic Encryption (FHE) is a key technology for privacy-preserving computation. FHE has limitations in processing continuous non-polynomial functions. We present a framework called FHE-DiCSNN for homomorphic SNNs. FHE-DiCSNN achieves an accuracy of 97.94% on ciphertexts, with a loss of only 0.53% compared to the original network's accuracy of 98.47%.
arXiv Detail & Related papers (2023-09-16T15:37:18Z)
Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs). INNs are a class of implicit learning models that use implicit equations as layers. We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
Enabling Homomorphically Encrypted Inference for Large DNN Models [1.0679692136113117]
Homomorphic encryption (HE) enables inference using encrypted data, but it incurs 100x to 10,000x memory and runtime overheads. Secure deep neural network (DNN) inference using HE is currently limited by computing and memory resources. We explore the feasibility of leveraging hybrid memory systems comprised of DRAM and persistent memory.
arXiv Detail & Related papers (2021-03-30T07:53:34Z)
Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes. Our goal is to misclassify a specific sample into a target class without any sample modification. By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
Neural Network Compression for Noisy Storage Devices [71.4102472611862]
Conventionally, model compression and physical storage are decoupled. This approach forces the storage to treat each bit of the compressed model equally, and to dedicate the same amount of resources to each bit. We propose a radically different approach that: (i) employs analog memories to maximize the capacity of each memory cell, and (ii) jointly optimizes model compression and physical storage to maximize memory utility.
arXiv Detail & Related papers (2021-02-15T18:19:07Z)
Efficient Computation Reduction in Bayesian Neural Networks Through Feature Decomposition and Memorization [10.182119276564643]
In this paper, an efficient Bayesian neural network (BNN) inference flow is proposed to reduce the computation cost. About half of the computations could be eliminated compared to the traditional approach. We implement our approach in Verilog and synthesise it with 45 nm FreePDK technology.
arXiv Detail & Related papers (2020-05-08T05:03:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.