Activation Functions Considered Harmful: Recovering Neural Network Weights through Controlled Channels
- URL: http://arxiv.org/abs/2503.19142v1
- Date: Mon, 24 Mar 2025 20:55:18 GMT
- Title: Activation Functions Considered Harmful: Recovering Neural Network Weights through Controlled Channels
- Authors: Jesse Spielman, David Oswald, Mark Ryan, Jo Van Bulck
- Abstract summary: Recent advancements in hardware-isolated enclaves, notably Intel SGX, hold the promise to secure the internal state of machine learning applications. We show that privileged software adversaries can exploit input-dependent memory access patterns to extract secret weights and biases from an SGX enclave.
- Score: 3.806947766399214
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With high-stakes machine learning applications increasingly moving to untrusted end-user or cloud environments, safeguarding pre-trained model parameters becomes essential for protecting intellectual property and user privacy. Recent advancements in hardware-isolated enclaves, notably Intel SGX, hold the promise to secure the internal state of machine learning applications even against compromised operating systems. However, we show that privileged software adversaries can exploit input-dependent memory access patterns in common neural network activation functions to extract secret weights and biases from an SGX enclave. Our attack leverages the SGX-Step framework to obtain a noise-free, instruction-granular page-access trace. In a case study of an 11-input regression network using the Tensorflow Microlite library, we demonstrate complete recovery of all first-layer weights and biases, as well as partial recovery of parameters from deeper layers under specific conditions. Our novel attack technique requires only 20 queries per input per weight to obtain all first-layer weights and biases with an average absolute error of less than 1%, improving over prior model stealing attacks. Additionally, a broader ecosystem analysis reveals the widespread use of activation functions with input-dependent memory access patterns in popular machine learning frameworks (either directly or via underlying math libraries). Our findings highlight the limitations of deploying confidential models in SGX enclaves and emphasise the need for stricter side-channel validation of machine learning implementations, akin to the vetting efforts applied to secure cryptographic libraries.
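The core recovery primitive can be illustrated with a toy simulation. The sketch below is not the paper's implementation (the actual attack observes page-granular code paths of Tensorflow Microlite's activation functions via SGX-Step inside the enclave); it assumes a hypothetical piecewise activation with two known branch cutoffs and a simulated victim neuron with hidden parameters. By steering the attacker-controlled input x and observing only which code path the pre-activation z = w*x + b takes, each branch boundary can be located by binary search in roughly 20 adaptive queries, and two recovered crossings determine both the weight and the bias.

```python
# Toy simulation of weight recovery via an input-dependent activation branch.
# Hypothetical names and constants throughout; the real attack instead observes
# page-granular code paths of Tensorflow Microlite inside an SGX enclave.

# Known cutoffs of the (hypothetical) piecewise activation. Real range-reduced
# exp/tanh implementations compare their argument against fixed constants, and
# each comparison selects a distinguishable code path.
C1, C2 = -1.0, 1.0

def activation_region(z: float) -> int:
    """Return which code path the activation would take for pre-activation z."""
    if z < C1:
        return 0   # e.g. lower saturation path
    if z < C2:
        return 1   # e.g. polynomial-approximation path
    return 2       # e.g. upper saturation path

# Ground-truth parameters of the simulated victim neuron (unknown to the attacker).
SECRET_W, SECRET_B = 0.7321, -0.2045

def enclave_query(x: float) -> int:
    """Victim: the attacker chooses x, the side channel reveals only the code path."""
    return activation_region(SECRET_W * x + SECRET_B)

def find_crossing(target_region: int, lo: float, hi: float, queries: int = 20) -> float:
    """Binary-search the input x* at which the observed region first reaches
    `target_region` (assumes w > 0, so the pre-activation increases with x).
    Each adaptive query halves the interval: 20 queries on a width-32 interval
    pin the crossing down to roughly 3e-5."""
    for _ in range(queries):
        mid = (lo + hi) / 2.0
        if enclave_query(mid) < target_region:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

if __name__ == "__main__":
    # Crossing into region 1 gives w*x1 + b = C1; into region 2 gives w*x2 + b = C2.
    x1 = find_crossing(target_region=1, lo=-16.0, hi=16.0)
    x2 = find_crossing(target_region=2, lo=-16.0, hi=16.0)
    w_hat = (C2 - C1) / (x2 - x1)   # solve the two linear equations in (w, b)
    b_hat = C1 - w_hat * x1
    print(f"w ~ {w_hat:.5f} (true {SECRET_W}), b ~ {b_hat:.5f} (true {SECRET_B})")
```

With a multi-input first layer, the same search can be repeated per input dimension while holding the other inputs fixed, which is broadly how a per-input, per-weight query budget of this kind arises.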
Related papers
- Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike existing methods that design a backdoor for the input/output space of diffusion models, our method embeds the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z) - FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids [53.2306792009435]
FaultGuard is the first framework for fault type and zone classification resilient to adversarial attacks.
We propose a low-complexity fault prediction model and an online adversarial training technique to enhance robustness.
Our model outclasses the state-of-the-art for resilient fault prediction benchmarking, with an accuracy of up to 0.958.
arXiv Detail & Related papers (2024-03-26T08:51:23Z) - Hawk: Accurate and Fast Privacy-Preserving Machine Learning Using Secure Lookup Table Computation [11.265356632908846]
Training machine learning models on data from multiple entities without direct data sharing can unlock applications otherwise hindered by business, legal, or ethical constraints.
We design and implement new privacy-preserving machine learning protocols for logistic regression and neural network models.
Our evaluations show that our logistic regression protocol is up to 9x faster, and the neural network training is up to 688x faster than SecureML.
arXiv Detail & Related papers (2024-03-26T00:51:12Z) - On-device Self-supervised Learning of Visual Perception Tasks aboard Hardware-limited Nano-quadrotors [53.59319391812798]
Sub-50-gram nano-drones are gaining momentum in both academia and industry.
Their most compelling applications rely on onboard deep learning models for perception.
When deployed in unknown environments, these models often underperform due to domain shift.
We propose, for the first time, on-device learning aboard nano-drones, where the first part of the in-field mission is dedicated to self-supervised fine-tuning.
arXiv Detail & Related papers (2024-03-06T22:04:14Z) - An anomaly detection approach for backdoored neural networks: face recognition as a case study [77.92020418343022]
We propose a novel backdoored network detection method based on the principle of anomaly detection.
We test our method on a novel dataset of backdoored networks and report detectability results with perfect scores.
arXiv Detail & Related papers (2022-08-22T12:14:13Z) - On the Evaluation of User Privacy in Deep Neural Networks using Timing Side Channel [14.350301915592027]
We identify and report a novel data-dependent timing side-channel leakage (termed Class Leakage) in Deep Learning (DL) implementations.
We demonstrate a practical inference-time attack where an adversary with user privilege and hard-label black-box access to an ML model can exploit Class Leakage.
We develop an easy-to-implement countermeasure that makes the branching operation constant-time, alleviating the Class Leakage (a branchless-selection sketch in the same spirit appears after this list).
arXiv Detail & Related papers (2022-08-01T19:38:16Z) - DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection [17.136757440204722]
We introduce a highly practical backdoor attack achieved with a set of reverse-engineering techniques over compiled deep learning models.
The injected backdoor can be triggered with a success rate of 93.5%, while introducing less than 2 ms of latency overhead and no more than a 1.4% accuracy decrease.
We found 54 apps that were vulnerable to our attack, including popular and security-critical ones.
arXiv Detail & Related papers (2021-01-18T06:29:30Z) - Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z) - Dynamic Backdoor Attacks Against Machine Learning Models [28.799895653866788]
We propose the first class of dynamic backdooring techniques against deep neural networks (DNNs), namely Random Backdoor, Backdoor Generating Network (BaN), and conditional Backdoor Generating Network (c-BaN).
BaN and c-BaN, based on a novel generative network, are the first two schemes that algorithmically generate triggers.
Our techniques achieve almost perfect attack performance on backdoored data with a negligible utility loss.
arXiv Detail & Related papers (2020-03-07T22:46:51Z) - Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond [171.07853346630057]
Linear relaxation based perturbation analysis (LiRPA) for neural networks has become a core component in robustness verification and certified defense.
We develop an automatic framework to enable perturbation analysis on any neural network structures.
We demonstrate LiRPA based certified defense on Tiny ImageNet and Downscaled ImageNet.
arXiv Detail & Related papers (2020-02-28T18:47:43Z)
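The constant-time branching countermeasure mentioned in the timing side-channel entry above can be sketched as follows. This is an illustrative comparison only: the function names are hypothetical, CPython itself gives no constant-time guarantees, and a hardened deployment would apply the same idea in the compiled activation code of the ML framework. The point is the structural difference between a data-dependent branch, which leaks which path was taken, and a branchless arithmetic selection that executes the same operations for every input.

```python
def relu_with_branch(z: float) -> float:
    # Leaky pattern: which path executes depends on the secret-derived value z,
    # exactly the kind of input-dependent control flow a page- or timing-level
    # observer can distinguish.
    if z < 0.0:
        return 0.0
    return z

def relu_branchless(z: float) -> float:
    # Branchless pattern: turn the comparison into a 0/1 factor and select
    # arithmetically, so the same sequence of operations runs for every input.
    # (Python only illustrates the structure; real constant-time code needs a
    # compiled, audited implementation, typically using integer masks.)
    keep = float(z >= 0.0)
    return keep * z

if __name__ == "__main__":
    for z in (-2.5, -0.1, 0.0, 3.7):
        assert relu_with_branch(z) == relu_branchless(z)
    print("branching and branchless ReLU agree on all test inputs")
```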
This list is automatically generated from the titles and abstracts of the papers in this site.