MirageNet: A Secure, Efficient, and Scalable On-Device Model Protection in Heterogeneous TEE and GPU System
- URL: http://arxiv.org/abs/2601.13826v1
- Date: Tue, 20 Jan 2026 10:39:09 GMT
- Title: MirageNet: A Secure, Efficient, and Scalable On-Device Model Protection in Heterogeneous TEE and GPU System
- Authors: Huadi Zheng, Li Cheng, Yan Ding
- Abstract summary: Given high model training costs and user experience requirements, balancing model privacy and low runtime overhead is critical. We propose ConvShatter, a novel obfuscation scheme that achieves low latency and high accuracy while preserving model confidentiality and integrity.
- Score: 8.936421130707204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As edge devices gain stronger computing power, deploying high-performance DNN models on untrusted hardware has become a practical approach to cut inference latency and protect user data privacy. Given high model training costs and user experience requirements, balancing model privacy and low runtime overhead is critical. TEEs offer a viable defense, and prior work has proposed heterogeneous GPU-TEE inference frameworks via parameter obfuscation to balance efficiency and confidentiality. However, recent studies find partial obfuscation defenses ineffective, while robust schemes cause unacceptable latency. To resolve these issues, we propose ConvShatter, a novel obfuscation scheme that achieves low latency and high accuracy while preserving model confidentiality and integrity. It leverages convolution linearity to decompose kernels into critical and common ones, inject confounding decoys, and permute channel/kernel orders. Pre-deployment, it performs kernel decomposition, decoy injection and order obfuscation, storing minimal recovery parameters securely in the TEE. During inference, the TEE reconstructs outputs of obfuscated convolutional layers. Extensive experiments show ConvShatter substantially reduces latency overhead with strong security guarantees; versus comparable schemes, it cuts overhead by 16% relative to GroupCover while maintaining accuracy on par with the original model.
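The kernel decomposition described in the abstract rests on the linearity of convolution: conv(x, a) + conv(x, b) = conv(x, a + b), so a kernel can be split into shares that an untrusted GPU evaluates separately and a TEE recombines. The following is a toy single-channel sketch of that idea, not ConvShatter's actual scheme; the additive split, the decoy handling, and the `conv2d` helper are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, k):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

# --- Pre-deployment (trusted): split each kernel and inject a decoy ---
x = rng.standard_normal((8, 8))                  # input feature map
true_kernels = [rng.standard_normal((3, 3)) for _ in range(4)]

obfuscated = []                                  # shipped to the untrusted GPU
for w in true_kernels:
    common = rng.standard_normal(w.shape)        # random "common" share
    critical = w - common                        # companion share: w = common + critical
    obfuscated += [common, critical]
obfuscated.append(rng.standard_normal((3, 3)))   # confounding decoy kernel

perm = rng.permutation(len(obfuscated))          # obfuscate kernel order
shipped = [obfuscated[i] for i in perm]
inv = np.argsort(perm)                           # secret recovery index, kept in the TEE

# --- Inference: the GPU convolves with every shipped kernel ---
gpu_outputs = [conv2d(x, k) for k in shipped]

# --- TEE: undo the permutation, drop the decoy, re-add shares by linearity ---
unperm = [gpu_outputs[i] for i in inv]
recovered = [unperm[2 * t] + unperm[2 * t + 1] for t in range(len(true_kernels))]

for w, y in zip(true_kernels, recovered):
    assert np.allclose(y, conv2d(x, w))          # conv(x,a)+conv(x,b) == conv(x,a+b)
```

Note that the TEE only stores the permutation and share pairing, which is why the recovery state kept inside the enclave can stay small relative to the model.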
Related papers
- AmbShield: Enhancing Physical Layer Security with Ambient Backscatter Devices against Eavesdroppers [69.56534335936534]
AmbShield is an AmBD-assisted PLS scheme that leverages naturally distributed AmBDs to simultaneously strengthen the legitimate channel and degrade eavesdroppers' channels. In AmbShield, AmBDs are exploited as friendly jammers that randomly backscatter to create interference at eavesdroppers, and as passive relays that backscatter the desired signal to enhance the capacity of legitimate devices.
arXiv Detail & Related papers (2026-01-14T20:56:50Z) - Optimistic TEE-Rollups: A Hybrid Architecture for Scalable and Verifiable Generative AI Inference on Blockchain [4.254924788681319]
We introduce Optimistic TEE-Rollups (OTR), a hybrid verification protocol that harmonizes these constraints. OTR achieves 99% of the throughput of centralized baselines with a marginal cost overhead of $0.07 per query.
arXiv Detail & Related papers (2025-12-23T09:16:41Z) - Amulet: Fast TEE-Shielded Inference for On-Device Model Protection [15.936694312917512]
On-device machine learning (ML) introduces new security concerns about model privacy. Storing valuable trained ML models on user devices exposes them to potential extraction by adversaries. We propose Amulet, a fast TEE-shielded on-device inference framework for ML model protection.
arXiv Detail & Related papers (2025-12-08T12:22:51Z) - Towards Confidential and Efficient LLM Inference with Dual Privacy Protection [11.22744810136105]
This paper proposes CMIF, a Confidential and efficient Model Inference Framework. CMIF confidentially deploys the embedding layer in the client-side TEE and subsequent layers on GPU servers. It optimizes the Report-Noisy-Max mechanism to protect sensitive inputs with a slight decrease in model performance.
arXiv Detail & Related papers (2025-09-11T01:54:13Z) - Practical and Private Hybrid ML Inference with Fully Homomorphic Encryption [0.34953784594970894]
Safhire is a hybrid inference framework that executes linear layers under encryption on the server. It supports exact activations and significantly reduces computation. It achieves 1.5X - 10.5X lower inference latency than Orion.
arXiv Detail & Related papers (2025-09-01T08:43:46Z) - MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access. Most existing defenses presume that attacker queries contain out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs. We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
arXiv Detail & Related papers (2025-06-03T01:37:09Z) - Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems [89.35169042718739]
Collaborative inference enables end users to leverage powerful deep learning models without exposing sensitive raw data to cloud servers. Recent studies have revealed that the transmitted intermediate features may not sufficiently preserve privacy, as information can be leaked and raw data can be reconstructed via model inversion attacks (MIAs). This work first theoretically proves that the conditional entropy of inputs given intermediate features provides a guaranteed lower bound on the reconstruction mean square error (MSE) under any MIA. Then, we derive a differentiable and solvable measure for bounding this conditional entropy based on Gaussian mixture estimation, and propose a conditional entropy maximization algorithm to enhance inversion robustness.
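The lower bound claimed in this summary plausibly takes the standard form obtained from the maximum-entropy property of the Gaussian; this is a sketch of the type of bound, not necessarily the paper's exact statement:

```latex
% For any inversion estimator \hat{X}(Z) of the d-dimensional input X
% reconstructed from intermediate features Z:
\mathbb{E}\bigl[\|X - \hat{X}(Z)\|^{2}\bigr]
  \;\ge\; \frac{d}{2\pi e}\, \exp\!\Bigl(\tfrac{2}{d}\, h(X \mid Z)\Bigr)
```

Under this form, driving the conditional entropy $h(X \mid Z)$ up raises the MSE floor of every possible MIA, which is what motivates a conditional entropy maximization objective.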
arXiv Detail & Related papers (2025-03-01T07:15:21Z) - MirrorNet: A TEE-Friendly Framework for Secure On-device DNN Inference [14.08010398777227]
Deep neural network (DNN) models have become prevalent in edge devices for real-time inference.
Existing defense approaches fail to fully safeguard model confidentiality or result in significant latency issues.
This paper presents MirrorNet, which generates a TEE-friendly implementation for any given DNN model to protect the model confidentiality.
For the evaluation, MirrorNet can achieve a 18.6% accuracy gap between authenticated and illegal use, while only introducing 0.99% hardware overhead.
arXiv Detail & Related papers (2023-11-16T01:21:19Z) - Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations [57.20885629270732]
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server.
Traditional perturbation-based methods provide privacy protection while sacrificing the training accuracy.
In this work, we aim at minimizing privacy leakage to the adversary and the degradation of model accuracy at the edge server.
arXiv Detail & Related papers (2022-10-05T13:13:35Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
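The accuracy-for-privacy trade-off this summary describes comes from how noise is calibrated. A minimal sketch of the standard Laplace mechanism illustrates it (this is generic differential privacy, not the paper's Laplacian-smoothing method; the function name and parameters are illustrative):

```python
import math
import random

def laplace_mechanism(value, sensitivity, epsilon, rng=random.Random(0)):
    """Release `value` with Laplace noise of scale sensitivity/epsilon,
    giving epsilon-differential privacy for a query with that L1 sensitivity."""
    scale = sensitivity / epsilon
    # Inverse-CDF sampling of Laplace(0, scale).
    u = rng.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return value + noise

# Smaller epsilon => stronger privacy but noisier (less accurate) releases.
strong_privacy = laplace_mechanism(42.0, sensitivity=1.0, epsilon=0.1)
weak_privacy = laplace_mechanism(42.0, sensitivity=1.0, epsilon=10.0)
```

Because the noise scale grows as epsilon shrinks, a tight privacy budget directly degrades the released statistics, which is the utility cost the summary refers to.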
arXiv Detail & Related papers (2020-05-01T04:28:38Z) - A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework [56.57225686288006]
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
Previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data.
We propose a privacy-preserving-oriented pruning and mobile acceleration framework that does not require the private training dataset.
arXiv Detail & Related papers (2020-03-13T23:52:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.