Practical and Private Hybrid ML Inference with Fully Homomorphic Encryption
- URL: http://arxiv.org/abs/2509.01253v1
- Date: Mon, 01 Sep 2025 08:43:46 GMT
- Title: Practical and Private Hybrid ML Inference with Fully Homomorphic Encryption
- Authors: Sayan Biswas, Philippe Chartier, Akash Dhasade, Tom Jurien, David Kerriou, Anne-Marie Kermarrec, Mohammed Lemou, Franklin Tranie, Martijn de Vos, Milos Vujasinovic
- Abstract summary: Safhire is a hybrid inference framework that executes linear layers under encryption on the server. It supports exact activations and significantly reduces computation. It achieves 1.5X - 10.5X lower inference latency than Orion.
- Score: 0.34953784594970894
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In contemporary cloud-based services, protecting users' sensitive data and ensuring the confidentiality of the server's model are critical. Fully homomorphic encryption (FHE) enables inference directly on encrypted inputs, but its practicality is hindered by expensive bootstrapping and inefficient approximations of non-linear activations. We introduce Safhire, a hybrid inference framework that executes linear layers under encryption on the server while offloading non-linearities to the client in plaintext. This design eliminates bootstrapping, supports exact activations, and significantly reduces computation. To safeguard model confidentiality despite client access to intermediate outputs, Safhire applies randomized shuffling, which obfuscates intermediate values and makes it practically impossible to reconstruct the model. To further reduce latency, Safhire incorporates advanced optimizations such as fast ciphertext packing and partial extraction. Evaluations on multiple standard models and datasets show that Safhire achieves 1.5X - 10.5X lower inference latency than Orion, a state-of-the-art baseline, with manageable communication overhead and comparable accuracy, thereby establishing the practicality of hybrid FHE inference.
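The abstract's core idea, splitting each layer so the server evaluates linear transforms on ciphertexts while the client applies exact activations on shuffled plaintext values, can be illustrated with a plaintext toy sketch. This is not Safhire's implementation: real ciphertext operations, packing, and extraction are omitted, and the two-layer network and all names here are illustrative. The sketch only shows why randomized shuffling is compatible with exact activations: element-wise non-linearities commute with any permutation, so the client can apply a true ReLU without learning the neuron ordering.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy model: two linear layers with a ReLU in between.
W1 = rng.standard_normal((8, 4))
W2 = rng.standard_normal((3, 8))
x = rng.standard_normal(4)

# --- Server side (under FHE in the actual scheme) ---
z = W1 @ x                        # linear layer on the (encrypted) input
perm = rng.permutation(z.size)    # randomized shuffling of intermediates
shuffled = z[perm]                # sent to the client

# --- Client side (plaintext) ---
# An element-wise activation commutes with permutation, so the client
# computes the exact ReLU without seeing the true neuron order.
activated = np.maximum(shuffled, 0.0)

# --- Server side: invert the shuffle and continue the network ---
inv = np.argsort(perm)
y = W2 @ activated[inv]

# Sanity check: identical to running the network without the round-trip.
y_ref = W2 @ np.maximum(W1 @ x, 0.0)
assert np.allclose(y, y_ref)
```

Because the shuffle is undone server-side before the next linear layer, accuracy is unaffected; the permutation only obfuscates what the client observes.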
Related papers
- Your Inference Request Will Become a Black Box: Confidential Inference for Cloud-based Large Language Models [39.390624817461905]
Talaria is a confidential inference framework that partitions the Large Language Models pipeline to protect client data. Talaria executes sensitive, weight-independent operations within a client-controlled Confidential Virtual Machine. Talaria can defend against state-of-the-art token inference attacks, reducing token reconstruction accuracy from over 97.5% to an average of 1.34%.
arXiv Detail & Related papers (2026-02-27T06:37:07Z) - Anchored Decoding: Provably Reducing Copyright Risk for Any Language Model [99.16364381244445]
Modern language models (LMs) tend to memorize portions of their training data and emit verbatim spans. We propose Anchored Decoding, a plug-and-play inference-time method for suppressing verbatim copying. We evaluate our methods across six model pairs on long-form evaluations of copyright risk and utility.
arXiv Detail & Related papers (2026-02-06T19:00:14Z) - NeuroFilter: Privacy Guardrails for Conversational LLM Agents [50.75206727081996]
This work addresses the computational challenge of enforcing privacy for agentic Large Language Models (LLMs). NeuroFilter is a guardrail framework that operationalizes contextual integrity by mapping norm violations to simple directions in the model's activation space. A comprehensive evaluation across over 150,000 interactions, covering models from 7B to 70B parameters, illustrates the strong performance of NeuroFilter.
arXiv Detail & Related papers (2026-01-21T05:16:50Z) - MirageNet: A Secure, Efficient, and Scalable On-Device Model Protection in Heterogeneous TEE and GPU System [8.936421130707204]
Given high model training costs and user experience requirements, balancing model privacy and low runtime overhead is critical. We propose ConvShatter, a novel obfuscation scheme that achieves low latency and high accuracy while preserving model confidentiality and integrity.
arXiv Detail & Related papers (2026-01-20T10:39:09Z) - AmbShield: Enhancing Physical Layer Security with Ambient Backscatter Devices against Eavesdroppers [69.56534335936534]
AmbShield is an AmBD-assisted PLS scheme that leverages naturally distributed AmBDs to simultaneously strengthen the legitimate channel and degrade the eavesdroppers'. In AmbShield, AmBDs are exploited as friendly jammers that randomly backscatter to create interference at eavesdroppers, and as passive relays that backscatter the desired signal to enhance the capacity of legitimate devices.
arXiv Detail & Related papers (2026-01-14T20:56:50Z) - Design and Optimization of Cloud Native Homomorphic Encryption Workflows for Privacy-Preserving ML Inference [0.0]
Homomorphic Encryption (HE) has emerged as a compelling technique that enables cryptographic computation on encrypted data. The integration of HE within large scale cloud native pipelines remains constrained by high computational overhead, orchestration complexity, and model compatibility issues. This paper presents a systematic framework for the design and optimization of cloud native homomorphic encryption workflows that support privacy-preserving ML inference.
arXiv Detail & Related papers (2025-10-28T15:13:32Z) - SDGO: Self-Discrimination-Guided Optimization for Consistent Safety in Large Language Models [48.62892618209157]
Large Language Models (LLMs) excel at various natural language processing tasks but remain vulnerable to jailbreaking attacks. This paper explores aligning the model's inherent discrimination and generation capabilities. Our method does not require any additional annotated data or external models during the training phase.
arXiv Detail & Related papers (2025-08-21T15:26:09Z) - Efficient Privacy-Preserving Cross-Silo Federated Learning with Multi-Key Homomorphic Encryption [7.332140296779856]
Federated Learning (FL) is susceptible to privacy attacks. Recent studies combined Multi-Key Homomorphic Encryption (MKHE) and FL. We propose MASER, an efficient MKHE-based Privacy-Preserving FL framework.
arXiv Detail & Related papers (2025-05-20T18:08:15Z) - Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems [89.35169042718739]
Collaborative inference enables end users to leverage powerful deep learning models without exposure of sensitive raw data to cloud servers. Recent studies have revealed that these intermediate features may not sufficiently preserve privacy, as information can be leaked and raw data can be reconstructed via model inversion attacks (MIAs). This work first theoretically proves that the conditional entropy of inputs given intermediate features provides a guaranteed lower bound on the reconstruction mean square error (MSE) under any MIA. Then, we derive a differentiable and solvable measure for bounding this conditional entropy based on the Gaussian mixture estimation and propose a conditional entropy algorithm to enhance the inversion robustness.
arXiv Detail & Related papers (2025-03-01T07:15:21Z) - A Selective Homomorphic Encryption Approach for Faster Privacy-Preserving Federated Learning [2.942616054218564]
Federated learning (FL) has come forward as a critical approach for privacy-preserving machine learning in healthcare. Current security implementations for these systems face a fundamental trade-off: rigorous cryptographic protections impose prohibitive computational overhead. We present Fast and Secure Federated Learning, a novel approach that strategically combines selective homomorphic encryption, differential privacy, and bitwise scrambling to achieve robust security.
arXiv Detail & Related papers (2025-01-22T14:37:44Z) - FheFL: Fully Homomorphic Encryption Friendly Privacy-Preserving Federated Learning with Byzantine Users [19.209830150036254]
The federated learning (FL) technique was developed to mitigate data privacy issues in the traditional machine learning paradigm.
Next-generation FL architectures proposed encryption and anonymization techniques to protect the model updates from the server.
This paper proposes a novel FL algorithm based on a fully homomorphic encryption (FHE) scheme.
arXiv Detail & Related papers (2023-06-08T11:20:00Z) - Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations [57.20885629270732]
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server.
Traditional perturbation-based methods provide privacy protection while sacrificing the training accuracy.
In this work, we aim at minimizing privacy leakage to the adversary and the degradation of model accuracy at the edge server.
arXiv Detail & Related papers (2022-10-05T13:13:35Z) - THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption [112.02441503951297]
Privacy-preserving inference of transformer models is in demand among cloud service users.
We introduce THE-X, an approximation approach for transformers, which enables privacy-preserving inference of pre-trained models.
arXiv Detail & Related papers (2022-06-01T03:49:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.