$\Lambda$-Split: A Privacy-Preserving Split Computing Framework for
Cloud-Powered Generative AI
- URL: http://arxiv.org/abs/2310.14651v1
- Date: Mon, 23 Oct 2023 07:44:04 GMT
- Title: $\Lambda$-Split: A Privacy-Preserving Split Computing Framework for
Cloud-Powered Generative AI
- Authors: Shoki Ohta, Takayuki Nishio
- Abstract summary: We introduce $\Lambda$-Split, a split computing framework to facilitate computational offloading.
In $Lambda$-Split, a generative model, usually a deep neural network (DNN), is partitioned into three sub-models and distributed across the user's local device and a cloud server.
This architecture ensures that only the hidden layer outputs are transmitted, thereby preventing the external transmission of privacy-sensitive raw input and output data.
- Score: 3.363904632882723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the wake of the burgeoning expansion of generative artificial intelligence
(AI) services, the computational demands inherent to these technologies
frequently necessitate cloud-powered computational offloading, particularly for
resource-constrained mobile devices. These services commonly employ prompts to
steer the generative process, and both the prompts and the resultant content,
such as text and images, may harbor privacy-sensitive or confidential
information, thereby elevating security and privacy risks. To mitigate these
concerns, we introduce $\Lambda$-Split, a split computing framework to
facilitate computational offloading while simultaneously fortifying data
privacy against risks such as eavesdropping and unauthorized access. In
$\Lambda$-Split, a generative model, usually a deep neural network (DNN), is
partitioned into three sub-models and distributed across the user's local
device and a cloud server: the input-side and output-side sub-models are
allocated to the local device, while the intermediate, computationally intensive
sub-model resides on the cloud server. This architecture ensures that only the
hidden layer outputs are transmitted, thereby preventing the external
transmission of privacy-sensitive raw input and output data. Given the
black-box nature of DNNs, estimating the original input or output from
intercepted hidden layer outputs poses a significant challenge for malicious
eavesdroppers. Moreover, $\Lambda$-Split is orthogonal to traditional
encryption-based security mechanisms, offering enhanced security when deployed
in conjunction. We empirically validate the efficacy of the $\Lambda$-Split
framework using Llama 2 and Stable Diffusion XL, representative large language
and diffusion models developed by Meta and Stability AI, respectively. Our
$\Lambda$-Split implementation is publicly accessible at
https://github.com/nishio-laboratory/lambda_split.
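To make the three-way split concrete, here is a minimal PyTorch sketch of the $\Lambda$-Split pattern on a toy model; the layer sizes, split indices, and the single-process "transport" are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
# Minimal sketch of the Lambda-Split three-way partition on a toy model.
# Layer sizes and split indices are illustrative assumptions; the authors'
# actual implementation is at
# https://github.com/nishio-laboratory/lambda_split
import torch
import torch.nn as nn

# Stand-in for a generative model (in the paper: Llama 2 / Stable Diffusion XL).
layers = [
    nn.Linear(64, 256), nn.ReLU(),    # cheap input-side layers
    nn.Linear(256, 256), nn.ReLU(),   # heavy middle block
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 64),               # cheap output-side layer
]

head = nn.Sequential(*layers[:2])     # local device: sees the raw input
middle = nn.Sequential(*layers[2:6])  # cloud server: sees only hidden states
tail = nn.Sequential(*layers[6:])     # local device: produces the raw output

@torch.no_grad()
def run_inference(x: torch.Tensor) -> torch.Tensor:
    hidden_up = head(x)              # only this tensor crosses the uplink
    hidden_down = middle(hidden_up)  # cloud computes on activations alone
    return tail(hidden_down)         # final output never leaves the device

output = run_inference(torch.randn(1, 64))  # raw prompt/input stays local
```

Under this split, an eavesdropper on either link observes only intermediate activations of an unknown sub-model, so recovering the prompt or the generated content amounts to inverting a black-box DNN.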
Related papers
- Secure Multiparty Generative AI [1.4433703131122861] (arXiv, 2024-09-27)
As usage of generative AI tools skyrockets, the amount of sensitive information being exposed to these models is alarming.
In our research, we present a secure and private methodology for generative artificial intelligence that does not expose sensitive data or models to third-party AI providers.
- CURE: Privacy-Preserving Split Learning Done Right [1.388112207221632] (arXiv, 2024-07-12)
Homomorphic encryption (HE)-based solutions exist for split learning but often impose prohibitive computational burdens.
CURE is a novel system that encrypts only the server side of the model and the data.
We demonstrate that CURE can achieve accuracy similar to plaintext SL while being 16x more efficient in runtime.
- Privacy preserving layer partitioning for Deep Neural Network models [0.21470800327528838] (arXiv, 2024-04-11)
Trusted Execution Environments (TEEs) can introduce significant performance overhead due to additional layers of encryption, decryption, security and integrity checks.
We introduce a layer partitioning technique that offloads computations to the GPU.
We conduct experiments to demonstrate the effectiveness of our approach in protecting against input reconstruction attacks developed using a trained conditional Generative Adversarial Network (c-GAN).
- HasTEE+: Confidential Cloud Computing and Analytics with Haskell [50.994023665559496] (arXiv, 2024-01-17)
Confidential computing enables the protection of confidential code and data in a co-tenanted cloud deployment using specialized hardware isolation units called Trusted Execution Environments (TEEs).
TEEs offer low-level C/C++-based toolchains that are susceptible to inherent memory safety vulnerabilities and lack language constructs to monitor explicit and implicit information-flow leaks.
We address the above with HasTEE+, a domain-specific language (DSL) embedded in Haskell that enables programming TEEs in a high-level language with strong type safety.
- Split-and-Denoise: Protect large language model inference with local differential privacy [2.572566198588905] (arXiv, 2023-10-13)
Split-N-Denoise (SnD) is a private inference framework that splits the model to execute the token embedding layer on the client side at minimal computational cost.
We show SnD's effectiveness in optimizing the privacy-utility tradeoff across various LLM architectures and diverse downstream tasks.
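The split-and-denoise pattern can be sketched roughly as follows; the Laplace mechanism, the noise scale, and the toy transformer backbone are placeholder assumptions, not the paper's exact design.

```python
# Hypothetical sketch of the Split-N-Denoise pattern: the client runs only
# the embedding layer, perturbs the embeddings for local differential
# privacy, and the server runs the heavy backbone on noisy inputs.
import torch
import torch.nn as nn

vocab_size, dim = 32000, 512
client_embedding = nn.Embedding(vocab_size, dim)  # cheap, runs on device
server_backbone = nn.TransformerEncoder(          # heavy, runs in the cloud
    nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
    num_layers=4,
)

def client_side(token_ids: torch.Tensor, noise_scale: float = 0.1) -> torch.Tensor:
    """Embed tokens locally and add LDP noise before transmission."""
    emb = client_embedding(token_ids)
    noise = torch.distributions.Laplace(0.0, noise_scale).sample(emb.shape)
    return emb + noise  # only noisy embeddings leave the device

with torch.no_grad():
    noisy = client_side(torch.randint(0, vocab_size, (1, 16)))
    hidden = server_backbone(noisy)  # server never sees the raw tokens
```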
- Federated Nearest Neighbor Machine Translation [66.8765098651988] (arXiv, 2023-02-23)
In this paper, we propose a novel federated nearest neighbor (FedNN) machine translation framework.
FedNN leverages one-round memorization-based interaction to share knowledge across different clients.
Experiments show that FedNN significantly reduces computational and communication costs compared with FedAvg.
- Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations [57.20885629270732] (arXiv, 2022-10-05)
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server.
Traditional perturbation-based methods provide privacy protection at the cost of training accuracy.
In this work, we aim at minimizing privacy leakage to the adversary and the degradation of model accuracy at the edge server.
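The core idea, user-side perturbations that are correlated so they cancel in the over-the-air sum, can be illustrated with a toy NumPy sketch; the zero-sum construction below is a deliberate simplification of the paper's scheme.

```python
# Illustrative sketch of correlated additive perturbations for OtA
# aggregation: each user's transmission is masked by noise, but the noise
# vectors are constructed to sum to zero across users, so the aggregated
# gradient at the server stays exact while any single uplink is obscured.
import numpy as np

rng = np.random.default_rng(0)
num_users, dim = 5, 8
gradients = rng.normal(size=(num_users, dim))       # true local gradients

raw_noise = rng.normal(scale=5.0, size=(num_users, dim))
perturbations = raw_noise - raw_noise.mean(axis=0)  # correlated: columns sum to zero

transmitted = gradients + perturbations             # what each user sends
aggregate = transmitted.sum(axis=0)                 # OtA superposition at the server

assert np.allclose(aggregate, gradients.sum(axis=0))  # noise cancels exactly
```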
- THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption [112.02441503951297] (arXiv, 2022-06-01)
Privacy-preserving inference of transformer models is in demand among cloud service users.
We introduce $\textit{THE-X}$, an approximation approach for transformers that enables privacy-preserving inference of pre-trained models.
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056] (arXiv, 2022-05-23)
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
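A toy sketch of the selection rule: given per-layer compute times and activation sizes, pick the split index that minimizes end-to-end latency at the currently measured data rate. All numbers below are invented for illustration; the paper's optimization is more involved.

```python
# Pick the split k (layers 0..k-1 run on-device, the rest on the server)
# that minimizes device compute + transfer + server compute latency.
def best_split(device_ms, server_ms, tx_bytes, rate_bps):
    return min(
        range(len(device_ms) + 1),
        key=lambda k: sum(device_ms[:k])                  # on-device compute
                      + tx_bytes[k] * 8 / rate_bps * 1e3  # uplink transfer (ms)
                      + sum(server_ms[k:]),               # server compute
    )

device_ms = [5.0, 8.0, 20.0, 20.0]       # per-layer latency on the mobile device
server_ms = [0.5, 0.8, 2.0, 2.0]         # per-layer latency on the edge server
tx_bytes = [4096, 2048, 8192, 8192, 0]   # bytes sent at split k (k=0: raw input)

# Re-select the split point whenever the measured channel rate changes:
for rate_bps in (1e6, 100e6):
    k = best_split(device_ms, server_ms, tx_bytes, rate_bps)
    print(f"rate={rate_bps:.0e} bps -> split after layer {k}")
```

On the slow channel this picks a split that keeps the bulky activations off the air; on the fast channel it offloads more aggressively, which is exactly the rate-dependent behavior the paper exploits.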
- Serdab: An IoT Framework for Partitioning Neural Networks Computation across Multiple Enclaves [8.550865312110911] (arXiv, 2020-05-12)
Serdab is a distributed orchestration framework for deploying deep neural networks across multiple secure enclaves.
Our partitioning strategy achieves up to 4.7x speedup compared to executing the entire neural network in one enclave.
- CryptoSPN: Privacy-preserving Sum-Product Network Inference [84.88362774693914] (arXiv, 2020-02-03)
We present a framework for privacy-preserving inference of sum-product networks (SPNs).
CryptoSPN achieves highly efficient and accurate inference in the order of seconds for medium-sized SPNs.