$\Lambda$-Split: A Privacy-Preserving Split Computing Framework for
Cloud-Powered Generative AI
- URL: http://arxiv.org/abs/2310.14651v1
- Date: Mon, 23 Oct 2023 07:44:04 GMT
- Title: $\Lambda$-Split: A Privacy-Preserving Split Computing Framework for
Cloud-Powered Generative AI
- Authors: Shoki Ohta, Takayuki Nishio
- Abstract summary: We introduce $\Lambda$-Split, a split computing framework to facilitate computational offloading.
In $Lambda$-Split, a generative model, usually a deep neural network (DNN), is partitioned into three sub-models and distributed across the user's local device and a cloud server.
This architecture ensures that only the hidden layer outputs are transmitted, thereby preventing the external transmission of privacy-sensitive raw input and output data.
- Score: 3.363904632882723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the wake of the burgeoning expansion of generative artificial intelligence
(AI) services, the computational demands inherent to these technologies
frequently necessitate cloud-powered computational offloading, particularly for
resource-constrained mobile devices. These services commonly employ prompts to
steer the generative process, and both the prompts and the resultant content,
such as text and images, may harbor privacy-sensitive or confidential
information, thereby elevating security and privacy risks. To mitigate these
concerns, we introduce $\Lambda$-Split, a split computing framework to
facilitate computational offloading while simultaneously fortifying data
privacy against risks such as eavesdropping and unauthorized access. In
$\Lambda$-Split, a generative model, usually a deep neural network (DNN), is
partitioned into three sub-models and distributed across the user's local
device and a cloud server: the input-side and output-side sub-models are
allocated to the local device, while the intermediate, computationally intensive
sub-model resides on the cloud server. This architecture ensures that only the
hidden layer outputs are transmitted, thereby preventing the external
transmission of privacy-sensitive raw input and output data. Given the
black-box nature of DNNs, estimating the original input or output from
intercepted hidden layer outputs poses a significant challenge for malicious
eavesdroppers. Moreover, $\Lambda$-Split is orthogonal to traditional
encryption-based security mechanisms, offering enhanced security when deployed
in conjunction. We empirically validate the efficacy of the $\Lambda$-Split
framework using Llama 2 and Stable Diffusion XL, representative large language
and diffusion models developed by Meta and Stability AI, respectively. Our
$\Lambda$-Split implementation is publicly accessible at
https://github.com/nishio-laboratory/lambda_split.
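To make the three-way split concrete, here is a minimal PyTorch sketch of the $\Lambda$-Split pattern on a toy model; the layer sizes, split indices, and the single-process "transport" are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
# Minimal sketch of the Lambda-Split three-way partition on a toy model.
# Layer sizes and split indices are illustrative assumptions; the authors'
# actual implementation is at
# https://github.com/nishio-laboratory/lambda_split
import torch
import torch.nn as nn

# Stand-in for a generative model (in the paper: Llama 2 / Stable Diffusion XL).
layers = [
    nn.Linear(64, 256), nn.ReLU(),    # cheap input-side layers
    nn.Linear(256, 256), nn.ReLU(),   # heavy middle block
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 64),               # cheap output-side layer
]

head = nn.Sequential(*layers[:2])     # local device: sees the raw input
middle = nn.Sequential(*layers[2:6])  # cloud server: sees only hidden states
tail = nn.Sequential(*layers[6:])     # local device: produces the raw output

@torch.no_grad()
def run_inference(x: torch.Tensor) -> torch.Tensor:
    hidden_up = head(x)              # only this tensor crosses the uplink
    hidden_down = middle(hidden_up)  # cloud computes on activations alone
    return tail(hidden_down)         # final output never leaves the device

output = run_inference(torch.randn(1, 64))  # raw prompt/input stays local
```

Under this split, an eavesdropper on either link observes only intermediate activations of an unknown sub-model, so recovering the prompt or the generated content amounts to inverting a black-box DNN.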
Related papers
- Secure Multiparty Generative AI [1.4433703131122861] (arXiv, 2024-09-27)
As usage of generative AI tools skyrockets, the amount of sensitive information being exposed to these models is alarming.
In our research, we present a secure and private methodology for generative artificial intelligence that does not expose sensitive data or models to third-party AI providers.
- CURE: Privacy-Preserving Split Learning Done Right [1.388112207221632] (arXiv, 2024-07-12)
Homomorphic encryption (HE)-based solutions exist for split learning but often impose prohibitive computational burdens.
CURE is a novel system that encrypts only the server side of the model and the data.
We demonstrate that CURE can achieve accuracy similar to plaintext SL while being 16x more efficient in runtime.
- Privacy preserving layer partitioning for Deep Neural Network models [0.21470800327528838] (arXiv, 2024-04-11)
Trusted Execution Environments (TEEs) can introduce significant performance overhead due to additional layers of encryption, decryption, security and integrity checks.
We introduce a layer partitioning technique that offloads computations to the GPU.
We conduct experiments to demonstrate the effectiveness of our approach in protecting against input reconstruction attacks developed using a trained conditional Generative Adversarial Network (c-GAN).
- HasTEE+: Confidential Cloud Computing and Analytics with Haskell [50.994023665559496] (arXiv, 2024-01-17)
Confidential computing enables the protection of confidential code and data in a co-tenanted cloud deployment using specialized hardware isolation units called Trusted Execution Environments (TEEs).
TEEs offer low-level C/C++-based toolchains that are susceptible to inherent memory safety vulnerabilities and lack language constructs to monitor explicit and implicit information-flow leaks.
We address the above with HasTEE+, a domain-specific language (DSL) embedded in Haskell that enables programming TEEs in a high-level language with strong type safety.
- Split-and-Denoise: Protect large language model inference with local differential privacy [2.572566198588905] (arXiv, 2023-10-13)
Split-N-Denoise (SnD) is a private inference framework that splits the model to execute the token embedding layer on the client side at minimal computational cost.
We show SnD's effectiveness in optimizing the privacy-utility tradeoff across various LLM architectures and diverse downstream tasks.
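The split-and-denoise pattern can be sketched roughly as follows; the Laplace mechanism, the noise scale, and the toy transformer backbone are placeholder assumptions, not the paper's exact design.

```python
# Hypothetical sketch of the Split-N-Denoise pattern: the client runs only
# the embedding layer, perturbs the embeddings for local differential
# privacy, and the server runs the heavy backbone on noisy inputs.
import torch
import torch.nn as nn

vocab_size, dim = 32000, 512
client_embedding = nn.Embedding(vocab_size, dim)  # cheap, runs on device
server_backbone = nn.TransformerEncoder(          # heavy, runs in the cloud
    nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
    num_layers=4,
)

def client_side(token_ids: torch.Tensor, noise_scale: float = 0.1) -> torch.Tensor:
    """Embed tokens locally and add LDP noise before transmission."""
    emb = client_embedding(token_ids)
    noise = torch.distributions.Laplace(0.0, noise_scale).sample(emb.shape)
    return emb + noise  # only noisy embeddings leave the device

with torch.no_grad():
    noisy = client_side(torch.randint(0, vocab_size, (1, 16)))
    hidden = server_backbone(noisy)  # server never sees the raw tokens
```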
- Federated Nearest Neighbor Machine Translation [66.8765098651988] (arXiv, 2023-02-23)
In this paper, we propose a novel federated nearest neighbor (FedNN) machine translation framework.
FedNN leverages one-round memorization-based interaction to share knowledge across different clients.
Experiments show that FedNN significantly reduces computational and communication costs compared with FedAvg.
- Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations [57.20885629270732] (arXiv, 2022-10-05)
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server.
Traditional perturbation-based methods provide privacy protection at the cost of training accuracy.
In this work, we aim at minimizing privacy leakage to the adversary and the degradation of model accuracy at the edge server.
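The core idea, user-side perturbations that are correlated so they cancel in the over-the-air sum, can be illustrated with a toy NumPy sketch; the zero-sum construction below is a deliberate simplification of the paper's scheme.

```python
# Illustrative sketch of correlated additive perturbations for OtA
# aggregation: each user's transmission is masked by noise, but the noise
# vectors are constructed to sum to zero across users, so the aggregated
# gradient at the server stays exact while any single uplink is obscured.
import numpy as np

rng = np.random.default_rng(0)
num_users, dim = 5, 8
gradients = rng.normal(size=(num_users, dim))       # true local gradients

raw_noise = rng.normal(scale=5.0, size=(num_users, dim))
perturbations = raw_noise - raw_noise.mean(axis=0)  # correlated: columns sum to zero

transmitted = gradients + perturbations             # what each user sends
aggregate = transmitted.sum(axis=0)                 # OtA superposition at the server

assert np.allclose(aggregate, gradients.sum(axis=0))  # noise cancels exactly
```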
- THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption [112.02441503951297] (arXiv, 2022-06-01)
Privacy-preserving inference of transformer models is in demand among cloud service users.
We introduce $\textit{THE-X}$, an approximation approach for transformers that enables privacy-preserving inference of pre-trained models.
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056] (arXiv, 2022-05-23)
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
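A toy sketch of the selection rule: given per-layer compute times and activation sizes, pick the split index that minimizes end-to-end latency at the currently measured data rate. All numbers below are invented for illustration; the paper's optimization is more involved.

```python
# Pick the split k (layers 0..k-1 run on-device, the rest on the server)
# that minimizes device compute + transfer + server compute latency.
def best_split(device_ms, server_ms, tx_bytes, rate_bps):
    return min(
        range(len(device_ms) + 1),
        key=lambda k: sum(device_ms[:k])                  # on-device compute
                      + tx_bytes[k] * 8 / rate_bps * 1e3  # uplink transfer (ms)
                      + sum(server_ms[k:]),               # server compute
    )

device_ms = [5.0, 8.0, 20.0, 20.0]       # per-layer latency on the mobile device
server_ms = [0.5, 0.8, 2.0, 2.0]         # per-layer latency on the edge server
tx_bytes = [4096, 2048, 8192, 8192, 0]   # bytes sent at split k (k=0: raw input)

# Re-select the split point whenever the measured channel rate changes:
for rate_bps in (1e6, 100e6):
    k = best_split(device_ms, server_ms, tx_bytes, rate_bps)
    print(f"rate={rate_bps:.0e} bps -> split after layer {k}")
```

On the slow channel this picks a split that keeps the bulky activations off the air; on the fast channel it offloads more aggressively, which is exactly the rate-dependent behavior the paper exploits.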
- Serdab: An IoT Framework for Partitioning Neural Networks Computation across Multiple Enclaves [8.550865312110911] (arXiv, 2020-05-12)
Serdab is a distributed orchestration framework for deploying deep neural networks across multiple secure enclaves.
Our partitioning strategy achieves up to 4.7x speedup compared to executing the entire neural network in one enclave.
- CryptoSPN: Privacy-preserving Sum-Product Network Inference [84.88362774693914] (arXiv, 2020-02-03)
We present a framework for privacy-preserving inference of sum-product networks (SPNs).
CryptoSPN achieves highly efficient and accurate inference in the order of seconds for medium-sized SPNs.