Secure Multiparty Generative AI
- URL: http://arxiv.org/abs/2409.19120v1
- Date: Fri, 27 Sep 2024 19:55:49 GMT
- Title: Secure Multiparty Generative AI
- Authors: Manil Shrestha, Yashodha Ravichandran, Edward Kim
- Abstract summary: As usage of generative AI tools skyrockets, the amount of sensitive information being exposed to these models is alarming.
In our research, we present a secure and private methodology for generative artificial intelligence that does not expose sensitive data or models to third-party AI providers.
- Score: 1.4433703131122861
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As usage of generative AI tools skyrockets, the amount of sensitive information being exposed to these models and centralized model providers is alarming. For example, confidential source code from Samsung was leaked after being submitted as a text prompt to ChatGPT. A growing number of companies (Apple, Verizon, JPMorgan Chase, etc.) are restricting the use of LLMs due to data leakage and confidentiality concerns. Also, an increasing number of centralized generative model providers are restricting, filtering, aligning, or censoring what their models can be used for. Midjourney and RunwayML, two of the major image generation platforms, restrict the prompts to their system via prompt filtering. Prompts naming certain political figures are blocked from image generation, as are words associated with women's health care, rights, and abortion. In our research, we present a secure and private methodology for generative artificial intelligence that does not expose sensitive data or models to third-party AI providers. Our work modifies the key building block of modern generative AI algorithms, e.g. the transformer, and introduces confidential and verifiable multiparty computation in a decentralized network to 1) preserve the privacy of the user input and obfuscate the output of the model, and 2) keep the model itself private. Additionally, the sharding process reduces the computational burden on any one node, enabling large generative AI workloads to be distributed across multiple, smaller nodes. We show that as long as there exists one honest node in the decentralized computation, security is maintained. We also show that inference still succeeds as long as a majority of the nodes in the computation complete successfully. Thus, our method offers both secure and verifiable computation in a decentralized network.
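The paper's full protocol is not reproduced here; as a minimal sketch of the underlying idea, the snippet below shows additive secret sharing, a standard MPC building block (function names are ours, not the paper's). Any n-1 shares are statistically independent of the input, so a single honest node that withholds its share suffices to keep the input private. The paper's verifiable computation over sharded transformer blocks is considerably more involved.

```python
import numpy as np

def share(x: np.ndarray, n_nodes: int):
    """Split a tensor into n additive shares; any n-1 shares reveal nothing."""
    rng = np.random.default_rng()
    shares = [rng.normal(size=x.shape) for _ in range(n_nodes - 1)]
    shares.append(x - sum(shares))  # last share makes the shares sum to x
    return shares

def reconstruct(shares):
    return sum(shares)

prompt_embedding = np.random.rand(4, 8)      # stand-in for a user's private input
shares = share(prompt_embedding, n_nodes=3)  # one share per compute node
assert np.allclose(reconstruct(shares), prompt_embedding)
```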
Related papers
- Privacy-Preserving Decentralized AI with Confidential Computing [0.7893328752331561]
This paper addresses privacy protection in decentralized Artificial Intelligence (AI) using Confidential Computing (CC) within the Atoma Network.
CC leverages hardware-based Trusted Execution Environments (TEEs) to provide isolation for processing sensitive data.
We explore how we can integrate TEEs into Atoma's decentralized framework.
arXiv Detail & Related papers (2024-10-17T16:50:48Z)
- JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models [53.83273575102087]
We introduce JAMDEC, a user-controlled, unsupervised inference-time algorithm for authorship obfuscation.
Our approach builds on small language models such as GPT2-XL to avoid disclosing the original content to proprietary LLMs' APIs.
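JAMDEC's actual constraint set and candidate filtering are described in the paper; below is only a rough illustration of inference-time lexically constrained decoding with a small local model via the Hugging Face transformers library (the banned word list is an arbitrary stand-in):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")        # small local model; no API call
lm = AutoModelForCausalLM.from_pretrained("gpt2")

text = "The experiment was conducted in our laboratory over several weeks."
# Ban a few author-characteristic words to push the sampler toward rewordings.
banned = [tok(w, add_special_tokens=False).input_ids
          for w in [" laboratory", " conducted"]]

ids = tok(text, return_tensors="pt").input_ids
out = lm.generate(ids, do_sample=True, top_p=0.9, max_new_tokens=30,
                  bad_words_ids=banned, pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))
```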
arXiv Detail & Related papers (2024-02-13T19:54:29Z)
- $\Lambda$-Split: A Privacy-Preserving Split Computing Framework for Cloud-Powered Generative AI [3.363904632882723]
We introduce $\Lambda$-Split, a split computing framework to facilitate computational offloading.
In $\Lambda$-Split, a generative model, usually a deep neural network (DNN), is partitioned into three sub-models and distributed across the user's local device and a cloud server.
This architecture ensures that only the hidden layer outputs are transmitted, thereby preventing the external transmission of privacy-sensitive raw input and output data.
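A minimal PyTorch sketch of this three-way partitioning (layer sizes and the toy model are arbitrary stand-ins, not the paper's architecture):

```python
import torch
import torch.nn as nn

# A toy model split Lambda-Split style: the first and last sub-models stay
# on the user's device; only hidden activations ever leave it.
blocks = [nn.Sequential(nn.Linear(16, 64), nn.ReLU()),  # local: sees raw input
          nn.Sequential(nn.Linear(64, 64), nn.ReLU()),  # cloud: heavy middle layers
          nn.Linear(64, 16)]                            # local: produces raw output

x = torch.randn(1, 16)   # privacy-sensitive input, never transmitted
h1 = blocks[0](x)        # device -> send hidden activations h1 to cloud
h2 = blocks[1](h1)       # cloud  -> send h2 back
y = blocks[2](h2)        # device: final output stays local
```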
arXiv Detail & Related papers (2023-10-23T07:44:04Z)
- Differentially Private Secure Multiplication: Hiding Information in the Rubble of Noise [7.767656876470667]
We consider the problem of private distributed multi-party multiplication.
It is well-established that Shamir secret-sharing coding strategies can enable perfect information-theoretic privacy in distributed computation.
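The paper's contribution is combining such coding with differentially private noise for multiplication; the sketch below shows only the Shamir secret-sharing primitive it builds on, in plain Python:

```python
import random

PRIME = 2**61 - 1  # field modulus; all arithmetic is mod this prime

def shamir_share(secret: int, n: int, t: int):
    """Split `secret` into n shares; any t reconstruct it, t-1 reveal nothing."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    return [(i, sum(c * pow(i, k, PRIME) for k, c in enumerate(coeffs)) % PRIME)
            for i in range(1, n + 1)]

def shamir_reconstruct(shares):
    """Lagrange interpolation of the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = shamir_share(42, n=5, t=3)
assert shamir_reconstruct(shares[:3]) == 42  # any 3 of the 5 shares suffice
```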
arXiv Detail & Related papers (2023-09-28T02:13:13Z)
- BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models [54.19289900203071]
The rise in popularity of text-to-image generative artificial intelligence has attracted widespread public interest.
We demonstrate that this technology can be attacked to generate content that subtly manipulates its users.
We propose a Backdoor Attack on text-to-image Generative Models (BAGM)
Our attack is the first to target three popular text-to-image generative models across three stages of the generative process.
arXiv Detail & Related papers (2023-07-31T08:34:24Z)
- HE-MAN -- Homomorphically Encrypted MAchine learning with oNnx models [0.23624125155742057]
Fully homomorphic encryption (FHE) is a promising technique for enabling individuals to use ML services without giving up privacy.
We introduce HE-MAN, an open-source machine learning toolset for privacy preserving inference with ONNX models and homomorphically encrypted data.
Compared to prior work, HE-MAN supports a broad range of ML models in ONNX format out of the box without sacrificing accuracy.
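HE-MAN itself targets FHE over ONNX models; as a much simpler illustration of computing on ciphertexts, the snippet below uses the python-paillier library (phe), which is only additively homomorphic, to evaluate a linear model on encrypted features:

```python
# pip install phe  (python-paillier, an additively homomorphic scheme)
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# A client encrypts its features; the server computes a linear score blindly.
weights, bias = [0.8, -1.5, 0.3], 0.2
enc_features = [public_key.encrypt(v) for v in [1.0, 2.0, 3.0]]

enc_score = public_key.encrypt(bias)
for w, x in zip(weights, enc_features):
    enc_score += x * w  # ciphertext-plaintext ops; server never sees the data

print(private_key.decrypt(enc_score))  # 0.8*1 - 1.5*2 + 0.3*3 + 0.2 = -1.1
```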
arXiv Detail & Related papers (2023-02-16T12:37:14Z)
- Collaborative Unsupervised Visual Representation Learning from Decentralized Data [34.06624704343615]
We propose a novel federated unsupervised learning framework, FedU.
In this framework, each party trains models from unlabeled data independently using contrastive learning with an online network and a target network.
FedU preserves data privacy as each party only has access to its own raw data.
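A minimal sketch of the online/target-network pair each party maintains (BYOL-style; FedU's divergence-aware federated aggregation is omitted, and the layer sizes are arbitrary):

```python
import copy
import torch
import torch.nn as nn

# The online network is trained by contrastive losses; the target network
# trails it via an exponential moving average of its weights.
online = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 16))
target = copy.deepcopy(online)
for p in target.parameters():
    p.requires_grad_(False)  # the target is never updated by gradients

@torch.no_grad()
def ema_update(online, target, momentum=0.99):
    for po, pt in zip(online.parameters(), target.parameters()):
        pt.mul_(momentum).add_(po, alpha=1 - momentum)

ema_update(online, target)  # called after each local training step
```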
arXiv Detail & Related papers (2021-08-14T08:34:11Z)
- NeuraCrypt: Hiding Private Health Data via Random Neural Networks for Public Training [64.54200987493573]
We propose NeuraCrypt, a private encoding scheme based on random deep neural networks.
NeuraCrypt encodes raw patient data using a randomly constructed neural network known only to the data-owner.
We show that NeuraCrypt achieves competitive accuracy to non-private baselines on a variety of x-ray tasks.
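A simplified sketch of the encoding idea (the real NeuraCrypt uses patch-wise convolutions and position shuffling; the architecture below is an arbitrary stand-in):

```python
import torch
import torch.nn as nn

# A NeuraCrypt-style encoder: a randomly initialized network, known only to
# the data owner, maps raw patient data to encodings safe to share publicly.
torch.manual_seed(1234)  # stands in for the owner's secret: the random weights
encoder = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 512))
for p in encoder.parameters():
    p.requires_grad_(False)  # the encoder is fixed, never trained or released

raw_xray_patch = torch.randn(1, 256)  # stand-in for private pixel data
encoded = encoder(raw_xray_patch)     # only this encoding leaves the hospital
```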
arXiv Detail & Related papers (2021-06-04T13:42:21Z)
- Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including data from applications with high security requirements, is often collected and stored on multiple heterogeneous devices, such as mobile devices, drones, and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
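The paper's coded, stochastic mini-batch variant adds redundancy against stragglers; the sketch below shows only the vanilla consensus-ADMM loop it builds on, for distributed least squares:

```python
import numpy as np

# Each edge node i holds (A_i, b_i) locally and exchanges only its parameter
# estimate with the consensus step, never its raw data.
rng = np.random.default_rng(0)
nodes = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(4)]
rho, d = 1.0, 5
x = [np.zeros(d) for _ in nodes]  # local estimates
u = [np.zeros(d) for _ in nodes]  # scaled dual variables
z = np.zeros(d)                   # global consensus variable

for _ in range(50):
    for i, (A, b) in enumerate(nodes):  # local solves, parallel across nodes
        x[i] = np.linalg.solve(A.T @ A + rho * np.eye(d),
                               A.T @ b + rho * (z - u[i]))
    z = np.mean([x[i] + u[i] for i in range(len(nodes))], axis=0)
    u = [u[i] + x[i] - z for i in range(len(nodes))]
```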
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
- Faster Secure Data Mining via Distributed Homomorphic Encryption [108.77460689459247]
Homomorphic Encryption (HE) has been receiving increasing attention for its ability to perform computations over encrypted data.
We propose a novel, general distributed HE-based data mining framework as a step toward solving the scaling problem.
We verify the efficiency and effectiveness of our new framework by testing over various data mining algorithms and benchmark data-sets.
arXiv Detail & Related papers (2020-06-17T18:14:30Z)
- InfoScrub: Towards Attribute Privacy by Targeted Obfuscation [77.49428268918703]
We study techniques that allow individuals to limit the private information leaked in visual data.
We tackle this problem in a novel image obfuscation framework.
We find our approach generates obfuscated images faithful to the original input images, and additionally increases uncertainty by 6.2$\times$ (or up to 0.85 bits) over the non-obfuscated counterparts.
arXiv Detail & Related papers (2020-05-20T19:48:04Z)