Safeguarding LLM Embeddings in End-Cloud Collaboration via Entropy-Driven Perturbation
- URL: http://arxiv.org/abs/2503.12896v1
- Date: Mon, 17 Mar 2025 07:58:05 GMT
- Title: Safeguarding LLM Embeddings in End-Cloud Collaboration via Entropy-Driven Perturbation
- Authors: Shuaifan Jin, Xiaoyi Pang, Zhibo Wang, He Wang, Jiacheng Du, Jiahui Hu, Kui Ren,
- Abstract summary: EntroGuard is an entropy-driven embedding privacy protection method. It can protect the privacy of text embeddings while maintaining retrieval accuracy during end-cloud collaboration.
- Score: 16.419373701694067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies improve on-device language model (LM) inference through end-cloud collaboration, where the end device retrieves useful information from cloud databases to enhance local processing, a pattern known as Retrieval-Augmented Generation (RAG). Typically, to retrieve information from the cloud while safeguarding privacy, the end device transforms original data into embeddings with a local embedding model. However, recently emerging Embedding Inversion Attacks (EIAs) can still recover the original data from text embeddings (e.g., by training a recovery model that maps embeddings back to the original texts), posing a significant threat to user privacy. To address this risk, we propose EntroGuard, an entropy-driven, perturbation-based embedding privacy protection method that protects the privacy of text embeddings while maintaining retrieval accuracy during end-cloud collaboration. Specifically, to defeat various EIAs, we perturb the embeddings so as to increase the entropy of the text recovered by the common structure of recovery models, steering the recovery process toward meaningless text rather than the original sensitive text. To maintain retrieval performance in the cloud, we constrain the perturbations within a bound, reducing perturbation where it is redundant and increasing it where it is sparse. Moreover, EntroGuard can be integrated directly into end devices without requiring any modification to the embedding model. Extensive experimental results demonstrate that, compared to existing privacy-preserving methods, EntroGuard reduces the risk of privacy leakage by up to 8 times with negligible loss of retrieval performance.
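For intuition, below is a minimal PyTorch sketch of the core idea described in the abstract: nudge a text embedding so that a recovery-style decoder produces high-entropy (i.e., meaningless) token distributions, while a norm bound on the perturbation preserves retrieval utility. The surrogate `recovery_head`, the relative bound `epsilon`, the step count, and the simple L2-ball projection (standing in for the paper's reduce-where-redundant, increase-where-sparse reallocation) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of entropy-driven, bounded embedding perturbation (illustrative only).
import torch
import torch.nn.functional as F

def entropy_guided_perturb(embedding, recovery_head, epsilon=0.05, steps=10, lr=0.01):
    """Perturb `embedding` so a surrogate recovery model decodes it into a
    high-entropy (uninformative) token distribution, keeping the perturbation
    inside an L2 ball of radius epsilon * ||embedding||."""
    bound = epsilon * embedding.norm()
    delta = torch.zeros_like(embedding, requires_grad=True)
    for _ in range(steps):
        logits = recovery_head(embedding + delta)        # surrogate decoder's token logits
        probs = F.softmax(logits, dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
        grad = torch.autograd.grad(entropy, delta)[0]    # gradient *ascent* on entropy
        with torch.no_grad():
            delta += lr * grad
            if delta.norm() > bound:                     # project back onto the norm ball
                delta *= bound / delta.norm()
    return (embedding + delta).detach()

# Toy usage with a stand-in recovery head (a single linear layer producing token logits).
emb_dim, vocab_size = 768, 32000
recovery_head = torch.nn.Linear(emb_dim, vocab_size)
protected = entropy_guided_perturb(torch.randn(emb_dim), recovery_head)
```

In practice the bound trades off protection against retrieval accuracy: a tighter ball changes embedding similarity less but leaves more recoverable signal for an inversion model.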
Related papers
- PersGuard: Preventing Malicious Personalization via Backdoor Attacks on Pre-trained Text-to-Image Diffusion Models [51.458089902581456]
We introduce PersGuard, a novel backdoor-based approach that prevents malicious personalization of specific images. Our method significantly outperforms existing techniques, offering a more robust solution for privacy and copyright protection.
arXiv Detail & Related papers (2025-02-22T09:47:55Z) - ALGEN: Few-shot Inversion Attacks on Textual Embeddings using Alignment and Generation [9.220337458064765]
We present a Few-shot Textual Embedding Inversion Attack using ALignment and GENeration (ALGEN). We find that ALGEN attacks can be effectively transferred across domains and languages, revealing key information. We establish a new textual embedding inversion paradigm with broader applications for embedding alignment in NLP.
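As a rough illustration of the alignment half of such an attack, the sketch below fits a linear map from a victim embedding space into an attacker's embedding space using only a few paired examples via least squares. The dimensions, pair count, and the use of plain least squares are assumptions for readability, the generation step that decodes aligned embeddings back into text is omitted, and this is a generic illustration rather than ALGEN's actual pipeline.

```python
# Toy few-shot embedding alignment via least squares (illustrative, not ALGEN's method).
import numpy as np

rng = np.random.default_rng(0)
d_victim, d_attacker, n_pairs = 768, 512, 32              # illustrative sizes

victim_emb = rng.normal(size=(n_pairs, d_victim))          # embeddings observed by the attacker
attacker_emb = rng.normal(size=(n_pairs, d_attacker))      # same texts embedded by the attacker's own model

# Solve min_W || victim_emb @ W - attacker_emb ||_F in closed form.
W, *_ = np.linalg.lstsq(victim_emb, attacker_emb, rcond=None)

# At attack time, a new intercepted embedding is mapped into the attacker's space,
# where a separate generation model (not shown) would decode it back into text.
aligned = rng.normal(size=(1, d_victim)) @ W
```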
arXiv Detail & Related papers (2025-02-16T23:11:13Z) - Mitigating Privacy Risks in LLM Embeddings from Embedding Inversion [21.83264152003852]
We introduce Eguard, a novel defense mechanism designed to mitigate embedding inversion attacks.
Our approach significantly reduces privacy risks, protecting over 95% of tokens from inversion while maintaining high performance.
arXiv Detail & Related papers (2024-11-06T14:42:41Z) - Subword Embedding from Bytes Gains Privacy without Sacrificing Accuracy and Complexity [5.7601856226895665]
We propose Subword Embedding from Bytes (SEB), which encodes subwords into byte sequences using deep neural networks.
Our solution outperforms conventional approaches by preserving privacy without sacrificing efficiency or accuracy.
We verify that SEB obtains comparable or even better results than standard subword embedding methods in machine translation, sentiment analysis, and language modeling.
arXiv Detail & Related papers (2024-10-21T18:25:24Z) - Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data [51.41288763521186]
Retrieval-augmented generation (RAG) enhances the outputs of language models by integrating relevant information retrieved from external knowledge sources. RAG systems may face severe privacy risks when retrieving private data. We propose using synthetic data as a privacy-preserving alternative for the retrieval data.
arXiv Detail & Related papers (2024-06-20T22:53:09Z) - Information Leakage from Embedding in Large Language Models [5.475800773759642]
This study aims to investigate the potential for privacy invasion through input reconstruction attacks.
We first propose two base methods to reconstruct original texts from a model's hidden states.
We then present Embed Parrot, a Transformer-based method, to reconstruct input from embeddings in deep layers.
arXiv Detail & Related papers (2024-05-20T09:52:31Z) - The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG) [56.67603627046346]
Retrieval-augmented generation (RAG) is a powerful technique for augmenting language models with proprietary and private data.
In this work, we conduct empirical studies with novel attack methods, which demonstrate that RAG systems are vulnerable to leaking their private retrieval databases.
arXiv Detail & Related papers (2024-02-23T18:35:15Z) - InferDPT: Privacy-Preserving Inference for Black-box Large Language Model [66.07752875835506]
InferDPT is the first practical framework for privacy-preserving inference of black-box LLMs. RANTEXT is a novel differential privacy mechanism integrated into the perturbation module of InferDPT.
arXiv Detail & Related papers (2023-10-18T18:00:11Z) - Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations [57.20885629270732]
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server.
Traditional perturbation-based methods provide privacy protection at the cost of training accuracy.
In this work, we aim to minimize both privacy leakage to the adversary and the degradation of model accuracy at the edge server.
arXiv Detail & Related papers (2022-10-05T13:13:35Z) - Robbing the Fed: Directly Obtaining Private Data in Federated Learning with Modified Models [56.0250919557652]
Federated learning has quickly gained popularity with its promises of increased user privacy and efficiency.
Previous attacks on user privacy have been limited in scope and do not scale to gradient updates aggregated over even a handful of data points.
We introduce a new threat model based on minimal but malicious modifications of the shared model architecture.
arXiv Detail & Related papers (2021-10-25T15:52:06Z)
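As background for the Robbing the Fed entry above, the toy below shows the textbook fact such attacks build on: for a single example passed through a linear layer y = Wx + b, the weight gradient factorizes as the bias gradient times the input, so the raw input can be read straight off shared gradients. The layer sizes and loss are arbitrary, and this is a generic illustration of gradient leakage, not the paper's modified-model construction.

```python
# Toy demonstration that gradients of a linear layer expose the raw input (batch size 1).
import torch

torch.manual_seed(0)
layer = torch.nn.Linear(16, 4)        # stand-in for one layer of a shared model
x = torch.randn(16)                   # a user's private input
loss = layer(x).sum()                 # any scalar loss works for the identity below
loss.backward()

# dL/dW[i, :] = dL/db[i] * x, so divide a weight-gradient row by its bias gradient.
row = torch.argmax(layer.bias.grad.abs())
x_recovered = layer.weight.grad[row] / layer.bias.grad[row]
print(torch.allclose(x, x_recovered, atol=1e-5))   # True: the private input leaks exactly
```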