Whispers in the Machine: Confidentiality in LLM-integrated Systems
- URL: http://arxiv.org/abs/2402.06922v3
- Date: Wed, 06 Nov 2024 10:22:27 GMT
- Title: Whispers in the Machine: Confidentiality in LLM-integrated Systems
- Authors: Jonathan Evertz, Merlin Chlosta, Lea Schönherr, Thorsten Eisenhofer
- Abstract summary: Large Language Models (LLMs) are increasingly augmented with external tools and commercial services into LLM-integrated systems.
Manipulated integrations can exploit the model and compromise sensitive data accessed through other interfaces.
We introduce a systematic approach to evaluate confidentiality risks in LLM-integrated systems.
- Abstract: Large Language Models (LLMs) are increasingly augmented with external tools and commercial services into LLM-integrated systems. While these interfaces can significantly enhance the capabilities of the models, they also introduce a new attack surface. Manipulated integrations, for example, can exploit the model and compromise sensitive data accessed through other interfaces. While previous work primarily focused on attacks targeting a model's alignment or the leakage of training data, the security of data that is only available during inference has escaped scrutiny so far. In this work, we demonstrate the vulnerabilities associated with external components and introduce a systematic approach to evaluate confidentiality risks in LLM-integrated systems. We identify two specific attack scenarios unique to these systems and formalize these into a tool-robustness framework designed to measure a model's ability to protect sensitive information. Our findings show that all examined models are highly vulnerable to confidentiality attacks, with the risk increasing significantly when models are used together with external tools.
Related papers
- "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs).
In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z)
- Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- A Blackbox Model Is All You Need to Breach Privacy: Smart Grid Forecasting Models as a Use Case [0.7714988183435832]
We show that black-box access to an LSTM model can reveal a significant amount of information, equivalent to having access to the data itself.
This highlights the importance of protecting forecasting models at the same level as the data.
arXiv Detail & Related papers (2023-09-04T11:07:37Z)
- On the Evaluation of User Privacy in Deep Neural Networks using Timing Side Channel [14.350301915592027]
We identify and report a novel data-dependent timing side-channel leakage (termed Class Leakage) in Deep Learning (DL) implementations.
We demonstrate a practical inference-time attack where an adversary with user privilege and hard-label black-box access to an ML model can exploit Class Leakage.
We develop an easy-to-implement countermeasure by making a constant-time branching operation that alleviates the Class Leakage.
arXiv Detail & Related papers (2022-08-01T19:38:16Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software package, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
- Risk Management Framework for Machine Learning Security [7.678455181587705]
Adversarial attacks on machine learning models have become a highly studied topic in both academia and industry.
In this paper, we outline a novel framework to guide the risk management process for organizations reliant on machine learning models.
arXiv Detail & Related papers (2020-12-09T06:21:34Z)