Has My System Prompt Been Used? Large Language Model Prompt Membership Inference
- URL: http://arxiv.org/abs/2502.09974v1
- Date: Fri, 14 Feb 2025 08:00:42 GMT
- Title: Has My System Prompt Been Used? Large Language Model Prompt Membership Inference
- Authors: Roman Levin, Valeriia Cherepanova, Abhimanyu Hans, Avi Schwarzschild, Tom Goldstein,
- Abstract summary: We develop Prompt Detective, a statistical method to reliably determine whether a given system prompt was used by a third-party language model.
Our work reveals that even minor changes in system prompts manifest in distinct response distributions, enabling us to verify prompt usage with statistical significance.
- Score: 56.20586932251531
- License:
- Abstract: Prompt engineering has emerged as a powerful technique for optimizing large language models (LLMs) for specific applications, enabling faster prototyping and improved performance, and giving rise to the interest of the community in protecting proprietary system prompts. In this work, we explore a novel perspective on prompt privacy through the lens of membership inference. We develop Prompt Detective, a statistical method to reliably determine whether a given system prompt was used by a third-party language model. Our approach relies on a statistical test comparing the distributions of two groups of model outputs corresponding to different system prompts. Through extensive experiments with a variety of language models, we demonstrate the effectiveness of Prompt Detective for prompt membership inference. Our work reveals that even minor changes in system prompts manifest in distinct response distributions, enabling us to verify prompt usage with statistical significance.
Related papers
- Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection [30.836788377666]
We propose an adaptive prompting approach that predicts the optimal prompt composition ad-hoc for a given input.
We apply our approach to social bias detection, a highly context-dependent task that requires semantic understanding.
Our approach robustly ensures high detection performance, and is best in several settings.
arXiv Detail & Related papers (2025-02-10T14:06:19Z) - Prompt Exploration with Prompt Regression [38.847668543140315]
We propose a framework, Prompt Exploration with Prompt Regression (PEPR), to predict the effect of prompt combinations given results for individual prompt elements.
We evaluate our approach with open-source LLMs of different sizes on several different tasks.
arXiv Detail & Related papers (2024-05-17T20:30:49Z) - DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever [83.33209603041013]
We propose a parameter-efficient prompt-tuning method named DialCLIP for multi-modal dialog retrieval.
Our approach introduces a multi-modal context generator to learn context features which are distilled into prompts within the pre-trained vision-language model CLIP.
To facilitate various types of retrieval, we also design multiple experts to learn mappings from CLIP outputs to multi-modal representation space.
arXiv Detail & Related papers (2024-01-02T07:40:12Z) - Demystifying Prompts in Language Models via Perplexity Estimation [109.59105230163041]
Performance of a prompt is coupled with the extent to which the model is familiar with the language it contains.
We show that the lower the perplexity of the prompt is, the better the prompt is able to perform the task.
arXiv Detail & Related papers (2022-12-08T02:21:47Z) - TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA)
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves 5.33x on average improvement in sample efficiency when compared to the traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z) - Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation
with Large Language Models [116.25562358482962]
State-of-the-art neural language models can be used to solve ad-hoc language tasks without the need for supervised training.
PromptIDE allows users to experiment with prompt variations, visualize prompt performance, and iteratively optimize prompts.
arXiv Detail & Related papers (2022-08-16T17:17:53Z) - Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified
Multilingual Prompt [98.26682501616024]
We propose a novel model that uses a unified prompt for all languages, called UniPrompt.
The unified prompt is computation by a multilingual PLM to produce language-independent representation.
Our proposed methods can significantly outperform the strong baselines across different languages.
arXiv Detail & Related papers (2022-02-23T11:57:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.