Assessing Prompt Injection Risks in 200+ Custom GPTs
- URL: http://arxiv.org/abs/2311.11538v2
- Date: Sat, 25 May 2024 05:32:39 GMT
- Title: Assessing Prompt Injection Risks in 200+ Custom GPTs
- Authors: Jiahao Yu, Yuhang Wu, Dong Shu, Mingyu Jin, Sabrina Yang, Xinyu Xing,
- Abstract summary: This study reveals a significant security vulnerability inherent in user-customized GPTs: prompt injection attacks.
Through prompt injection, an adversary can not only extract the customized system prompts but also access the uploaded files.
This paper provides a first-hand analysis of the prompt injection, alongside the evaluation of the possible mitigation of such attacks.
- Score: 21.86130076843285
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the rapidly evolving landscape of artificial intelligence, ChatGPT has been widely used in various applications. The new feature - customization of ChatGPT models by users to cater to specific needs has opened new frontiers in AI utility. However, this study reveals a significant security vulnerability inherent in these user-customized GPTs: prompt injection attacks. Through comprehensive testing of over 200 user-designed GPT models via adversarial prompts, we demonstrate that these systems are susceptible to prompt injections. Through prompt injection, an adversary can not only extract the customized system prompts but also access the uploaded files. This paper provides a first-hand analysis of the prompt injection, alongside the evaluation of the possible mitigation of such attacks. Our findings underscore the urgent need for robust security frameworks in the design and deployment of customizable GPT models. The intent of this paper is to raise awareness and prompt action in the AI community, ensuring that the benefits of GPT customization do not come at the cost of compromised security and privacy.
Related papers
- Improving the Shortest Plank: Vulnerability-Aware Adversarial Training for Robust Recommender System [60.719158008403376]
Vulnerability-aware Adversarial Training (VAT) is designed to defend against poisoning attacks in recommender systems.
VAT employs a novel vulnerability-aware function to estimate users' vulnerability based on the degree to which the system fits them.
arXiv Detail & Related papers (2024-09-26T02:24:03Z) - Rethinking the Vulnerabilities of Face Recognition Systems:From a Practical Perspective [53.24281798458074]
Face Recognition Systems (FRS) have increasingly integrated into critical applications, including surveillance and user authentication.
Recent studies have revealed vulnerabilities in FRS to adversarial (e.g., adversarial patch attacks) and backdoor attacks (e.g., training data poisoning)
arXiv Detail & Related papers (2024-05-21T13:34:23Z) - Reconstruct Your Previous Conversations! Comprehensively Investigating Privacy Leakage Risks in Conversations with GPT Models [20.92843974858305]
GPT models are increasingly being used for task optimization.
In this paper, we introduce a straightforward yet potent Conversation Reconstruction Attack.
We present two advanced attacks targeting improved reconstruction of past conversations.
arXiv Detail & Related papers (2024-02-05T13:18:42Z) - Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks
Against LLM-Integrated Applications [0.0]
This paper introduces the 'Signed-Prompt' method as a novel solution for prompt injection attacks.
The study involves signing sensitive instructions within command segments by authorized users, enabling the LLM to discern trusted instruction sources.
Experiments demonstrate the effectiveness of the Signed-Prompt method, showing substantial resistance to various types of prompt injection attacks.
arXiv Detail & Related papers (2024-01-15T11:44:18Z) - Opening A Pandora's Box: Things You Should Know in the Era of Custom GPTs [27.97654690288698]
We conduct a comprehensive analysis of the security and privacy issues arising from the custom GPT platform by OpenAI.
Our systematic examination categorizes potential attack scenarios into three threat models based on the role of the malicious actor.
We identify 26 potential attack vectors, with 19 being partially or fully validated in real-world settings.
arXiv Detail & Related papers (2023-12-31T16:49:12Z) - Prompt-Enhanced Software Vulnerability Detection Using ChatGPT [9.35868869848051]
Large language models (LLMs) like GPT have received considerable attention due to their stunning intelligence.
This paper launches a study on the performance of software vulnerability detection using ChatGPT with different prompt designs.
arXiv Detail & Related papers (2023-08-24T10:30:33Z) - When Authentication Is Not Enough: On the Security of Behavioral-Based Driver Authentication Systems [53.2306792009435]
We develop two lightweight driver authentication systems based on Random Forest and Recurrent Neural Network architectures.
We are the first to propose attacks against these systems by developing two novel evasion attacks, SMARTCAN and GANCAN.
Through our contributions, we aid practitioners in safely adopting these systems, help reduce car thefts, and enhance driver security.
arXiv Detail & Related papers (2023-06-09T14:33:26Z) - Not what you've signed up for: Compromising Real-World LLM-Integrated
Applications with Indirect Prompt Injection [64.67495502772866]
Large Language Models (LLMs) are increasingly being integrated into various applications.
We show how attackers can override original instructions and employed controls using Prompt Injection attacks.
We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities.
arXiv Detail & Related papers (2023-02-23T17:14:38Z) - Face Presentation Attack Detection [59.05779913403134]
Face recognition technology has been widely used in daily interactive applications such as checking-in and mobile payment.
However, its vulnerability to presentation attacks (PAs) limits its reliable use in ultra-secure applicational scenarios.
arXiv Detail & Related papers (2022-12-07T14:51:17Z) - Towards Automated Classification of Attackers' TTPs by combining NLP
with ML Techniques [77.34726150561087]
We evaluate and compare different Natural Language Processing (NLP) and machine learning techniques used for security information extraction in research.
Based on our investigations we propose a data processing pipeline that automatically classifies unstructured text according to attackers' tactics and techniques.
arXiv Detail & Related papers (2022-07-18T09:59:21Z) - Texture-based Presentation Attack Detection for Automatic Speaker
Verification [21.357976330739245]
This paper reports our exploration of texture descriptors applied to the analysis of speech spectrogram images.
In particular, we propose a common fisher vector feature space based on a generative model.
At most, 16 in 100 bona fide presentations are rejected whereas only one in 100 attack presentations are accepted.
arXiv Detail & Related papers (2020-10-08T15:03:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.