When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
- URL: http://arxiv.org/abs/2506.00197v1
- Date: Fri, 30 May 2025 20:08:08 GMT
- Title: When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
- Authors: Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
- Abstract summary: We present a comprehensive risk assessment of knowledge file leakage, leveraging a novel workflow inspired by Data Security Posture Management (DSPM). Through the analysis of 651,022 GPT metadata, 11,820 flows, and 1,466 responses, we identify five leakage vectors. These vectors enable adversaries to extract sensitive knowledge file data such as titles, content, types, and sizes.
- Score: 39.885773438374095
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge files have been widely used in large language model (LLM) agents, such as GPTs, to improve response quality. However, concerns about the potential leakage of knowledge files have grown significantly. Existing studies demonstrate that adversarial prompts can induce GPTs to leak knowledge file content. Yet, it remains uncertain whether additional leakage vectors exist, particularly given the complex data flows across clients, servers, and databases in GPTs. In this paper, we present a comprehensive risk assessment of knowledge file leakage, leveraging a novel workflow inspired by Data Security Posture Management (DSPM). Through the analysis of 651,022 GPT metadata, 11,820 flows, and 1,466 responses, we identify five leakage vectors: metadata, GPT initialization, retrieval, sandboxed execution environments, and prompts. These vectors enable adversaries to extract sensitive knowledge file data such as titles, content, types, and sizes. Notably, the activation of the built-in tool Code Interpreter leads to a privilege escalation vulnerability, enabling adversaries to directly download original knowledge files with a 95.95% success rate. Further analysis reveals that 28.80% of leaked files are copyrighted, including digital copies from major publishers and internal materials from a listed company. In the end, we provide actionable solutions for GPT builders and platform providers to secure the GPT data supply chain.
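The abstract's headline finding is that enabling the built-in Code Interpreter lets an adversary download original knowledge files directly, with a 95.95% success rate. The paper's actual prompts and payloads are not reproduced here; the sketch below is only an illustration of why sandboxed execution is a leakage vector, under the assumption that a GPT's knowledge files are reachable under /mnt/data (the conventional upload path in the sandbox). The path, helper names, and archive name are illustrative assumptions, not details from the paper.

```python
# Illustrative sketch (not the paper's payload): code executed inside the
# sandboxed execution environment can enumerate and repackage the knowledge
# files that back a custom GPT. The mount path below is an assumption.
import os
import zipfile

MOUNT_DIR = "/mnt/data"                      # assumed sandbox mount path for uploaded files
ARCHIVE = os.path.join(MOUNT_DIR, "knowledge_dump.zip")  # archive returned as a download link


def enumerate_knowledge_files(root: str) -> list[str]:
    """Collect every regular file visible under the sandbox mount point."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if path != ARCHIVE:
                found.append(path)
    return found


def package_for_download(paths: list[str], archive: str) -> None:
    """Bundle the discovered files so a single link can be handed back to the user."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            zf.write(path, arcname=os.path.relpath(path, MOUNT_DIR))


if __name__ == "__main__":
    files = enumerate_knowledge_files(MOUNT_DIR)
    for f in files:
        print(f, os.path.getsize(f))   # titles, types, and sizes leak here
    if files:
        package_for_download(files, ARCHIVE)  # original file content leaks here
```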
Related papers
- Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Despite the 78,047,845 smart contracts deployed on Ethereum (as of May 26, 2025), a mere 767,520 (~1%) are open source. This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode. We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z) - Detecting Hard-Coded Credentials in Software Repositories via LLMs [0.0]
Software developers frequently hard-code credentials such as passwords, generic secrets, private keys, and generic tokens in software repositories. These credentials create attack surfaces exploitable by a potential adversary to conduct malicious exploits such as backdoor attacks. Recent detection efforts utilize embedding models to vectorize textual credentials before passing them to classifiers for predictions. Our model outperforms the current state-of-the-art by 13% in F1 measure on the benchmark dataset.
arXiv Detail & Related papers (2025-06-16T04:33:48Z) - Privacy and Security Threat for OpenAI GPTs [0.0]
Since OpenAI released custom GPTs in November 2023, over 3 million of them have been created. For developers, instruction-leaking attacks threaten the intellectual property of the instructions in custom GPTs. For users, unwanted data access behavior by custom GPTs or integrated third-party services raises significant privacy concerns.
arXiv Detail & Related papers (2025-06-04T14:58:29Z) - A Large-Scale Empirical Analysis of Custom GPTs' Vulnerabilities in the OpenAI Ecosystem [5.455788617334495]
We analyze 14,904 custom GPTs to assess their susceptibility to seven exploitable threats. Our findings reveal that over 95% of custom GPTs lack adequate security protections. These results highlight the urgent need for enhanced security measures and stricter content moderation.
arXiv Detail & Related papers (2025-05-13T00:51:07Z) - Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models [52.439289085318634]
We show how to identify training data known to proprietary large language models (LLMs) by using information-guided probes. Our work builds on a key observation: text passages with high surprisal are good search material for memorization probes.
arXiv Detail & Related papers (2025-03-15T10:19:15Z) - Dataset Protection via Watermarked Canaries in Retrieval-Augmented LLMs [67.0310240737424]
We introduce a novel approach to safeguard the ownership of text datasets and effectively detect unauthorized use by RA-LLMs. Our approach preserves the original data completely unchanged while protecting it by inserting specifically designed canary documents into the IP dataset. During the detection process, unauthorized usage is identified by querying the canary documents and analyzing the responses of RA-LLMs.
arXiv Detail & Related papers (2025-02-15T04:56:45Z) - Detection of LLM-Generated Java Code Using Discretized Nested Bigrams [0.0]
We propose new Discretized Nested Bigram Frequency features on source code groups of various sizes. Compared to prior work, improvements are obtained by representing sparse information in dense membership bins. Our approach scales well to larger data sets, and we achieved 99% accuracy and 0.999 AUC for 76,089 files and over 1,000 authors with GPT-4o using 227 features.
arXiv Detail & Related papers (2025-02-07T14:32:20Z) - Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems [22.142588104314175]
We study the risk of datastore leakage in Retrieval-In-Context RAG Language Models (LMs).
We show that an adversary can exploit LMs' instruction-following capabilities to easily extract text data verbatim from the datastore.
We design an attack that can cause datastore leakage with a 100% success rate on 25 randomly selected customized GPTs with at most 2 queries.
arXiv Detail & Related papers (2024-02-27T19:08:05Z) - GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis [26.081673382969615]
We propose GPTScan, the first tool combining GPT with static analysis for smart contract logic vulnerability detection.
By breaking down each logic vulnerability type into scenarios and properties, GPTScan matches candidate vulnerabilities with GPT.
It effectively detects ground-truth logic vulnerabilities with a recall of over 70%, including 9 new vulnerabilities missed by human auditors.
arXiv Detail & Related papers (2023-08-07T05:48:53Z) - DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models [92.6951708781736]
This work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5.
We find that GPT models can be easily misled to generate toxic and biased outputs and leak private information.
Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps.
arXiv Detail & Related papers (2023-06-20T17:24:23Z) - Black-box Dataset Ownership Verification via Backdoor Watermarking [67.69308278379957]
We formulate the protection of released datasets as verifying whether they are adopted for training a (suspicious) third-party model.
We propose to embed external patterns via backdoor watermarking for the ownership verification to protect them.
Specifically, we exploit poison-only backdoor attacks (e.g., BadNets) for dataset watermarking and design a hypothesis-test-guided method for dataset verification.
arXiv Detail & Related papers (2022-08-04T05:32:20Z)
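The hypothesis-test-guided verification in the entry above can be made concrete with a short sketch. Assuming the defender already holds the suspect model's posterior probability of the backdoor target class on watermarked (trigger-stamped) queries and on their benign counterparts, a one-sided paired test decides whether the trigger measurably shifts the prediction. The margin, significance level, and function names below are illustrative choices, not the paper's exact protocol.

```python
# Minimal sketch of a hypothesis-test-guided ownership check: a model trained
# on the watermarked dataset should assign a noticeably higher posterior to
# the backdoor target class when the trigger is present.
import numpy as np
from scipy import stats


def verify_ownership(p_target_watermarked: np.ndarray,
                     p_target_benign: np.ndarray,
                     margin: float = 0.2,
                     alpha: float = 0.05) -> bool:
    """Return True if the suspect model likely saw the watermarked dataset.

    p_target_watermarked: target-class probabilities on trigger-stamped queries.
    p_target_benign: target-class probabilities on the same queries without triggers.
    """
    # One-sided paired test: H1 is that the watermarked-query posterior exceeds
    # the benign-query posterior by at least `margin`.
    diff = p_target_watermarked - p_target_benign - margin
    _stat, p_value = stats.ttest_1samp(diff, 0.0, alternative="greater")
    return p_value < alpha


# Synthetic example: a "stolen" model reacts to the trigger, an independent one does not.
rng = np.random.default_rng(0)
stolen = verify_ownership(rng.uniform(0.7, 1.0, 100), rng.uniform(0.0, 0.3, 100))
innocent = verify_ownership(rng.uniform(0.0, 0.3, 100), rng.uniform(0.0, 0.3, 100))
print(stolen, innocent)  # expected: True False
```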
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.