Mondrian: Prompt Abstraction Attack Against Large Language Models for
Cheaper API Pricing
- URL: http://arxiv.org/abs/2308.03558v1
- Date: Mon, 7 Aug 2023 13:10:35 GMT
- Title: Mondrian: Prompt Abstraction Attack Against Large Language Models for
Cheaper API Pricing
- Authors: Wai Man Si, Michael Backes, Yang Zhang
- Abstract summary: We propose Mondrian, a simple and straightforward method that abstracts sentences, which can lower the cost of using LLM APIs.
Our results show that Mondrian reduces the token length of user queries by 13% to 23% across various tasks.
As a result, the prompt abstraction attack enables the adversary to profit without bearing the cost of API development and deployment.
- Score: 19.76564349397695
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Machine Learning as a Service (MLaaS) market is rapidly expanding and
becoming more mature. For example, OpenAI's ChatGPT is an advanced large
language model (LLM) that generates responses for various queries with
associated fees. Although these models can deliver satisfactory performance,
they are far from perfect. Researchers have long studied the vulnerabilities
and limitations of LLMs, such as adversarial attacks and model toxicity.
Inevitably, commercial ML models are also not exempt from such issues, which
can be problematic as MLaaS continues to grow. In this paper, we discover a new
attack strategy against LLM APIs, namely the prompt abstraction attack.
Specifically, we propose Mondrian, a simple and straightforward method that
abstracts sentences, which can lower the cost of using LLM APIs. In this
approach, the adversary first creates a pseudo API (with a lower established
price) to serve as the proxy of the target API (with a higher established
price). Next, the pseudo API leverages Mondrian to modify the user query,
obtain the abstracted response from the target API, and forward it back to the
end user. Our results show that Mondrian reduces the token length of user
queries by 13% to 23% across various tasks, including text
classification, generation, and question answering. Meanwhile, these abstracted
queries do not significantly affect the utility of task-specific and general
language models like ChatGPT. Mondrian also reduces instruction prompts' token
length by at least 11% without compromising output quality. As a result, the
prompt abstraction attack enables the adversary to profit without bearing the
cost of API development and deployment.
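The proxy flow described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `abstract_query` is a toy filler-word filter standing in for Mondrian (this listing does not describe the actual abstraction method), `count_tokens` approximates billed tokens by whitespace splitting, `fake_target_api` is a stub rather than a real LLM endpoint, and billing is assumed to cover input tokens only.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for Mondrian: drop a few filler words.
# The paper's actual abstraction method is NOT shown in this listing.
FILLER_WORDS = {"please", "kindly", "very", "really", "just", "the", "a", "an"}


def abstract_query(query: str) -> str:
    """Toy abstraction: remove filler words and keep the remaining word order."""
    kept = [w for w in query.split() if w.lower() not in FILLER_WORDS]
    return " ".join(kept)


def count_tokens(text: str) -> int:
    """Assumption: whitespace-separated words approximate billed tokens."""
    return len(text.split())


@dataclass
class PseudoAPI:
    """Proxy that sits between the end user and the higher-priced target API."""
    target_api: Callable[[str], str]  # the API being proxied
    target_price_per_token: float     # what the adversary pays per input token
    pseudo_price_per_token: float     # what the end user pays (lower price)

    def handle(self, user_query: str) -> dict:
        # 1. Shorten the user's query before it reaches the target API.
        short_query = abstract_query(user_query)
        # 2. Obtain the response from the target API and pass it back unchanged.
        response = self.target_api(short_query)
        # 3. The user is billed on the original length; the adversary pays
        #    the target API for the shortened length.
        revenue = count_tokens(user_query) * self.pseudo_price_per_token
        cost = count_tokens(short_query) * self.target_price_per_token
        return {
            "response": response,
            "sent_query": short_query,
            "profit": revenue - cost,
        }


def fake_target_api(prompt: str) -> str:
    """Stub target API used only so the sketch runs without network access."""
    return f"[answer to: {prompt}]"


if __name__ == "__main__":
    api = PseudoAPI(fake_target_api, target_price_per_token=10.0,
                    pseudo_price_per_token=9.0)
    result = api.handle("Please summarize the very long report about the market")
    print(result["sent_query"])
    print(result["profit"])
```

Under these simplified billing assumptions, with target price p, pseudo price p', and a fractional token reduction r, the adversary's margin per original token is p' - (1 - r)p, which is positive whenever p' > (1 - r)p. With the reported 13% to 23% reductions, the pseudo API could in principle undercut the target price by up to roughly that fraction and still break even on input tokens.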
Related papers
- ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents [7.166156709980112]
We introduce ShortcutsBench, a large-scale benchmark for the comprehensive evaluation of API-based agents.
ShortcutsBench includes a wealth of real APIs from Apple Inc.'s operating systems.
Our evaluation reveals significant limitations in handling complex queries related to API selection, parameter filling, and requesting necessary information from systems and users.
arXiv Detail & Related papers (2024-06-28T08:45:02Z)
- A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking.
It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence.
Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
- LLM+Reasoning+Planning for supporting incomplete user queries in presence of APIs [0.09374652839580183]
In practice, natural language task requests (user queries) are often incomplete, i.e., they may not contain all the information required by the APIs.
We leverage logical reasoning and classical AI planning along with an LLM for accurately answering user queries.
Our approach achieves over 95% success rate in most cases on a dataset containing complete and incomplete single goal and multi-goal queries.
arXiv Detail & Related papers (2024-05-21T01:16:34Z)
- Logits of API-Protected LLMs Leak Proprietary Information [46.014638838911566]
We show that it is possible to learn a surprisingly large amount of non-public information about an API-protected LLM from a relatively small number of API queries.
Most modern LLMs suffer from a softmax bottleneck, which restricts the model outputs to a linear subspace of the full output space.
We show that this lends itself to a model image or a model signature which unlocks several capabilities with affordable cost.
arXiv Detail & Related papers (2024-03-14T16:27:49Z)
- Make Them Spill the Beans! Coercive Knowledge Extraction from (Production) LLMs [31.80386572346993]
We exploit the fact that even when an LLM rejects a toxic request, a harmful response often hides deep in the output logits.
This approach differs from and outperforms jail-breaking methods, achieving 92% effectiveness compared to 62%, and is 10 to 20 times faster.
Our findings indicate that interrogation can extract toxic knowledge even from models specifically designed for coding tasks.
arXiv Detail & Related papers (2023-12-08T01:41:36Z)
- Leveraging Large Language Models to Improve REST API Testing [51.284096009803406]
RESTGPT takes as input an API specification, extracts machine-interpretable rules, and generates example parameter values from natural-language descriptions in the specification.
Our evaluations indicate that RESTGPT outperforms existing techniques in both rule extraction and value generation.
arXiv Detail & Related papers (2023-12-01T19:53:23Z)
- Universal and Transferable Adversarial Attacks on Aligned Language Models [118.41733208825278]
We propose a simple and effective attack method that causes aligned language models to generate objectionable behaviors.
Surprisingly, we find that the adversarial prompts generated by our approach are quite transferable.
arXiv Detail & Related papers (2023-07-27T17:49:12Z)
- Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES.
Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query.
By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)