Stealing the Decoding Algorithms of Language Models
- URL: http://arxiv.org/abs/2303.04729v4
- Date: Fri, 1 Dec 2023 22:34:34 GMT
- Title: Stealing the Decoding Algorithms of Language Models
- Authors: Ali Naseh, Kalpesh Krishna, Mohit Iyyer, Amir Houmansadr
- Abstract summary: A key component of generating text from modern language models (LMs) is the selection and tuning of decoding algorithms.
In this work, we show, for the first time, that an adversary with typical API access to an LM can steal the type and hyperparameters of its decoding algorithms.
Our attack is effective against popular LMs used in text generation APIs, including GPT-2, GPT-3 and GPT-Neo.
- Score: 56.369946232765656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key component of generating text from modern language models (LMs) is the
selection and tuning of decoding algorithms. These algorithms determine how to
generate text from the internal probability distribution produced by the LM.
The process of choosing a decoding algorithm and tuning its hyperparameters
takes significant time, manual effort, and computation, and it also requires
extensive human evaluation. Therefore, the identity and hyperparameters of such
decoding algorithms are considered to be extremely valuable to their owners. In
this work, we show, for the first time, that an adversary with typical API
access to an LM can steal the type and hyperparameters of its decoding
algorithms at very low monetary cost. Our attack is effective against popular
LMs used in text generation APIs, including GPT-2, GPT-3 and GPT-Neo. We
demonstrate the feasibility of stealing such information with only a few
dollars, e.g., $\$0.8$, $\$1$, $\$4$, and $\$40$ for the four versions of
GPT-3.
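For readers unfamiliar with what is being stolen: the decoding "type and hyperparameters" referred to above are, in practice, knobs such as temperature, top-k truncation, and top-p (nucleus) truncation. The snippet below is a minimal, generic sketch of those knobs, assuming standard sampling-based decoding; the function name and structure are illustrative and are not taken from the paper.

```python
# Generic sketch of common decoding hyperparameters an API provider might tune
# (temperature, top-k, top-p). Illustrative only; not the paper's implementation.
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Sample one token id from raw logits using common decoding knobs."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)

    # Top-k truncation: keep only the k highest-scoring tokens.
    if top_k > 0:
        k = min(top_k, logits.size)
        cutoff = np.sort(logits)[-k]
        logits = np.where(logits < cutoff, -np.inf, logits)

    # Convert to probabilities.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p (nucleus) truncation: keep the smallest set of tokens whose
    # cumulative probability exceeds p.
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cumulative, top_p) + 1]
        mask = np.zeros_like(probs)
        mask[keep] = probs[keep]
        probs = mask / mask.sum()

    return int(rng.choice(len(probs), p=probs))
```

For instance, `sample_next_token(logits, temperature=0.7, top_p=0.9)` reproduces a common nucleus-sampling setup; the 0.7 and 0.9 values are exactly the kind of hyperparameters such an attack would aim to recover.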
Related papers
- Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG [57.14250086701313]
We investigate the extent to which modern LMs generate $n$-grams from their training data.
We develop Rusty-DAWG, a novel search tool inspired by indexing of genomic data.
arXiv Detail & Related papers (2024-06-18T21:31:19Z)
- The CLRS-Text Algorithmic Reasoning Language Benchmark [48.45201665463275]
CLRS-Text is a textual version of the CLRS benchmark.
CLRS-Text is capable of procedurally generating trace data for thirty diverse, challenging algorithmic tasks.
We fine-tune and evaluate various LMs as generalist executors on this benchmark.
arXiv Detail & Related papers (2024-06-06T16:29:25Z)
- GPT-who: An Information Density-based Machine-Generated Text Detector [6.111161457447324]
We propose GPT-who, the first psycholinguistically-inspired domain-agnostic statistical detector.
This detector employs UID-based features to model the unique statistical signature of texts generated by each large language model (LLM) and by humans.
We find that GPT-who can distinguish texts generated by very sophisticated LLMs, even when the overlying text is indiscernible.
arXiv Detail & Related papers (2023-10-09T23:06:05Z)
- Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System [73.52878118434147]
We present methods to reverse-engineer the decoding method used to generate text.
Our ability to discover which decoding strategy was used has implications for detecting generated text.
arXiv Detail & Related papers (2023-09-09T18:19:47Z)
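As a rough illustration of how a black-box probe of a deployed decoder might look, the sketch below repeatedly requests a single next token for a fixed prompt and counts distinct outputs: determinism suggests greedy decoding (or near-zero temperature), while the number of distinct tokens lower-bounds the size of any truncated candidate set. This is an assumption-laden sketch, not the method of the paper above or of this page's main paper, and `query_api` is a hypothetical stand-in for a real API client.

```python
# Hedged sketch of one way a black-box probe might distinguish greedy decoding
# from sampling and bound the truncated candidate set. `query_api` is a
# hypothetical API call returning one generated token as a string.
from collections import Counter

def probe_decoding(query_api, prompt, n_queries=200):
    """Return a rough guess about the API's decoding configuration."""
    # Ask for one token at a time so each reply reflects a single draw
    # from the (possibly truncated) next-token distribution.
    first_tokens = Counter(query_api(prompt, max_tokens=1) for _ in range(n_queries))

    distinct = len(first_tokens)
    if distinct == 1:
        return {"deterministic": True, "note": "consistent with greedy or temperature ~ 0"}
    # With top-k / top-p truncation, the number of distinct continuations
    # observed is capped by the size of the truncated candidate set.
    return {
        "deterministic": False,
        "distinct_first_tokens": distinct,
        "note": f"sampling; truncated candidate set has at least {distinct} tokens",
    }
```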
- Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information.
arXiv Detail & Related papers (2023-07-29T14:11:15Z)
- Memorization for Good: Encryption with Autoregressive Language Models [8.645826579841692]
We propose the first symmetric encryption algorithm with autoregressive language models (SELM).
We show that autoregressive LMs can encode arbitrary data into a compact real-valued vector (i.e., encryption) and then losslessly decode the vector to the original message (i.e. decryption) via random subspace optimization and greedy decoding.
arXiv Detail & Related papers (2023-05-15T05:42:34Z)
- Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [56.077252790310176]
We present a paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering.
Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking.
We introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.
arXiv Detail & Related papers (2023-03-23T16:29:27Z)
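A retrieval defense of the kind described in the entry above can be sketched as follows: the provider stores an embedding of every generation it serves and flags candidate text whose nearest stored generation is highly similar. This is a hedged illustration under assumed components; `embed` is a hypothetical sentence-embedding function returning a NumPy vector, and the 0.85 threshold is arbitrary.

```python
# Hedged sketch of a retrieval-based detection defense: keep embeddings of all
# served generations, then flag candidate text that is semantically close to one.
import numpy as np

def build_index(generations, embed):
    """Embed and L2-normalize all previously served generations."""
    vectors = np.stack([embed(g) for g in generations])
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

def looks_machine_generated(candidate, index, embed, threshold=0.85):
    """Flag the candidate if its nearest stored generation is very similar."""
    query = embed(candidate)
    query = query / np.linalg.norm(query)
    similarity = index @ query  # cosine similarity against every stored generation
    return float(similarity.max()) >= threshold
```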
- Contrastive Decoding: Open-ended Text Generation as Optimization [153.35961722855686]
We propose contrastive decoding (CD), a reliable decoding approach.
It is inspired by the fact that the failures of larger LMs are even more prevalent in smaller LMs.
CD requires zero additional training, and produces higher quality text than decoding from the larger LM alone.
arXiv Detail & Related papers (2022-10-27T00:58:21Z)
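To make the contrastive decoding idea in the last entry concrete, here is a minimal sketch of the scoring step as commonly described: candidate tokens are restricted to those the larger "expert" LM finds plausible, then ranked by the gap between expert and smaller "amateur" log-probabilities. The alpha cutoff and function shape are illustrative, not the paper's exact formulation.

```python
# Minimal sketch of a contrastive-decoding-style score: reward tokens the large
# "expert" LM likes much more than the small "amateur" LM, restricted to tokens
# the expert itself finds plausible. Illustrative values, not the paper's code.
import numpy as np

def contrastive_scores(expert_logprobs, amateur_logprobs, alpha=0.1):
    """Return per-token contrastive scores; implausible tokens get -inf."""
    expert_logprobs = np.asarray(expert_logprobs, dtype=np.float64)
    amateur_logprobs = np.asarray(amateur_logprobs, dtype=np.float64)

    # Plausibility constraint: keep tokens whose expert probability is at
    # least alpha times that of the expert's most likely token.
    cutoff = np.log(alpha) + expert_logprobs.max()
    plausible = expert_logprobs >= cutoff

    # Contrast step: expert log-probability minus amateur log-probability.
    return np.where(plausible, expert_logprobs - amateur_logprobs, -np.inf)

# Greedy contrastive step: pick the highest-scoring plausible token, e.g.
# next_token = int(np.argmax(contrastive_scores(expert_lp, amateur_lp)))
```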