Related papers: What's the Magic Word? A Control Theory of LLM Prompting

What's the Magic Word? A Control Theory of LLM Prompting

URL: http://arxiv.org/abs/2310.04444v4
Date: Wed, 3 Jul 2024 22:23:50 GMT
Title: What's the Magic Word? A Control Theory of LLM Prompting
Authors: Aman Bhargava, Cameron Witkowski, Shi-Zhuo Looi, Matt Thomson,
Abstract summary: We explore prompt engineering through the lens of control theory. We present empirical results on the controllability of a panel of LLMs. Short prompt sequences can dramatically alter the likelihood of specific outputs.
Score: 0.7499722271664144
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prompt engineering is crucial for deploying LLMs but is poorly understood mathematically. We formalize LLM systems as a class of discrete stochastic dynamical systems to explore prompt engineering through the lens of control theory. We offer a mathematical analysis of the limitations on the controllability of self-attention as a function of the singular values of the parameter matrices. We present complementary empirical results on the controllability of a panel of LLMs, including Falcon-7b, Llama-7b, and Falcon-40b. Given initial state $\mathbf x_0$ from Wikitext and prompts of length $k \leq 10$ tokens, we find that the "correct" next token is reachable at least 97% of the time, and that the top 75 most likely next tokens are reachable at least 85% of the time. Intriguingly, short prompt sequences can dramatically alter the likelihood of specific outputs, even making the least likely tokens become the most likely ones. This control-theoretic analysis of LLMs demonstrates the significant and poorly understood role of input sequences in steering output probabilities, offering a foundational perspective for enhancing language model system capabilities.

Related papers

Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs [63.29737699997859]
Large Language Models (LLMs) have demonstrated impressive performance on multimodal tasks, without any multimodal finetuning. In this work, we expose frozen LLMs to image, video, audio and text inputs and analyse their internal representation.
arXiv Detail & Related papers (2024-05-26T21:31:59Z)
Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL [78.80673954827773]
Large Language Models (LLMs) play a crucial role in capturing structured semantics to enhance language understanding, improve interpretability, and reduce bias. We propose using Semantic Role Labeling (SRL) as a fundamental task to explore LLMs' ability to extract structured semantics. We find interesting potential: LLMs can indeed capture semantic structures, and scaling-up doesn't always mirror potential. We are surprised to discover that significant overlap in the errors is made by both LLMs and untrained humans, accounting for almost 30% of all errors.
arXiv Detail & Related papers (2024-05-10T11:44:05Z)
Can Large Language Models Play Games? A Case Study of A Self-Play Approach [61.15761840203145]
Large Language Models (LLMs) harness extensive data from the Internet, storing a broad spectrum of prior knowledge. Monte-Carlo Tree Search (MCTS) is a search algorithm that provides reliable decision-making solutions. This work introduces an innovative approach that bolsters LLMs with MCTS self-play to efficiently resolve turn-based zero-sum games.
arXiv Detail & Related papers (2024-03-08T19:16:29Z)
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact [46.32830393597601]
Large language models (LLMs) excel in natural language processing but demand intensive computation. This paper unveils a previously overlooked type of outliers in LLMs. We propose IntactKV to generate the KV cache of pivot tokens losslessly from the full-precision model.
arXiv Detail & Related papers (2024-03-02T16:05:26Z)
A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models [65.86149763739141]
We introduce LogicAsker, an automatic approach that comprehensively evaluates and improves the logical reasoning abilities of LLMs. We evaluate LogicAsker on six widely deployed LLMs, including GPT-3, ChatGPT, GPT-4, Bard, Vicuna, and Guanaco. The results show that test cases from LogicAsker can find logical reasoning failures in different LLMs with a rate of 25% - 94%.
arXiv Detail & Related papers (2024-01-01T13:53:53Z)
Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and Generation [15.77263269398368]
Large Language Models (LLMs) drive current AI breakthroughs. We shed the light on LLMs inner mechanisms through the lens of geometry. We derive interpretable geometrical features that can be extracted from any (pre-trained) LLM.
arXiv Detail & Related papers (2023-12-04T06:01:32Z)
Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve [21.55766758950951]
We make predictions about the strategies that large language models will adopt to solve next-word prediction tasks. We evaluate two LLMs on eleven tasks and find robust evidence that LLMs are influenced by probability. We conclude that we should not evaluate LLMs as if they are humans but should instead treat them as a distinct type of system.
arXiv Detail & Related papers (2023-09-24T13:35:28Z)
On Learning Latent Models with Multi-Instance Weak Supervision [57.18649648182171]
We consider a weakly supervised learning scenario where the supervision signal is generated by a transition function $sigma$ labels associated with multiple input instances. Our problem is met in different fields, including latent structural learning and neuro-symbolic integration.
arXiv Detail & Related papers (2023-06-23T22:05:08Z)
Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study [44.39031420687302]
Large language models (LLMs) are becoming attractive as few-shot reasoners to solve Natural Language (NL)-related tasks. We try to understand this by designing a benchmark to evaluate the structural understanding capabilities of LLMs. We propose $textitself-augmentation$ for effective structural prompting, such as critical value / range identification.
arXiv Detail & Related papers (2023-05-22T14:23:46Z)
SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs) We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. We evaluate SATLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.