PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of
Continuous Prompts
- URL: http://arxiv.org/abs/2112.08348v1
- Date: Wed, 15 Dec 2021 18:55:05 GMT
- Title: PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of
Continuous Prompts
- Authors: Daniel Khashabi, Shane Lyu, Sewon Min, Lianhui Qin, Kyle Richardson,
Sameer Singh, Sean Welleck, Hannaneh Hajishirzi, Tushar Khot, Ashish
Sabharwal, Yejin Choi
- Abstract summary: Fine-tuning continuous prompts for target tasks has emerged as a compact alternative to full model fine-tuning.
In practice, we observe a "wayward" behavior between the task solved by continuous prompts and their nearest-neighbor discrete projections.
- Score: 99.03864962014431
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-tuning continuous prompts for target tasks has recently emerged as a
compact alternative to full model fine-tuning. Motivated by these promising
results, we investigate the feasibility of extracting a discrete (textual)
interpretation of continuous prompts that is faithful to the problem they
solve. In practice, we observe a "wayward" behavior between the task solved by
continuous prompts and their nearest neighbor discrete projections: we can find
continuous prompts that solve a task while projecting to an arbitrary text
(e.g., the definition of a different or even a contradictory task), yet remain
within a very small (2%) margin of the best continuous prompt of the same size
for the task. We provide intuitions behind this odd and surprising behavior, as
well as extensive empirical analyses quantifying the effect of various
parameters. For instance, for larger model sizes we observe higher waywardness,
i.e., we can find prompts that more closely map to any arbitrary text with a
smaller drop in accuracy. These findings have important implications relating
to the difficulty of faithfully interpreting continuous prompts and their
generalization across models and tasks, providing guidance for future progress
in prompting language models.
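The nearest-neighbor projection at the heart of this finding is a per-position lookup from the tuned prompt vectors into the model's token-embedding matrix. Below is a minimal sketch of that operation in PyTorch; the cosine-similarity metric and the tensor names (`prompt`, `embedding_matrix`) are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def project_to_discrete(prompt: torch.Tensor,
                        embedding_matrix: torch.Tensor) -> torch.Tensor:
    """Map each continuous prompt vector to its nearest vocabulary token.

    prompt:           (prompt_len, hidden_dim) tuned continuous prompt
    embedding_matrix: (vocab_size, hidden_dim) model token embeddings
    returns:          (prompt_len,) ids of the nearest discrete tokens
    """
    # Cosine similarity between every prompt position and every token
    # embedding (cosine is one common metric; assumed, not from the paper).
    sims = F.normalize(prompt, dim=-1) @ F.normalize(embedding_matrix, dim=-1).T
    return sims.argmax(dim=-1)  # nearest token id per prompt position
```

Waywardness is the observation that the token sequence recovered this way can read as an arbitrary, even contradictory, task description while the continuous prompt itself still solves the original task.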
Related papers
- Eliciting Textual Descriptions from Representations of Continuous Prompts [11.489611613744724]
We propose InSPEcT, a new approach to interpret continuous prompts that elicits textual descriptions from their representations during model inference.
We show our method often yields accurate task descriptions which become more faithful as task performance increases.
InSPEcT can be leveraged to debug unwanted properties in continuous prompts and inform developers on ways to mitigate them.
arXiv Detail & Related papers (2024-10-15T14:46:11Z) - Semantic Prompting with Image-Token for Continual Learning [7.5140668729696145]
I-Prompt is a task-agnostic approach to eliminate task prediction.
Our method achieves competitive performance on four benchmarks.
We demonstrate the superiority of our method across various scenarios through extensive experiments.
arXiv Detail & Related papers (2024-03-18T07:43:14Z) - Continuous Prompt Generation from Linear Combination of Discrete Prompt
Embeddings [0.0]
We present a novel method of constructing continuous prompts via discrete prompt embeddings and evaluate improvements to continuous prompt interpretability and inference accuracy.
For a set of manually designed discrete prompts $\mathcal{D}$, each of which we tokenize and embed into tensor form, we train a model to predict weights such that the linear combination of those prompt embeddings yields higher performance on natural language understanding tasks (a minimal sketch of this construction appears after this list).
arXiv Detail & Related papers (2023-12-16T05:02:06Z) - Topic-DPR: Topic-based Prompts for Dense Passage Retrieval [6.265789210037749]
We present Topic-DPR, a dense passage retrieval model that uses topic-based prompts.
We introduce a novel positive and negative sampling strategy, leveraging semi-structured data to boost dense retrieval efficiency.
arXiv Detail & Related papers (2023-10-10T13:45:24Z) - Zero-Shot Continuous Prompt Transfer: Generalizing Task Semantics Across Language Models [24.114485240965383]
We propose a zero-shot continuous prompt transfer method, where source prompts are encoded into relative space and the corresponding target prompts are searched for transferring to target models.
Experimental results confirm the effectiveness of our method, showing that 'task semantics' in continuous prompts can be generalized across various language models.
arXiv Detail & Related papers (2023-10-02T23:12:21Z) - Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z) - Task Formulation Matters When Learning Continually: A Case Study in
Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z) - Probing as Quantifying the Inductive Bias of Pre-trained Representations [99.93552997506438]
We present a novel framework for probing where the goal is to evaluate the inductive bias of representations for a particular task.
We apply our framework to a series of token-, arc-, and sentence-level tasks.
arXiv Detail & Related papers (2021-10-15T22:01:16Z) - Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, decide whether there exist any semantic discrepancies in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z) - Pareto Probing: Trading Off Accuracy for Complexity [87.09294772742737]
We argue for a probe metric that reflects the fundamental trade-off between probe complexity and performance.
Our experiments with dependency parsing reveal a wide gap in syntactic knowledge between contextual and non-contextual representations.
arXiv Detail & Related papers (2020-10-05T17:27:31Z)
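For the "Continuous Prompt Generation from Linear Combination of Discrete Prompt Embeddings" entry above, the construction is a learned weighted sum over the embeddings of a few hand-written prompts. The sketch below (PyTorch) assumes all discrete prompts share the same token length and uses a softmax over learned logits to produce the mixing weights; both choices are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class PromptMixer(nn.Module):
    """Continuous prompt as a weighted sum of discrete prompt embeddings.

    discrete_prompt_embeds: (num_prompts, prompt_len, hidden_dim), the
    embedded manually designed prompts D; only the mixing weights train.
    """
    def __init__(self, discrete_prompt_embeds: torch.Tensor):
        super().__init__()
        self.register_buffer("prompts", discrete_prompt_embeds)  # frozen
        self.logits = nn.Parameter(torch.zeros(discrete_prompt_embeds.size(0)))

    def forward(self) -> torch.Tensor:
        weights = torch.softmax(self.logits, dim=0)   # (num_prompts,)
        # Weighted sum over the prompt axis -> (prompt_len, hidden_dim)
        return torch.einsum("n,nld->ld", weights, self.prompts)
```

Because the resulting prompt lies in the span of interpretable discrete prompts, it is easier to read off what the tuned prompt is doing than with a freely optimized continuous prompt.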
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.