Do Prompts Reshape Representations? An Empirical Study of Prompting Effects on Embeddings
- URL: http://arxiv.org/abs/2510.19694v1
- Date: Wed, 22 Oct 2025 15:43:40 GMT
- Title: Do Prompts Reshape Representations? An Empirical Study of Prompting Effects on Embeddings
- Authors: Cesar Gonzalez-Gutierrez, Dirk Hovy
- Abstract summary: We study the relationship between prompting and the quality of internal representations. Our findings challenge the assumption that more relevant prompts necessarily lead to better representations.
- Score: 24.17559821473242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompting is a common approach for leveraging LMs in zero-shot settings. However, the underlying mechanisms that enable LMs to perform diverse tasks without task-specific supervision remain poorly understood. Studying the relationship between prompting and the quality of internal representations can shed light on how pre-trained embeddings may support in-context task solving. In this empirical study, we conduct a series of probing experiments on prompt embeddings, analyzing various combinations of prompt templates for zero-shot classification. Our findings show that while prompting affects the quality of representations, these changes do not consistently correlate with the relevance of the prompts to the target task. This result challenges the assumption that more relevant prompts necessarily lead to better representations. We further analyze potential factors that may contribute to this unexpected behavior.
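The probing setup the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the synthetic Gaussian "embeddings" stand in for LM hidden states obtained under some prompt template, and the numpy-only logistic-regression probe stands in for whatever probe family the study uses. Probe accuracy is the representation-quality signal that would be compared across prompt templates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for sentence embeddings produced by an LM under one
# prompt template (illustrative; a real probe would use actual hidden states).
n, d, n_classes = 200, 32, 2
labels = rng.integers(0, n_classes, size=n)
class_means = rng.normal(size=(n_classes, d))
embeddings = class_means[labels] + rng.normal(scale=1.0, size=(n, d))

def train_linear_probe(X, y, lr=0.1, epochs=200):
    """Multinomial logistic-regression probe trained by plain gradient descent."""
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                         # one-hot targets
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = probs - Y                             # d(cross-entropy)/d(logits)
        W -= lr * X.T @ grad / len(X)
        b -= lr * grad.mean(axis=0)
    return W, b

W, b = train_linear_probe(embeddings, labels)
acc = ((embeddings @ W + b).argmax(axis=1) == labels).mean()
print(f"probe accuracy: {acc:.2f}")
```

Repeating this procedure on embeddings extracted under different prompt templates, and comparing the resulting probe accuracies, is the kind of analysis that lets the study ask whether more task-relevant prompts yield better representations.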
Related papers
- Farther the Shift, Sparser the Representation: Analyzing OOD Mechanisms in LLMs [100.02824137397464]
We investigate how Large Language Models adapt their internal representations when encountering inputs of increasing difficulty. We reveal a consistent and quantifiable phenomenon: as task difficulty increases, the last hidden states of LLMs become substantially sparser. This sparsity–difficulty relation is observable across diverse models and domains.
arXiv Detail & Related papers (2026-03-03T18:48:15Z) - Revisiting Prompt Sensitivity in Large Language Models for Text Classification: The Role of Prompt Underspecification [3.2059646106414967]
Large language models (LLMs) are widely used as zero-shot and few-shot classifiers. We study and compare the sensitivity of underspecified prompts and prompts that provide specific instructions. We find that underspecified prompts exhibit higher performance variance and lower logit values for relevant tokens, while instruction prompts suffer less from such problems.
arXiv Detail & Related papers (2026-02-04T07:59:28Z) - Evaluating Generalization and Representation Stability in Small LMs via Prompting, Fine-Tuning and Out-of-Distribution Prompts [2.377892000761193]
We investigate the generalization capabilities of small language models under two popular adaptation paradigms: few-shot prompting and supervised fine-tuning. Our findings highlight critical differences in how small models internalize and generalize knowledge under different adaptation strategies.
arXiv Detail & Related papers (2025-06-16T01:44:26Z) - IntCoOp: Interpretability-Aware Vision-Language Prompt Tuning [94.52149969720712]
IntCoOp learns to jointly align attribute-level inductive biases and class embeddings during prompt-tuning.
IntCoOp improves CoOp by 7.35% in average performance across 10 diverse datasets.
arXiv Detail & Related papers (2024-06-19T16:37:31Z) - On the Role of Attention in Prompt-tuning [90.97555030446563]
We study prompt-tuning for one-layer attention architectures in the setting of contextual mixture models.
We show that softmax-prompt-attention is provably more expressive than softmax-self-attention and linear-prompt-attention.
We also provide experiments that verify our theoretical insights on real datasets and demonstrate how prompt-tuning enables the model to attend to context-relevant information.
arXiv Detail & Related papers (2023-06-06T06:23:38Z) - Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes.
We propose a novel search strategy based on the greedy search to identify the near-optimal prompt for improving the performance of in-context learning.
arXiv Detail & Related papers (2023-03-23T12:28:25Z) - Task Formulation Matters When Learning Continually: A Case Study in Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z) - PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of Continuous Prompts [99.03864962014431]
Fine-tuning continuous prompts for target tasks has emerged as a compact alternative to full model fine-tuning.
In practice, we observe a "wayward" mismatch between the task a continuous prompt solves and the meaning of its nearest discrete-text neighbor.
arXiv Detail & Related papers (2021-12-15T18:55:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.