Contrastive Prompting Enhances Sentence Embeddings in LLMs through Inference-Time Steering
- URL: http://arxiv.org/abs/2505.12831v1
- Date: Mon, 19 May 2025 08:19:27 GMT
- Title: Contrastive Prompting Enhances Sentence Embeddings in LLMs through Inference-Time Steering
- Authors: Zifeng Cheng, Zhonghui Wang, Yuchen Fu, Zhiwei Jiang, Yafeng Yin, Cong Wang, Qing Gu
- Abstract summary: We propose a Contrastive Prompting (CP) method that introduces an extra auxiliary prompt to elicit better sentence embeddings. By contrasting with the auxiliary prompt, CP can steer existing prompts to encode the core semantics of the sentence. Our method can improve the performance of existing prompt-based methods across different large language models.
- Score: 12.982890198455701
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extracting sentence embeddings from large language models (LLMs) is a practical direction, as it requires neither additional data nor fine-tuning. Previous studies usually focus on prompt engineering to guide LLMs to encode the core semantic information of the sentence into the embedding of the last token. However, the last token in these methods still encodes an excess of non-essential information, such as stop words, limiting its encoding capacity. To this end, we propose a Contrastive Prompting (CP) method that introduces an extra auxiliary prompt to elicit better sentence embeddings. By contrasting with the auxiliary prompt, CP can steer existing prompts to encode the core semantics of the sentence, rather than non-essential information. CP is a plug-and-play inference-time intervention method that can be combined with various prompt-based methods. Extensive experiments on Semantic Textual Similarity (STS) tasks and downstream classification tasks demonstrate that our method can improve the performance of existing prompt-based methods across different LLMs. Our code will be released at https://github.com/zifengcheng/CP.
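As a rough illustration of how such inference-time steering can be implemented, the sketch below extracts the last-token hidden state under a main prompt and under an auxiliary prompt, then combines them. The auxiliary prompt wording, the combination rule, and the scaling factor `alpha` are illustrative assumptions, not the paper's exact formulation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any decoder-only LM works the same way; gpt2 keeps the example small.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def last_token_state(prompt: str) -> torch.Tensor:
    """Final-layer hidden state of the last token of `prompt`."""
    ids = tok(prompt, return_tensors="pt").input_ids
    out = lm(ids, output_hidden_states=True)
    return out.hidden_states[-1][0, -1]  # (hidden_dim,)

def contrastive_embedding(sentence: str, alpha: float = 0.4) -> torch.Tensor:
    # Main prompt: a PromptEOL-style "summarise in one word" template.
    main = f'This sentence : "{sentence}" means in one word:"'
    # Auxiliary prompt (assumed): deliberately elicits non-essential surface info.
    aux = f'This sentence : "{sentence}" contains stop words such as:"'
    h_main, h_aux = last_token_state(main), last_token_state(aux)
    # Push the embedding away from what the auxiliary prompt captures.
    return h_main + alpha * (h_main - h_aux)
```

Cosine similarity between two such embeddings can then be used for STS-style comparison, exactly as with the underlying prompt-based method.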
Related papers
- Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding [71.01099784480597]
Large language models (LLMs) excel at a range of tasks through in-context learning (ICL). We introduce In-Context Contrastive Decoding (ICCD), a novel method that emphasizes input-label mapping by contrasting the output distributions between positive and negative in-context examples.
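A loose sketch of what contrasting output distributions between positive and negative demonstrations can look like; the prompt construction and the weight `beta` are assumptions, not the paper's exact formulation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def next_token_logits(prompt: str) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt").input_ids
    return lm(ids).logits[0, -1]  # logits over the next token

def contrastive_logits(query: str, pos_demos: list[str],
                       neg_demos: list[str], beta: float = 1.0) -> torch.Tensor:
    """Boost what the correctly labelled demonstrations support and damp
    what the wrongly labelled (negative) demonstrations support."""
    pos = next_token_logits("\n".join(pos_demos) + "\n" + query)
    neg = next_token_logits("\n".join(neg_demos) + "\n" + query)
    return pos - beta * neg
```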
arXiv Detail & Related papers (2025-02-19T14:04:46Z) - Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs [10.213016513358598]
The Token Prepending (TP) technique prepends each layer's decoded sentence embedding to the beginning of the sentence in the next layer's input. TP is a plug-and-play, training-free technique, which means it can be seamlessly integrated with prompt-based sentence embedding methods.
arXiv Detail & Related papers (2024-12-16T08:42:00Z) - SentenceVAE: Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context [49.9628075245959]
We present Sentence Variational Autoencoder (SentenceVAE), which includes a Sentence Encoder to compress multiple tokens in a sentence into a single token, and a Sentence Decoder to reconstruct it.
The proposed method can accelerate inference speed by 204-365%, reduce perplexity (PPL) to 46-75% of its original metric, and decrease memory overhead by 86-91% for the equivalent context length.
arXiv Detail & Related papers (2024-08-01T15:45:19Z) - Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding [54.532578213126065]
Most document understanding methods preserve all tokens within sub-images and treat them equally.
This neglects their different informativeness and leads to a significant increase in the number of image tokens.
We propose Token-level Correlation-guided Compression, a parameter-free and plug-and-play methodology to optimize token processing.
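The summary does not spell out the mechanism, but a parameter-free, correlation-based pruning step might look roughly like the following; the cosine threshold and the greedy keep-first strategy are illustrative choices, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def prune_redundant_tokens(tokens: torch.Tensor, threshold: float = 0.9):
    """Drop an image token when it is nearly duplicated (high cosine
    similarity) by a token that was already kept.
    tokens: (num_tokens, dim) patch/token embeddings."""
    normed = F.normalize(tokens, dim=-1)
    kept: list[int] = []
    for i in range(tokens.size(0)):
        if all(float(normed[i] @ normed[j]) < threshold for j in kept):
            kept.append(i)
    return tokens[kept], kept
```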
arXiv Detail & Related papers (2024-07-19T16:11:15Z) - Multi-Prompting Decoder Helps Better Language Understanding [23.084538462710125]
We propose a simple yet effective Multi-Prompting Decoder (MPD) framework for MaaS adaptation.
Our method achieves new state-of-the-art results on multiple natural language understanding datasets under the few-shot setting.
arXiv Detail & Related papers (2024-06-10T13:58:46Z) - Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs [65.2379940117181]
We introduce code prompting, a chain of prompts that transforms a natural language problem into code.
We find that code prompting exhibits a high-performance boost for multiple LLMs.
Our analysis of GPT 3.5 reveals that the code formatting of the input problem is essential for performance improvement.
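As a toy illustration (the template and example are ours, not taken from the paper), a conditional-reasoning question can be restated as code, keeping the original text as comments and the conditions as executable logic:

```python
# Natural-language problem, kept verbatim as comments:
#   "Visitors enter the museum for free if they are under 18 or if it is
#    the first Sunday of the month. Alice is 20 and visits on the first
#    Sunday of the month. Does she have to pay?"

age = 20
first_sunday_of_month = True

free_entry = (age < 18) or first_sunday_of_month
print("Alice enters for free" if free_entry else "Alice has to pay")
```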
arXiv Detail & Related papers (2024-01-18T15:32:24Z) - Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding [46.485363806259265]
Speculative Decoding has emerged as a novel decoding paradigm for Large Language Models (LLMs) inference.
In each decoding step, this method first drafts several future tokens efficiently and then verifies them in parallel.
This paper presents a comprehensive overview and analysis of this promising decoding paradigm.
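A minimal draft-then-verify step, simplified here to greedy acceptance; the surveyed paradigm typically verifies drafts with rejection sampling so that the target model's output distribution is preserved exactly.

```python
import torch

@torch.no_grad()
def speculative_step(draft_lm, target_lm, ids: torch.Tensor, k: int = 4) -> torch.Tensor:
    """ids: (1, seq_len) token ids; both arguments are causal LMs whose
    forward pass returns `.logits` of shape (1, seq_len, vocab)."""
    draft = ids
    for _ in range(k):  # cheaply draft k future tokens with the small model
        nxt = draft_lm(draft).logits[:, -1].argmax(dim=-1, keepdim=True)
        draft = torch.cat([draft, nxt], dim=-1)
    # Verify every drafted token with a single forward pass of the target model.
    greedy = target_lm(draft).logits.argmax(dim=-1)  # target's choice at each position
    n, accepted = ids.size(1), 0
    for i in range(k):  # accept the longest prefix of the draft the target agrees with
        if greedy[0, n - 1 + i] == draft[0, n + i]:
            accepted += 1
        else:
            break
    kept = draft[:, : n + accepted]
    # Append the target's own token at the first disagreement (or after a full match).
    correction = greedy[:, n - 1 + accepted : n + accepted]
    return torch.cat([kept, correction], dim=-1)
```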
arXiv Detail & Related papers (2024-01-15T17:26:50Z) - Discrete Prompt Compression with Reinforcement Learning [2.664293070994717]
Compressed prompts aid instruction-tuned language models (LMs) in overcoming context window limitations and reducing computational costs.
Existing methods, which are primarily based on training embeddings, face various challenges associated with interpretability, the fixed number of embedding tokens, reusability across different LMs, and inapplicability when interacting with black-box APIs.
This study proposes prompt compression with reinforcement learning (PCRL), which is a discrete prompt compression method that addresses these issues.
arXiv Detail & Related papers (2023-08-17T03:10:17Z) - Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt [71.77504700496004]
Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts.
To boost the transferability of the pre-trained models, recent works adopt fixed or learnable prompts.
However, how and what prompts can improve inference performance remains unclear.
arXiv Detail & Related papers (2022-05-23T07:51:15Z) - Learning to Ask Conversational Questions by Optimizing Levenshtein Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
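For reference, the quantity RISE optimizes is the standard edit distance; a compact dynamic-programming implementation is shown below (character-level for brevity, whereas the paper's editing actions operate on tokens).

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

assert levenshtein("kitten", "sitting") == 3
```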
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.