Instructional Prompt Optimization for Few-Shot LLM-Based Recommendations on Cold-Start Users
- URL: http://arxiv.org/abs/2509.09066v1
- Date: Thu, 11 Sep 2025 00:13:17 GMT
- Title: Instructional Prompt Optimization for Few-Shot LLM-Based Recommendations on Cold-Start Users
- Authors: Haowei Yang, Yushang Zhao, Sitao Min, Bo Su, Chao Yao, Wei Xu
- Abstract summary: The cold-start user issue compromises the effectiveness of recommender systems by limiting access to historical behavioral information. We introduce a context-conditioned prompt formulation method to optimize instructional prompts for a few-shot large language model (LLM). We show that prompt-based adaptation is one way to address cold-start recommendation issues in LLM-based pipelines.
- Score: 12.794692175339668
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The cold-start user issue compromises the effectiveness of recommender systems by limiting access to historical behavioral information. Optimizing instructional prompts for a few-shot large language model (LLM) offers an effective pipeline for recommender tasks. We introduce a context-conditioned prompt formulation method P(u, D_s) → R̂, where u is a cold-start user profile, D_s is a curated support set, and R̂ is the predicted ranked list of items. Based on systematic experimentation with transformer-based autoregressive LLMs (BioGPT, LLaMA-2, GPT-4), we provide empirical evidence that optimal exemplar injection and instruction structuring can significantly improve the precision@k and NDCG scores of such models in low-data settings. The pipeline uses token-level alignment and embedding-space regularization to achieve greater semantic fidelity. Our findings show that prompt composition is not merely syntactic but functional, as it directly controls attention scaling and decoder behavior during inference. This paper shows that prompt-based adaptation is one way to address cold-start recommendation issues in LLM-based pipelines.
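The abstract's formulation P(u, D_s) → R̂ and its precision@k/NDCG evaluation can be illustrated with a minimal sketch. The prompt layout, field names, and function signatures below are assumptions for illustration, not the paper's actual pipeline; the NDCG@k function follows the standard binary-relevance definition:

```python
import math

def build_prompt(user_profile, support_set, candidates):
    # Assemble a context-conditioned instructional prompt P(u, D_s):
    # an instruction, exemplars from the curated support set D_s, and
    # the cold-start user u with the candidate items to rank.
    # The template wording here is illustrative only.
    lines = ["You are a recommender. Rank the candidate items for the user."]
    for ex_profile, ex_items in support_set:
        lines.append(f"User: {ex_profile}\nRecommended: {', '.join(ex_items)}")
    lines.append(f"User: {user_profile}")
    lines.append(f"Candidates: {', '.join(candidates)}")
    lines.append("Recommended:")
    return "\n\n".join(lines)

def ndcg_at_k(ranked, relevant, k):
    # Standard NDCG@k with binary relevance: discounted gain of the
    # predicted ranking divided by the gain of an ideal ranking.
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0
```

A perfect ranking (all relevant items first) scores 1.0, which is what prompt optimization aims to push the LLM's output toward in low-data settings.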
Related papers
- Efficient Cold-Start Recommendation via BPE Token-Level Embedding Initialization with LLM [1.1049570075807806]
This paper presents an efficient cold-start recommendation strategy based on subword-level representations. We obtain fine-grained token-level vectors that are aligned with the BPE vocabulary. We show that using subword-aware embeddings yields better generalizability and is more interpretable.
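The subword-level initialization idea can be sketched as mean-pooling an item's BPE token embeddings to seed its cold-start representation. The tokenizer and embedding-table interfaces below are illustrative assumptions, not the paper's actual API:

```python
import numpy as np

def init_cold_item_embedding(text, bpe_tokenize, token_emb):
    # Initialize a cold-start item's embedding as the mean of the
    # BPE subword embeddings of its text. `bpe_tokenize` maps text to
    # token ids; `token_emb` maps a token id to its vector. Both are
    # stand-ins for whatever tokenizer/embedding table is in use.
    token_ids = bpe_tokenize(text)
    vecs = np.stack([token_emb[t] for t in token_ids])
    return vecs.mean(axis=0)
```

Because every item title decomposes into known subwords, this gives unseen items a non-random starting point in the existing embedding space.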
arXiv Detail & Related papers (2025-09-16T15:32:51Z) - RLHF Fine-Tuning of LLMs for Alignment with Implicit User Feedback in Conversational Recommenders [0.8246494848934447]
We propose a fine-tuning solution using reinforcement learning from human feedback (RLHF) to maximize implicit user feedback (IUF) in a multi-turn recommendation context. We show that our RLHF-fine-tuned models perform better in terms of top-$k$ recommendation accuracy, coherence, and user satisfaction.
arXiv Detail & Related papers (2025-08-07T11:36:55Z) - PITA: Preference-Guided Inference-Time Alignment for LLM Post-Training [9.093854840532062]
PITA is a novel framework that integrates preference feedback directly into the LLM's token generation. PITA learns a small preference-based guidance policy to modify token probabilities at inference time without fine-tuning. We evaluate PITA across diverse tasks, including mathematical reasoning and sentiment classification.
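Modifying token probabilities at inference time without fine-tuning can be sketched as adding a guidance term to the frozen model's logits before renormalizing. This is a simplified illustration of the general mechanism, not PITA's actual guidance policy; the `beta` weight is an assumed knob:

```python
import numpy as np

def guided_token_probs(base_logits, guidance_scores, beta=1.0):
    # Inference-time reweighting: add a preference-guidance score to
    # the frozen LM's next-token logits, then softmax-normalize.
    # beta controls guidance strength; beta=0 recovers the base model.
    logits = np.asarray(base_logits) + beta * np.asarray(guidance_scores)
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()
```

The base model's weights are never touched; only its output distribution is steered, which is what makes this family of methods fine-tuning-free.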
arXiv Detail & Related papers (2025-07-26T21:46:32Z) - Revisiting Prompt Engineering: A Comprehensive Evaluation for LLM-based Personalized Recommendation [2.3650193864974978]
Large language models (LLMs) can perform recommendation tasks by taking prompts written in natural language as input. This paper focuses on a single-user setting, where no information from other users is used.
arXiv Detail & Related papers (2025-07-17T20:26:00Z) - Debiasing Online Preference Learning via Preference Feature Preservation [64.55924745257951]
Recent preference learning frameworks simplify human preferences with binary pairwise comparisons and scalar rewards. This can bias large language models' responses toward the most-preferred features, a bias that is exacerbated over iterations of online preference learning. We propose Preference Feature Preservation to maintain the distribution of human preference features and utilize such rich signals throughout the online preference learning process.
arXiv Detail & Related papers (2025-06-06T13:19:07Z) - Training Large Recommendation Models via Graph-Language Token Alignment [53.3142545812349]
We propose a novel framework to train Large Recommendation models via Graph-Language Token Alignment (GLTA). By aligning item and user nodes from the interaction graph with pretrained LLM tokens, GLTA effectively leverages the reasoning abilities of LLMs. Furthermore, we introduce Graph-Language Logits Matching (GLLM) to optimize token alignment for end-to-end item prediction.
arXiv Detail & Related papers (2025-02-26T02:19:10Z) - In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. To handle these challenges, a direct solution is to generate "high-confidence" data from unsupervised downstream tasks. We propose a novel approach, the pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z) - Laser: Parameter-Efficient LLM Bi-Tuning for Sequential Recommendation with Collaborative Information [76.62949982303532]
We propose a parameter-efficient Large Language Model Bi-Tuning framework for sequential recommendation with collaborative information (Laser).
In our Laser, the prefix is utilized to incorporate user-item collaborative information and adapt the LLM to the recommendation task, while the suffix converts the output embeddings of the LLM from the language space to the recommendation space for the follow-up item recommendation.
M-Former is a lightweight MoE-based querying transformer that uses a set of query experts to integrate diverse user-specific collaborative information encoded by frozen ID-based sequential recommender systems.
arXiv Detail & Related papers (2024-09-03T04:55:03Z) - Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
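The two-stage flow described above — a small policy model produces an auxiliary stimulus that steers a black-box LLM — can be sketched in a few lines. The hint format and function names are illustrative assumptions, not the paper's actual interface:

```python
def apply_directional_stimulus(instance, policy_fn, llm_fn):
    # Sketch of Directional Stimulus Prompting: a small tunable policy
    # generates an auxiliary hint (the "directional stimulus"), which
    # is prepended to the input before querying the black-box LLM.
    # policy_fn and llm_fn are stand-ins for the trained policy model
    # and the LLM API call, respectively.
    stimulus = policy_fn(instance)
    prompt = f"Hint: {stimulus}\n\n{instance}"
    return llm_fn(prompt)
```

Only the small policy model is trained; the large model stays a black box reached through its normal prompt interface, which is what makes the approach applicable to API-only LLMs.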
arXiv Detail & Related papers (2023-02-22T17:44:15Z) - A Multi-Strategy based Pre-Training Method for Cold-Start Recommendation [28.337475919795008]
Cold-start problem is a fundamental challenge for recommendation tasks.
Recent self-supervised learning (SSL) on Graph Neural Networks (GNNs) model, PT-GNN, pre-trains the GNN model to reconstruct the cold-start embeddings.
We propose a multi-strategy based pre-training method for cold-start recommendation (MPT), which extends PT-GNN from the perspective of model architecture and pretext tasks.
arXiv Detail & Related papers (2021-12-04T08:11:55Z) - Learning to Learn a Cold-start Sequential Recommender [70.5692886883067]
The cold-start recommendation is an urgent problem in contemporary online applications.
We propose a meta-learning based cold-start sequential recommendation framework called metaCSR.
metaCSR holds the ability to learn the common patterns from regular users' behaviors.
arXiv Detail & Related papers (2021-10-18T08:11:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.