Related papers: Few-Shot and Training-Free Review Generation via Conversational Prompting

Few-Shot and Training-Free Review Generation via Conversational Prompting

URL: http://arxiv.org/abs/2509.20805v1
Date: Thu, 25 Sep 2025 06:36:08 GMT
Title: Few-Shot and Training-Free Review Generation via Conversational Prompting
Authors: Genki Kusano,
Abstract summary: Real-world applications often face few-shot and training-free situations.<n>We propose Conversational Prompting, a lightweight method that reformulates user reviews as multi-turn conversations.
Score: 2.0305676256390934
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Personalized review generation helps businesses understand user preferences, yet most existing approaches assume extensive review histories of the target user or require additional model training. Real-world applications often face few-shot and training-free situations, where only a few user reviews are available and fine-tuning is infeasible. It is well known that large language models (LLMs) can address such low-resource settings, but their effectiveness depends on prompt engineering. In this paper, we propose Conversational Prompting, a lightweight method that reformulates user reviews as multi-turn conversations. Its simple variant, Simple Conversational Prompting (SCP), relies solely on the user's own reviews, while the contrastive variant, Contrastive Conversational Prompting (CCP), inserts reviews from other users or LLMs as incorrect replies and then asks the model to correct them, encouraging the model to produce text in the user's style. Experiments on eight product domains and five LLMs showed that the conventional non-conversational prompt often produced reviews similar to those written by random users, based on text-based metrics such as ROUGE-L and BERTScore, and application-oriented tasks like user identity matching and sentiment analysis. In contrast, both SCP and CCP produced reviews much closer to those of the target user, even when each user had only two reviews. CCP brings further improvements when high-quality negative examples are available, whereas SCP remains competitive when such data cannot be collected. These results suggest that conversational prompting offers a practical solution for review generation under few-shot and training-free constraints.

Related papers

Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions [50.70965714314064]
Large Language Models (LLMs) are increasingly serving as personal assistants, where users share complex and diverse preferences over extended interactions.<n>This work proposes RealPref, a benchmark for evaluating realistic preference-following in personalized user-LLM interactions.
arXiv Detail & Related papers (2026-03-04T15:42:43Z)
User Feedback in Human-LLM Dialogues: A Lens to Understand Users But Noisy as a Learning Signal [58.43749783815486]
We study implicit user feedback in two user-LM interaction datasets.<n>We find that the contents of user feedback can improve model performance in short human-designed questions.<n>We also find that the usefulness of user feedback is largely tied to the quality of the user's initial prompt.
arXiv Detail & Related papers (2025-07-30T23:33:29Z)
A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations [112.81207927088117]
PersonaConvBench is a benchmark for evaluating personalized reasoning and generation in multi-turn conversations with large language models (LLMs)<n>We benchmark several commercial and open-source LLMs under a unified prompting setup and observe that incorporating personalized history yields substantial performance improvements.
arXiv Detail & Related papers (2025-05-20T09:13:22Z)
UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering [39.79275025010785]
name is a benchmark designed to evaluate the effectiveness of user embeddings in prompting large language models for personalization.<n>We conduct extensive experiments on various state-of-the-art methods for modeling user embeddings.
arXiv Detail & Related papers (2025-02-26T14:34:00Z)
Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning [74.56097953187994]
We present Trial-Error-Explain In-Context Learning (TICL), a tuning-free method that personalizes language models for text generation tasks.<n>TICL iteratively expands an in-context learning prompt via a trial-error-explain process, adding model-generated negative samples and explanations.<n>TICL achieves up to 91.5% against the previous state-of-the-art and outperforms competitive tuning-free baselines for personalized alignment tasks.
arXiv Detail & Related papers (2025-02-13T05:20:21Z)
Towards Empathetic Conversational Recommender Systems [77.53167131692]
We propose an empathetic conversational recommender (ECR) framework. ECR contains two main modules: emotion-aware item recommendation and emotion-aligned response generation. Our experiments on the ReDial dataset validate the efficacy of our framework in enhancing recommendation accuracy and improving user satisfaction.
arXiv Detail & Related papers (2024-08-30T15:43:07Z)
Aligning LLM Agents by Learning Latent Preference from User Edits [23.235995078727658]
We study interactive learning of language agents based on user edits made to the agent's output. We propose a learning framework, PRELUDE, that infers a description of the user's latent preference based on historic edit data. We introduce two interactive environments -- summarization and email writing, and use a GPT-4 simulated user for evaluation.
arXiv Detail & Related papers (2024-04-23T17:57:47Z)
RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models [17.782410287625645]
This paper proposes a benchmark, RefuteBench, covering tasks such as question answering, machine translation, and email writing. The evaluation aims to assess whether models can positively accept feedback in form of refuting instructions and whether they can consistently adhere to user demands throughout the conversation.
arXiv Detail & Related papers (2024-02-21T01:39:56Z)
Recommendations by Concise User Profiles from Review Text [24.408292545170944]
This work addresses the difficult and underexplored case of users who have very sparse interactions but post informative review texts.<n> feeding the full text of all reviews through an LLM has a weak signal-to-noise ratio and incurs high costs of processed tokens.<n>It presents a light-weight framework, called CUP, which first computes concise user profiles and feeds only these into the training of transformer-based recommenders.
arXiv Detail & Related papers (2023-11-02T15:31:12Z)
Approximating Online Human Evaluation of Social Chatbots with Prompting [11.657633779338724]
Existing evaluation metrics aim to automate offline user evaluation and approximate human judgment of pre-curated dialogs. We propose an approach to approximate online human evaluation leveraging large language models (LLMs) from the GPT family. We introduce a new Dialog system Evaluation framework based on Prompting (DEP), which enables a fully automatic evaluation pipeline.
arXiv Detail & Related papers (2023-04-11T14:45:01Z)
Automating App Review Response Generation [67.58267006314415]
We propose a novel approach RRGen that automatically generates review responses by learning knowledge relations between reviews and their responses. Experiments on 58 apps and 309,246 review-response pairs highlight that RRGen outperforms the baselines by at least 67.4% in terms of BLEU-4.
arXiv Detail & Related papers (2020-02-10T05:23:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.