Selecting User Histories to Generate LLM Users for Cold-Start Item Recommendation
- URL: http://arxiv.org/abs/2511.21989v1
- Date: Thu, 27 Nov 2025 00:17:21 GMT
- Title: Selecting User Histories to Generate LLM Users for Cold-Start Item Recommendation
- Authors: Nachiket Subbaraman, Jaskinder Sarai, Aniruddh Nath, Lichan Hong, Lukasz Heldt, Li Wei, Zhe Zhao,
- Abstract summary: We develop a reinforcement learning framework that trains a policy to select users for augmentation. Experiments on Amazon Product Review datasets show substantial gains in cold-start item recall.
- Score: 7.689185348031334
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning, generalization, and simulating human-like behavior across a wide range of tasks. These strengths present new opportunities to enhance traditional recommendation systems (RS), especially in the cold-start item scenario where newly introduced items lack interactions. Existing works have used LLMs to address cold-start issues in traditional RS through data augmentation, but they have limitations. One recent work directly addresses this issue by prompting LLMs to generate augmented interaction data between randomly sampled users and cold-start items. Then, they train the traditional RS with augmented data, incorporating collaborative signals for cold-start items. Although they use LLMs to provide cold-start items with feedback, they use partial user histories, which does not allow the LLM to fully emulate the user. Furthermore, randomly selecting users is not optimal for augmentation. To address these challenges, we leverage the LLM as a user and develop a reinforcement learning (RL) framework that trains a policy to select users for augmentation, optimizing for cold-start item performance after augmented training. The policy model learns to select users for cold-start item data augmentation based on their behavioral features and histories. To optimize user selection for cold-start item performance, we employ a policy gradient method that updates the policy in the direction of actions that lead to high rewards. Experiments on Amazon Product Review datasets show substantial gains in cold-start item recall, demonstrating the effectiveness of our method as a scalable, serving-efficient augmentation strategy for modern RS.
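The policy-gradient idea in the abstract can be illustrated with a minimal REINFORCE sketch. This is not the authors' implementation: the names `select_prob`, `toy_reward`, and `train_policy`, the two-dimensional user features, and the Bernoulli selection policy are all assumptions made for illustration. In particular, `toy_reward` is a hypothetical stand-in for the real reward, which according to the abstract is cold-start item performance after training the recommender on LLM-generated interactions.

```python
import math
import random


def select_prob(weights, features):
    """Bernoulli policy: probability of selecting a user for augmentation."""
    z = sum(w * x for w, x in zip(weights, features))
    z = max(-30.0, min(30.0, z))  # clamp the logit for numerical safety
    return 1.0 / (1.0 + math.exp(-z))


def toy_reward(selected_users):
    """Hypothetical stand-in for cold-start recall after augmented training.

    Here the reward is simply the mean of the first feature over the
    selected users, so selecting "high-signal" users pays off.
    """
    if not selected_users:
        return 0.0
    return sum(u[0] for u in selected_users) / len(selected_users)


def train_policy(users, steps=500, lr=0.5, seed=0):
    """REINFORCE with a moving-average baseline over user-selection actions."""
    rng = random.Random(seed)
    weights = [0.0] * len(users[0])
    baseline = 0.0
    for _ in range(steps):
        probs = [select_prob(weights, u) for u in users]
        # Sample one select/skip action per user from the current policy.
        actions = [1 if rng.random() < p else 0 for p in probs]
        reward = toy_reward([u for u, a in zip(users, actions) if a])
        advantage = reward - baseline
        baseline += 0.1 * (reward - baseline)
        # For a Bernoulli policy with a logistic link, the gradient of the
        # log-probability of action a with respect to the logit is (a - p).
        for u, a, p in zip(users, actions, probs):
            for j, x in enumerate(u):
                weights[j] += lr * advantage * (a - p) * x / len(users)
    return weights


if __name__ == "__main__":
    # Ten "high-signal" and ten "low-signal" users; second feature is a bias.
    users = [[1.0, 1.0]] * 10 + [[-1.0, 1.0]] * 10
    w = train_policy(users)
    print("learned weights:", w)
    print("P(select high-signal user):", select_prob(w, [1.0, 1.0]))
    print("P(select low-signal user):", select_prob(w, [-1.0, 1.0]))
```

In this toy setting the update pushes the policy toward users whose selection raised the reward above the running baseline, which mirrors "updating the policy in the direction of actions that lead to high rewards". In a real pipeline each reward evaluation would presumably be far more expensive, since it involves LLM generation and recommender training.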
Related papers
- Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions [50.70965714314064]
Large Language Models (LLMs) are increasingly serving as personal assistants, where users share complex and diverse preferences over extended interactions. This work proposes RealPref, a benchmark for evaluating realistic preference-following in personalized user-LLM interactions.
arXiv Detail & Related papers (2026-03-04T15:42:43Z)
- Are Large Language Models Really Effective for Training-Free Cold-Start Recommendation? [3.446483216812751]
This study focuses on training-free recommendation, where no task-specific training is performed. Large language models (LLMs) have recently been explored as a promising solution, and numerous studies have been proposed. We present the first controlled experiments that systematically evaluate these two approaches in the same setting.
arXiv Detail & Related papers (2025-12-15T05:47:07Z)
- DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation [83.21140655248624]
Large language models (LLMs) have been introduced into recommender systems (RSs). We propose DeepRec, a novel LLM-based RS that enables autonomous multi-turn interactions between LLMs and TRMs for deep exploration of the item space. Experiments on public datasets demonstrate that DeepRec significantly outperforms both traditional and LLM-based baselines.
arXiv Detail & Related papers (2025-05-22T15:49:38Z)
- FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users [111.56469697145519]
We propose Few-Shot Preference Optimization, which reframes reward modeling as a meta-learning problem. Under this framework, an LLM learns to quickly adapt to a user via a few labeled preferences from that user, constructing a personalized reward function for them. We generate over 1M synthetic personalized preferences using publicly available LLMs. We evaluate FSPO on personalized open-ended generation for up to 1,500 synthetic users across three domains: movie reviews, pedagogical adaptation based on educational background, and general question answering, along with a controlled human study.
arXiv Detail & Related papers (2025-02-26T17:08:46Z)
- GenUP: Generative User Profilers as In-Context Learners for Next POI Recommender Systems [8.789624590579903]
Point-of-Interest (POI) recommendation systems often lack transparency, interpretability, and scrutability. Existing methods often address this by leveraging similar trajectories from other users. We propose a method that generates natural language (NL) user profiles from large-scale, location-based social network (LBSN) check-ins.
arXiv Detail & Related papers (2024-10-28T00:39:22Z)
- LLM-ESR: Large Language Models Enhancement for Long-tailed Sequential Recommendation [58.04939553630209]
In real-world systems, most users interact with only a handful of items, while the majority of items are seldom consumed.
These two issues, known as the long-tail user and long-tail item challenges, often pose difficulties for existing Sequential Recommendation systems.
We propose the Large Language Models Enhancement framework for Sequential Recommendation (LLM-ESR) to address these challenges.
arXiv Detail & Related papers (2024-05-31T07:24:42Z)
- Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System [19.8986219047121]
Collaborative filtering recommender systems (CF-RecSys) have shown successive results in enhancing the user experience on social media and e-commerce platforms.
Recent strategies have focused on leveraging modality information of user/items based on pre-trained modality encoders and Large Language Models.
We propose an efficient All-round LLM-based Recommender system, called A-LLMRec, that excels not only in the cold scenario but also in the warm scenario.
arXiv Detail & Related papers (2024-04-17T13:03:07Z)
- Large Language Model Simulator for Cold-Start Recommendation [45.34030399042562]
Cold items rely solely on content features, limiting their recommendation performance and impacting user experience and revenue. Current models generate synthetic behavioral embeddings from content features but fail to address the core issue: the absence of historical behavior data. We introduce the LLM Simulator framework, which leverages large language models to simulate user interactions for cold items.
arXiv Detail & Related papers (2024-02-14T13:45:06Z)
- Learning to Learn a Cold-start Sequential Recommender [70.5692886883067]
The cold-start recommendation is an urgent problem in contemporary online applications.
We propose a meta-learning based cold-start sequential recommendation framework called metaCSR.
metaCSR holds the ability to learn the common patterns from regular users' behaviors.
arXiv Detail & Related papers (2021-10-18T08:11:24Z)