Related papers: Optimizing Recommendations using Fine-Tuned LLMs

URL: http://arxiv.org/abs/2505.06841v1
Date: Sun, 11 May 2025 04:53:34 GMT
Title: Optimizing Recommendations using Fine-Tuned LLMs
Authors: Prabhdeep Cheema, Erhan Guven
Abstract summary: This paper proposes an approach that generates synthetic datasets by modeling real-world user interactions. It allows users to express more information with complex preferences, such as mood, plot details, and thematic elements.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As digital media platforms strive to meet evolving user expectations, delivering highly personalized and intuitive movie and media recommendations has become essential for attracting and retaining audiences. Traditional systems often rely on keyword-based search and recommendation techniques, which limit users to specific keywords or combinations of keywords. This paper proposes an approach that generates synthetic datasets by modeling real-world user interactions, creating complex chat-style data reflective of diverse preferences. This allows users to express more information with complex preferences, such as mood, plot details, and thematic elements, in addition to conventional criteria like genre, title, and actor-based searches. In today's search space, users cannot write queries like "Looking for a fantasy movie featuring dire wolves, ideally set in a harsh frozen world with themes of loyalty and survival." Building on these contributions, we evaluate synthetic datasets for diversity and effectiveness in training and benchmarking models, particularly in areas often absent from traditional datasets. This approach enhances personalization and accuracy by enabling expressive and natural user queries. It establishes a foundation for the next generation of conversational AI-driven search and recommendation systems in digital entertainment.

Related papers

Synthetic Interaction Data for Scalable Personalization in Large Language Models [67.31884245564086]
We introduce a high-fidelity synthetic data generation framework called PersonaGym. Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process. We release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-turn personalized interaction trajectories.
arXiv Detail & Related papers (2026-02-12T20:41:22Z)
Reasoning-Based Personalized Generation for Users with Sparse Data [120.94029850012045]
We introduce GraSPer, a novel framework for enhancing personalized text generation under sparse context. GraSPer first augments user context by predicting items that the user would likely interact with in the future. With reasoning alignment, it then generates texts for these interactions to enrich the augmented context. In the end, it generates personalized outputs conditioned on both the real and synthetic histories.
arXiv Detail & Related papers (2026-01-31T01:54:23Z)
Towards Context-aware Reasoning-enhanced Generative Searching in E-commerce [61.03081096959132]
We propose a context-aware reasoning-enhanced generative search framework for better understanding the complicated context. Our approach achieves superior performance compared with strong baselines, validating its effectiveness for search-based recommendation.
arXiv Detail & Related papers (2025-10-19T16:46:11Z)
Investigating Thematic Patterns and User Preferences in LLM Interactions using BERTopic [4.087884819027264]
This study applies BERTopic to the lmsys-chat-1m dataset, a multilingual conversational corpus built from head-to-head evaluations of large language models (LLMs). The main objective is uncovering thematic patterns in these conversations and examining their relation to user preferences. We analysed relationships between topics and model preferences to identify trends in model-topic alignment.
arXiv Detail & Related papers (2025-10-08T21:13:44Z)
Agentic Personalized Fashion Recommendation in the Age of Generative AI: Challenges, Opportunities, and Evaluation [9.319920301747297]
This paper synthesizes both academic and industrial viewpoints to map the distinctive output space and stakeholder ecosystem of modern FaRS. We propose an Agentic Mixed-Modality Refinement pipeline, which fuses multimodal encoders with agentic LLM planners and dynamic retrieval.
arXiv Detail & Related papers (2025-08-04T12:22:25Z)
From Intent Discovery to Recognition with Topic Modeling and Synthetic Data [0.0]
Customer utterances are characterized by infrequent word co-occurrences and high term variability. We propose an agentic LLM framework for topic modeling and synthetic query generation. We show that LLM-generated intent descriptions and keywords can effectively substitute for human-curated versions.
arXiv Detail & Related papers (2025-05-16T12:20:31Z)
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users [111.56469697145519]
We propose Few-Shot Preference Optimization, which reframes reward modeling as a meta-learning problem. Under this framework, an LLM learns to quickly adapt to a user via a few labeled preferences from that user, constructing a personalized reward function for them. We generate over 1M synthetic personalized preferences using publicly available LLMs. We evaluate FSPO on personalized open-ended generation for up to 1,500 synthetic users across three domains: movie reviews, pedagogical adaptation based on educational background, and general question answering, along with a controlled human study.
arXiv Detail & Related papers (2025-02-26T17:08:46Z)
Personalized Multimodal Large Language Models: A Survey [127.9521218125761]
Multimodal Large Language Models (MLLMs) have become increasingly important due to their state-of-the-art performance and ability to integrate multiple data modalities. This paper presents a comprehensive survey on personalized multimodal large language models, focusing on their architecture, training methods, and applications.
arXiv Detail & Related papers (2024-12-03T03:59:03Z)
WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback [36.06000681394939]
We introduce WildFeedback, a novel framework that leverages in-situ user feedback during conversations with large language models (LLMs) to create preference datasets automatically. Our experiments demonstrate that LLMs fine-tuned on the WildFeedback dataset exhibit significantly improved alignment with user preferences.
arXiv Detail & Related papers (2024-08-28T05:53:46Z)
Towards Realistic Synthetic User-Generated Content: A Scaffolding Approach to Generating Online Discussions [17.96479268328824]
We investigate the feasibility of creating realistic, large-scale synthetic datasets of user-generated content. We propose a multi-step generation process, predicated on the idea of creating compact representations of discussion threads.
arXiv Detail & Related papers (2024-08-15T18:43:50Z)
Retrieval Augmentation via User Interest Clustering [57.63883506013693]
Industrial recommender systems are sensitive to the patterns of user-item engagement. We propose a novel approach that efficiently constructs user interest and facilitates low computational cost inference. Our approach has been deployed in multiple products at Meta, facilitating short-form video related recommendation.
arXiv Detail & Related papers (2024-08-07T16:35:10Z)
Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond [87.1712108247199]
Our goal is to establish a Unified paradigm for Multi-modal Personalization systems (UniMP). We develop a generic generative personalization framework that can handle a wide range of personalized needs. Our methodology enhances the capabilities of foundational language models for personalized tasks.
arXiv Detail & Related papers (2024-03-15T20:21:31Z)
InteraRec: Screenshot Based Recommendations Using Multimodal Large Language Models [0.6926105253992517]
We introduce a sophisticated and interactive recommendation framework denoted as InteraRec. InteraRec captures high-frequency screenshots of web pages as users navigate through a website. We demonstrate the effectiveness of InteraRec in providing users with valuable and personalized offerings.
arXiv Detail & Related papers (2024-02-26T17:47:57Z)
AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets [56.052803235932686]
We propose a novel automatic dataset synthesis approach that can generate both large-scale and high-quality recommendation dialogues. In doing so, we exploit: (i) rich personalized user profiles from traditional recommendation datasets, (ii) rich external knowledge from knowledge graphs, and (iii) the conversation ability contained in human-to-human conversational recommendation datasets.
arXiv Detail & Related papers (2023-06-16T05:27:14Z)
Talk the Walk: Synthetic Data Generation for Conversational Music Recommendation [62.019437228000776]
We present TalkWalk, which generates realistic high-quality conversational data by leveraging encoded expertise in widely available item collections. We generate over one million diverse conversations in a human-collected dataset.
arXiv Detail & Related papers (2023-01-27T01:54:16Z)
Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors [34.56323846959459]
Interactive recommender systems allow users to express intent, preferences, constraints, and contexts in a richer fashion. One challenge is inferring a user's semantic intent from the open-ended terms or attributes often used to describe a desired item. We develop a framework to learn a representation that captures the semantics of such attributes and connects them to user preferences and behaviors in recommender systems.
arXiv Detail & Related papers (2022-02-06T18:45:15Z)
