PLUM: Adapting Pre-trained Language Models for Industrial-scale Generative Recommendations
- URL: http://arxiv.org/abs/2510.07784v1
- Date: Thu, 09 Oct 2025 05:01:05 GMT
- Title: PLUM: Adapting Pre-trained Language Models for Industrial-scale Generative Recommendations
- Authors: Ruining He, Lukasz Heldt, Lichan Hong, Raghunandan Keshavan, Shifan Mao, Nikhil Mehta, Zhengyang Su, Alicia Tsai, Yueqi Wang, Shao-Chuan Wang, Xinyang Yi, Lexi Baugher, Baykal Cakici, Ed Chi, Cristos Goodrow, Ningren Han, He Ma, Romer Rosales, Abby Van Soest, Devansh Tandon, Su-Lin Wu, Weilong Yang, Yilin Zheng
- Abstract summary: We introduce PLUM, a framework designed to adapt pre-trained Large Language Models for industry-scale recommendation tasks. PLUM consists of item tokenization using Semantic IDs, continued pre-training (CPT) on domain-specific data, and task-specific fine-tuning for recommendation objectives. For fine-tuning, we focus particularly on generative retrieval, where the model is directly trained to generate Semantic IDs of recommended items based on user context.
- Score: 8.96024282226161
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) pose a new paradigm of modeling and computation for information tasks. Recommendation systems are a critical application domain poised to benefit significantly from the sequence modeling capabilities and world knowledge inherent in these large models. In this paper, we introduce PLUM, a framework designed to adapt pre-trained LLMs for industry-scale recommendation tasks. PLUM consists of item tokenization using Semantic IDs, continued pre-training (CPT) on domain-specific data, and task-specific fine-tuning for recommendation objectives. For fine-tuning, we focus particularly on generative retrieval, where the model is directly trained to generate Semantic IDs of recommended items based on user context. We conduct comprehensive experiments on large-scale internal video recommendation datasets. Our results demonstrate that PLUM achieves substantial improvements for retrieval compared to a heavily optimized production model built with large embedding tables. We also present a scaling study for the model's retrieval performance, our learnings about CPT, a few enhancements to Semantic IDs, along with an overview of the training and inference methods that enable launching this framework to billions of users in YouTube.
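The abstract names two core mechanisms: tokenizing items into Semantic IDs and fine-tuning the LLM to generate the Semantic ID of the next recommended item. The sketch below illustrates both under loudly stated assumptions: PLUM's actual tokenizer, codebook sizes, vocabulary format, and training recipe are not described in this abstract, so the residual-quantization scheme, the random stand-in codebooks, the `<sid_*_*>` token format, and all constants here are hypothetical, not the paper's implementation.

```python
# Minimal sketch: Semantic ID tokenization via residual quantization, then
# framing retrieval as next-Semantic-ID generation. All names and sizes are
# illustrative assumptions; the codebooks are random stand-ins, whereas a
# real tokenizer would learn them (e.g., RQ-VAE style) from item embeddings.
import numpy as np

rng = np.random.default_rng(0)

NUM_LEVELS = 3        # hypothetical: residual quantization levels per item
CODEBOOK_SIZE = 256   # hypothetical: codewords per level
EMBED_DIM = 64        # hypothetical: dimensionality of item content embeddings

# Stand-in codebooks; one codebook per quantization level.
codebooks = rng.normal(size=(NUM_LEVELS, CODEBOOK_SIZE, EMBED_DIM))

def to_semantic_id(item_embedding: np.ndarray) -> list[int]:
    """Quantize a content embedding into a short tuple of discrete codes.

    Each level quantizes the residual left over from the previous level,
    so the code tuple is coarse-to-fine: early codes carry broad semantics.
    """
    residual = item_embedding
    codes = []
    for level in range(NUM_LEVELS):
        # Nearest codeword at this level by Euclidean distance.
        distances = np.linalg.norm(codebooks[level] - residual, axis=1)
        idx = int(np.argmin(distances))
        codes.append(idx)
        residual = residual - codebooks[level][idx]
    return codes

# Generative retrieval then becomes sequence generation: the fine-tuning
# target is the next item's Semantic ID, conditioned on a prompt built from
# the Semantic IDs of the user's history (plus any other user context).
history = [to_semantic_id(rng.normal(size=EMBED_DIM)) for _ in range(3)]
next_item = to_semantic_id(rng.normal(size=EMBED_DIM))

prompt_tokens = [f"<sid_{lvl}_{c}>" for sid in history for lvl, c in enumerate(sid)]
target_tokens = [f"<sid_{lvl}_{c}>" for lvl, c in enumerate(next_item)]
print("prompt:", " ".join(prompt_tokens))
print("target:", " ".join(target_tokens))
```

Because the code tuple is coarse-to-fine, decoding one code at a time lets a model narrow the catalog level by level, which is what makes generating IDs over a corpus of billions of items tractable compared with scoring every item.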
Related papers
- Reasoning to Rank: An End-to-End Solution for Exploiting Large Language Models for Recommendation [44.51582748617213]
Reasoning to Rank is an end-to-end training framework that internalizes recommendation utility optimization into the learning of step-by-step reasoning in language models. Our framework performs reasoning at the user-item level and employs reinforcement learning for end-to-end training of the language model.
arXiv Detail & Related papers (2026-02-13T02:22:48Z) - Efficient Model Selection for Time Series Forecasting via LLMs [52.31535714387368]
We propose to leverage Large Language Models (LLMs) as a lightweight alternative for model selection. Our method eliminates the need for explicit performance matrices by utilizing the inherent knowledge and reasoning capabilities of LLMs.
arXiv Detail & Related papers (2025-04-02T20:33:27Z) - Large Language Model as Universal Retriever in Industrial-Scale Recommender System [27.58251380192748]
We show that Large Language Models (LLMs) can function as universal retrievers, capable of handling multiple objectives within a generative retrieval framework. We also introduce matrix decomposition to boost model learnability, discriminability, and transferability. Our Universal Retrieval Model (URM) can adaptively generate a candidate set from tens of millions of candidates.
arXiv Detail & Related papers (2025-02-05T09:56:52Z) - SELF: Surrogate-light Feature Selection with Large Language Models in Deep Recommender Systems [51.09233156090496]
SELF is a surrogate-light feature selection method for deep recommender systems. SELF integrates semantic reasoning from Large Language Models with task-specific learning from surrogate models. Comprehensive experiments on three public datasets from real-world recommender platforms validate the effectiveness of SELF.
arXiv Detail & Related papers (2024-12-11T16:28:18Z) - Scaling New Frontiers: Insights into Large Recommendation Models [74.77410470984168]
Meta's generative recommendation model HSTU illustrates the scaling laws of recommendation systems by expanding parameters to thousands of billions. We conduct comprehensive ablation studies to explore the origins of these scaling laws. We offer insights into future directions for large recommendation models.
arXiv Detail & Related papers (2024-12-01T07:27:20Z) - Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - EmbedLLM: Learning Compact Representations of Large Language Models [28.49433308281983]
We propose EmbedLLM, a framework designed to learn compact vector representations of Large Language Models.
We introduce an encoder-decoder approach for learning such embeddings, along with a systematic framework to evaluate their effectiveness.
Empirical results show that EmbedLLM outperforms prior methods in model routing both in accuracy and latency.
arXiv Detail & Related papers (2024-10-03T05:43:24Z) - Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning [13.964106147449051]
Existing solutions concentrate on fine-tuning the pre-trained models on conventional image datasets.
We propose a novel and effective framework based on learning Visual Prompts (VPT) in pre-trained Vision Transformers (ViT).
We demonstrate that our new approximations with semantic information offer superior representational capability.
arXiv Detail & Related papers (2024-02-04T04:42:05Z) - ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation [43.270424225285105]
We focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks.
We propose Retrieval-enhanced Large Language models (ReLLa) for recommendation tasks in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-08-22T02:25:04Z) - Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations [63.19448893196642]
We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs.
By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users.
arXiv Detail & Related papers (2023-07-10T11:29:41Z) - A Survey on Large Language Models for Recommendation [77.91673633328148]
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP).
This survey presents a taxonomy that categorizes these models into two major paradigms: Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec).
arXiv Detail & Related papers (2023-05-31T13:51:26Z) - Self-supervised Learning for Large-scale Item Recommendations [18.19202958502061]
Large-scale recommender models find the most relevant items from huge catalogs.
With millions to billions of items in the corpus, users tend to provide feedback for a very small set of them.
We propose a multi-task self-supervised learning framework for large-scale item recommendations.
arXiv Detail & Related papers (2020-07-25T06:21:43Z) - Developing a Recommendation Benchmark for MLPerf Training and Inference [16.471395965484145]
We aim to define an industry-relevant recommendation benchmark for the MLPerf Training and Inference Suites.
The paper synthesizes the desirable modeling strategies for personalized recommendation systems.
We lay out desirable characteristics of recommendation model architectures and data sets.
arXiv Detail & Related papers (2020-03-16T17:13:00Z)