IDGenRec: LLM-RecSys Alignment with Textual ID Learning
- URL: http://arxiv.org/abs/2403.19021v2
- Date: Fri, 17 May 2024 04:05:18 GMT
- Title: IDGenRec: LLM-RecSys Alignment with Textual ID Learning
- Authors: Juntao Tan, Shuyuan Xu, Wenyue Hua, Yingqiang Ge, Zelong Li, Yongfeng Zhang
- Abstract summary: We propose IDGen, representing each item as a unique, concise, semantically rich, platform-agnostic textual ID.
We show that IDGen consistently surpasses existing models in sequential recommendation under the standard experimental setting.
Results show that the zero-shot performance of the pre-trained foundation model is comparable to or even better than some traditional recommendation models.
- Score: 48.018397048791115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative recommendation based on Large Language Models (LLMs) has transformed the traditional ranking-based recommendation style into a text-to-text generation paradigm. However, in contrast to standard NLP tasks that inherently operate on human vocabulary, current research in generative recommendation struggles to effectively encode recommendation items within the text-to-text framework using concise yet meaningful ID representations. To better align LLMs with recommendation needs, we propose IDGen, representing each item as a unique, concise, semantically rich, platform-agnostic textual ID using human language tokens. This is achieved by training a textual ID generator alongside the LLM-based recommender, enabling seamless integration of personalized recommendations into natural language generation. Notably, as user history is expressed in natural language and decoupled from the original dataset, our approach suggests the potential for a foundational generative recommendation model. Experiments show that our framework consistently surpasses existing models in sequential recommendation under the standard experimental setting. Then, we explore the possibility of training a foundation recommendation model with the proposed method on data collected from 19 different datasets and test its recommendation performance on 6 unseen datasets across different platforms under a completely zero-shot setting. The results show that the zero-shot performance of the pre-trained foundation model is comparable to or even better than some traditional recommendation models based on supervised training, showing the potential of the IDGen paradigm serving as the foundation model for generative recommendation. Code and data are open-sourced at https://github.com/agiresearch/IDGenRec.
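The abstract's requirement that each item receive a unique, concise textual ID made of ordinary language tokens can be illustrated with a toy sketch. This is not the paper's method: IDGen trains an ID generator jointly with the recommender, whereas the code below swaps in a simple rare-token heuristic with a numeric suffix on collisions. The function names, the stopword list, and the sample catalog are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, minimal stopword list used only for this illustration.
STOPWORDS = {"a", "an", "the", "of", "and", "for", "with", "in", "on", "to"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def assign_textual_ids(items, max_tokens=3):
    """Map each item key to a unique, short textual ID built from its metadata.

    Toy heuristic (an assumption, not the learned generator from the paper):
    keep the item's rarest tokens across the catalog, then append a numeric
    suffix whenever the resulting ID is already taken.
    """
    doc_freq = Counter()
    tokens_per_item = {}
    for key, text in items.items():
        toks = tokenize(text)
        tokens_per_item[key] = toks
        doc_freq.update(set(toks))  # count each token once per item

    used = set()
    ids = {}
    for key, toks in tokens_per_item.items():
        unique_toks = list(dict.fromkeys(toks))  # dedupe, keep original order
        # Stable sort: rarest tokens first, ties keep order of appearance.
        ranked = sorted(unique_toks, key=lambda t: doc_freq[t])
        base = " ".join(ranked[:max_tokens]) or "item"
        candidate, n = base, 1
        while candidate in used:  # enforce uniqueness across the catalog
            n += 1
            candidate = f"{base} {n}"
        used.add(candidate)
        ids[key] = candidate
    return ids


# Hypothetical catalog: two items share identical metadata.
items = {
    "B001": "Wireless Bluetooth Headphones with Noise Cancelling",
    "B002": "Wireless Bluetooth Headphones with Noise Cancelling",
    "B003": "Stainless Steel Water Bottle",
}
ids = assign_textual_ids(items)
```

Because the resulting IDs are plain words rather than dataset-specific indices, a user history can be written out as ordinary text, which is the property the abstract relies on for cross-platform use.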
Related papers
- A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys) [57.30228361181045]
This survey connects key advancements in recommender systems using Generative Models (Gen-RecSys).
It covers: interaction-driven generative models; the use of large language models (LLM) and textual data for natural language recommendation; and the integration of multimodal models for generating and processing images/videos in RS.
Our work highlights necessary paradigms for evaluating the impact and harm of Gen-RecSys and identifies open challenges.
arXiv Detail & Related papers (2024-03-31T06:57:57Z) - LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking [10.671747198171136]
We propose a two-stage framework using large language models for ranking-based recommendation (LlamaRec).
In particular, we use small-scale sequential recommenders to retrieve candidates based on the user interaction history.
LlamaRec consistently achieves superior performance on benchmark datasets in both recommendation performance and efficiency.
arXiv Detail & Related papers (2023-10-25T06:23:48Z) - ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation [43.270424225285105]
We focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks.
We propose Retrieval-enhanced Large Language models (ReLLa) for recommendation tasks in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-08-22T02:25:04Z) - GenRec: Large Language Model for Generative Recommendation [41.22833600362077]
This paper presents an innovative approach to recommendation systems using large language models (LLMs) based on text data.
GenRec uses the LLM's understanding ability to interpret context, learn user preferences, and generate relevant recommendations.
Our research underscores the potential of LLM-based generative recommendation in revolutionizing the domain of recommendation systems.
arXiv Detail & Related papers (2023-07-02T02:37:07Z) - ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval [22.882301169283323]
We propose a retrieval-enhanced framework to create training data from a general-domain unlabeled corpus.
Experiments on nine datasets demonstrate that REGEN achieves 4.3% gain over the strongest baselines and saves around 70% of the time compared to baselines using large NLG models.
arXiv Detail & Related papers (2023-05-18T04:30:09Z) - Recommendation as Instruction Following: A Large Language Model Empowered Recommendation Approach [83.62750225073341]
We consider recommendation as instruction following by large language models (LLMs).
We first design a general instruction format for describing the preference, intention, task form and context of a user in natural language.
Then we manually design 39 instruction templates and automatically generate a large amount of user-personalized instruction data.
arXiv Detail & Related papers (2023-05-11T17:39:07Z) - How to Index Item IDs for Recommendation Foundation Models [49.425959632372425]
Recommendation foundation models utilize large language models (LLMs) for recommendation by converting recommendation tasks into natural language tasks.
To avoid generating excessively long text and hallucinated recommendations, creating LLM-compatible item IDs is essential.
We propose four simple yet effective solutions, including sequential indexing, collaborative indexing, semantic (content-based) indexing, and hybrid indexing.
arXiv Detail & Related papers (2023-05-11T05:02:37Z) - Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create a semantically meaningful tuple of codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z) - GEMv2: Multilingual NLG Benchmarking in a Single Line of Code [161.1761414080574]
Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers.
GEMv2 supports 40 documented datasets in 51 languages.
Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
arXiv Detail & Related papers (2022-06-22T17:52:30Z) - Finetuning Large-Scale Pre-trained Language Models for Conversational Recommendation with Knowledge Graph [35.033130888779226]
We present a pre-trained language model (PLM) based framework called RID for conversational recommender systems (CRS).
RID significantly outperforms the state-of-the-art methods on both evaluations of dialogue and recommendation.
arXiv Detail & Related papers (2021-10-14T15:49:48Z)