IDGenRec: LLM-RecSys Alignment with Textual ID Learning
- URL: http://arxiv.org/abs/2403.19021v2
- Date: Fri, 17 May 2024 04:05:18 GMT
- Title: IDGenRec: LLM-RecSys Alignment with Textual ID Learning
- Authors: Juntao Tan, Shuyuan Xu, Wenyue Hua, Yingqiang Ge, Zelong Li, Yongfeng Zhang,
- Abstract summary: We propose IDGen, representing each item as a unique, concise, semantically rich, platform-agnostic textual ID.
We show that IDGen consistently surpasses existing models in sequential recommendation under standard experimental setting.
Results show that the zero-shot performance of the pre-trained foundation model is comparable to or even better than some traditional recommendation models.
- Score: 48.018397048791115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative recommendation based on Large Language Models (LLMs) have transformed the traditional ranking-based recommendation style into a text-to-text generation paradigm. However, in contrast to standard NLP tasks that inherently operate on human vocabulary, current research in generative recommendations struggles to effectively encode recommendation items within the text-to-text framework using concise yet meaningful ID representations. To better align LLMs with recommendation needs, we propose IDGen, representing each item as a unique, concise, semantically rich, platform-agnostic textual ID using human language tokens. This is achieved by training a textual ID generator alongside the LLM-based recommender, enabling seamless integration of personalized recommendations into natural language generation. Notably, as user history is expressed in natural language and decoupled from the original dataset, our approach suggests the potential for a foundational generative recommendation model. Experiments show that our framework consistently surpasses existing models in sequential recommendation under standard experimental setting. Then, we explore the possibility of training a foundation recommendation model with the proposed method on data collected from 19 different datasets and tested its recommendation performance on 6 unseen datasets across different platforms under a completely zero-shot setting. The results show that the zero-shot performance of the pre-trained foundation model is comparable to or even better than some traditional recommendation models based on supervised training, showing the potential of the IDGen paradigm serving as the foundation model for generative recommendation. Code and data are open-sourced at https://github.com/agiresearch/IDGenRec.
Related papers
- Beyond Retrieval: Generating Narratives in Conversational Recommender Systems [4.912663905306209]
We introduce a new dataset (REGEN) for natural language generation tasks in conversational recommendations.
We establish benchmarks using well-known generative metrics, and perform an automated evaluation of the new dataset using a rater LLM.
And to the best of our knowledge, represents the first attempt to analyze the capabilities of LLMs in understanding recommender signals and generating rich narratives.
arXiv Detail & Related papers (2024-10-22T07:53:41Z) - STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM [59.08493154172207]
We propose a unified framework to streamline the semantic tokenization and generative recommendation process.
We formulate semantic tokenization as a text-to-token task and generative recommendation as a token-to-token task, supplemented by a token-to-text reconstruction task and a text-to-token auxiliary task.
All these tasks are framed in a generative manner and trained using a single large language model (LLM) backbone.
arXiv Detail & Related papers (2024-09-11T13:49:48Z) - GenRec: Generative Sequential Recommendation with Large Language Models [4.381277509913139]
We propose a novel model named Generative Recommendation (GenRec)
GenRec is lightweight and requires only a few hours to train effectively in low-resource settings.
Our experiments have demonstrated that GenRec generalizes on various public real-world datasets.
arXiv Detail & Related papers (2024-07-30T20:58:36Z) - LlamaRec: Two-Stage Recommendation using Large Language Models for
Ranking [10.671747198171136]
We propose a two-stage framework using large language models for ranking-based recommendation (LlamaRec)
In particular, we use small-scale sequential recommenders to retrieve candidates based on the user interaction history.
LlamaRec consistently achieves datasets superior performance in both recommendation performance and efficiency.
arXiv Detail & Related papers (2023-10-25T06:23:48Z) - ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation [43.270424225285105]
We focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks.
We propose Retrieval-enhanced Large Language models (ReLLa) for recommendation tasks in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-08-22T02:25:04Z) - GenRec: Large Language Model for Generative Recommendation [41.22833600362077]
This paper presents an innovative approach to recommendation systems using large language models (LLMs) based on text data.
GenRec uses LLM's understanding ability to interpret context, learn user preferences, and generate relevant recommendation.
Our research underscores the potential of LLM-based generative recommendation in revolutionizing the domain of recommendation systems.
arXiv Detail & Related papers (2023-07-02T02:37:07Z) - Recommendation as Instruction Following: A Large Language Model
Empowered Recommendation Approach [83.62750225073341]
We consider recommendation as instruction following by large language models (LLMs)
We first design a general instruction format for describing the preference, intention, task form and context of a user in natural language.
Then we manually design 39 instruction templates and automatically generate a large amount of user-personalized instruction data.
arXiv Detail & Related papers (2023-05-11T17:39:07Z) - How to Index Item IDs for Recommendation Foundation Models [49.425959632372425]
Recommendation foundation model utilizes large language models (LLM) for recommendation by converting recommendation tasks into natural language tasks.
To avoid generating excessively long text and hallucinated recommendations, creating LLM-compatible item IDs is essential.
We propose four simple yet effective solutions, including sequential indexing, collaborative indexing, semantic (content-based) indexing, and hybrid indexing.
arXiv Detail & Related papers (2023-05-11T05:02:37Z) - Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create semantically meaningful of codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z) - GEMv2: Multilingual NLG Benchmarking in a Single Line of Code [161.1761414080574]
Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers.
GEMv2 supports 40 documented datasets in 51 languages.
Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
arXiv Detail & Related papers (2022-06-22T17:52:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.