A Simple Contrastive Framework Of Item Tokenization For Generative Recommendation
- URL: http://arxiv.org/abs/2506.16683v1
- Date: Fri, 20 Jun 2025 01:54:32 GMT
- Title: A Simple Contrastive Framework Of Item Tokenization For Generative Recommendation
- Authors: Penglong Zhai, Yifang Yuan, Fanyi Di, Jie Li, Yue Liu, Chen Li, Jie Huang, Sicong Wang, Yao Xu, Xin Li,
- Abstract summary: We propose a novel unsupervised deep quantization based on contrastive learning, named SimCIT.<n>SimCIT combines multi-modal knowledge alignment and semantic tokenization in a mutually beneficial contrastive learning framework.
- Score: 19.848402658341985
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative retrieval-based recommendation has emerged as a promising paradigm aiming at directly generating the identifiers of the target candidates. However, in large-scale recommendation systems, this approach becomes increasingly cumbersome due to the redundancy and sheer scale of the token space. To overcome these limitations, recent research has explored the use of semantic tokens as an alternative to ID tokens, which typically leveraged reconstruction-based strategies, like RQ-VAE, to quantize content embeddings and significantly reduce the embedding size. However, reconstructive quantization aims for the precise reconstruction of each item embedding independently, which conflicts with the goal of generative retrieval tasks focusing more on differentiating among items. Moreover, multi-modal side information of items, such as descriptive text and images, geographical knowledge in location-based recommendation services, has been shown to be effective in improving recommendations by providing richer contexts for interactions. Nevertheless, effectively integrating such complementary knowledge into existing generative recommendation frameworks remains challenging. To overcome these challenges, we propose a novel unsupervised deep quantization exclusively based on contrastive learning, named SimCIT (a Simple Contrastive Item Tokenization framework). Specifically, different from existing reconstruction-based strategies, SimCIT propose to use a learnable residual quantization module to align with the signals from different modalities of the items, which combines multi-modal knowledge alignment and semantic tokenization in a mutually beneficial contrastive learning framework. Extensive experiments across public datasets and a large-scale industrial dataset from various domains demonstrate SimCIT's effectiveness in LLM-based generative recommendation.
Related papers
- Breaking the Clusters: Uniformity-Optimization for Text-Based Sequential Recommendation [17.042627742322427]
Traditional sequential recommendation methods rely on explicit item IDs to capture user preferences over time.<n>Recent studies have shifted towards leveraging text-only information for recommendation.<n>We propose UniT, a framework that employs three pairwise item sampling strategies.
arXiv Detail & Related papers (2025-02-19T08:35:28Z) - Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation [51.06031200728449]
We propose a novel framework called mccHRL to provide different levels of temporal abstraction on listwise recommendation.
Within the hierarchical framework, the high-level agent studies the evolution of user perception, while the low-level agent produces the item selection policy.
Results observe significant performance improvement by our method, compared with several well-known baselines.
arXiv Detail & Related papers (2024-09-11T17:01:06Z) - Learning Multi-Aspect Item Palette: A Semantic Tokenization Framework for Generative Recommendation [55.99632509895994]
We introduce LAMIA, a novel approach for multi-aspect semantic tokenization.<n>Unlike RQ-VAE, which uses a single embedding, LAMIA learns an item palette''--a collection of independent and semantically parallel embeddings.<n>Our results demonstrate significant improvements in recommendation accuracy over existing methods.
arXiv Detail & Related papers (2024-09-11T13:49:48Z) - X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation [47.96737683498274]
Large Language Models (LLMs) and Large Multimodal Models (LMMs) have been shown to enhance the effectiveness of enriching item descriptions.
This paper introduces a novel framework, Cross-Reflection Prompting, termed X-Reflect, to address limitations by prompting LMMs to explicitly identify and reconcile supportive and conflicting information between text and images.
arXiv Detail & Related papers (2024-08-27T16:10:21Z) - MMREC: LLM Based Multi-Modal Recommender System [2.3113916776957635]
This paper presents a novel approach to enhancing recommender systems by leveraging Large Language Models (LLMs) and deep learning techniques.<n>The proposed framework aims to improve the accuracy and relevance of recommendations by incorporating multi-modal information processing and by the use of unified latent space representation.
arXiv Detail & Related papers (2024-08-08T04:31:29Z) - LLM4Rerank: LLM-based Auto-Reranking Framework for Recommendations [51.76373105981212]
Reranking is a critical component in recommender systems, playing an essential role in refining the output of recommendation algorithms.<n>We introduce a comprehensive reranking framework, designed to seamlessly integrate various reranking criteria.<n>A customizable input mechanism is also integrated, enabling the tuning of the language model's focus to meet specific reranking needs.
arXiv Detail & Related papers (2024-06-18T09:29:18Z) - TokenRec: Learning to Tokenize ID for LLM-based Generative Recommendation [16.93374578679005]
TokenRec is a novel framework for tokenizing and retrieving large-scale language models (LLMs) based Recommender Systems (RecSys)
Our strategy, Masked Vector-Quantized (MQ) Tokenizer, quantizes the masked user/item representations learned from collaborative filtering into discrete tokens.
Our generative retrieval paradigm is designed to efficiently recommend top-$K$ items for users to eliminate the need for auto-regressive decoding and beam search processes.
arXiv Detail & Related papers (2024-06-15T00:07:44Z) - Learnable Item Tokenization for Generative Recommendation [78.30417863309061]
We propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity.
LETTER incorporates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias.
arXiv Detail & Related papers (2024-05-12T15:49:38Z) - Active Refinement for Multi-Label Learning: A Pseudo-Label Approach [84.52793080276048]
Multi-label learning (MLL) aims to associate a given instance with its relevant labels from a set of concepts.
Previous works of MLL mainly focused on the setting where the concept set is assumed to be fixed.
Many real-world applications require introducing new concepts into the set to meet new demands.
arXiv Detail & Related papers (2021-09-29T19:17:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.