A Simple Contrastive Framework Of Item Tokenization For Generative Recommendation
- URL: http://arxiv.org/abs/2506.16683v1
- Date: Fri, 20 Jun 2025 01:54:32 GMT
- Title: A Simple Contrastive Framework Of Item Tokenization For Generative Recommendation
- Authors: Penglong Zhai, Yifang Yuan, Fanyi Di, Jie Li, Yue Liu, Chen Li, Jie Huang, Sicong Wang, Yao Xu, Xin Li
- Abstract summary: We propose a novel unsupervised deep quantization based on contrastive learning, named SimCIT. SimCIT combines multi-modal knowledge alignment and semantic tokenization in a mutually beneficial contrastive learning framework.
- Score: 19.848402658341985
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative retrieval-based recommendation has emerged as a promising paradigm aiming at directly generating the identifiers of the target candidates. However, in large-scale recommendation systems, this approach becomes increasingly cumbersome due to the redundancy and sheer scale of the token space. To overcome these limitations, recent research has explored the use of semantic tokens as an alternative to ID tokens. These methods typically leverage reconstruction-based strategies, like RQ-VAE, to quantize content embeddings and significantly reduce the embedding size. However, reconstructive quantization aims for the precise reconstruction of each item embedding independently, which conflicts with the goal of generative retrieval tasks, which focus more on differentiating among items. Moreover, multi-modal side information of items, such as descriptive text, images, and geographical knowledge in location-based recommendation services, has been shown to be effective in improving recommendations by providing richer contexts for interactions. Nevertheless, effectively integrating such complementary knowledge into existing generative recommendation frameworks remains challenging. To overcome these challenges, we propose a novel unsupervised deep quantization exclusively based on contrastive learning, named SimCIT (a Simple Contrastive Item Tokenization framework). Specifically, unlike existing reconstruction-based strategies, SimCIT uses a learnable residual quantization module to align with the signals from different modalities of the items, combining multi-modal knowledge alignment and semantic tokenization in a mutually beneficial contrastive learning framework. Extensive experiments across public datasets and a large-scale industrial dataset from various domains demonstrate SimCIT's effectiveness in LLM-based generative recommendation.
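The two ingredients named in the abstract, residual quantization and a contrastive objective across modalities, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify SimCIT's architecture or loss, so the fixed toy codebooks, the hard nearest-codeword assignment, the InfoNCE form of the loss, and all function names (`residual_quantize`, `info_nce`) are assumptions made for illustration only.

```python
import math

def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], vec)))

def residual_quantize(vec, codebooks):
    """Quantize vec level by level; each level encodes the previous residual.

    Returns (token ids, quantized vector). Codebooks are fixed here; in a
    learnable module they would be trained parameters (assumption).
    """
    residual = list(vec)
    quantized = [0.0] * len(vec)
    tokens = []
    for codebook in codebooks:
        k = nearest(codebook, residual)
        tokens.append(k)
        quantized = [q + c for q, c in zip(quantized, codebook[k])]
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return tokens, quantized

def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: positives[i] matches anchors[i]; the other
    entries of positives act as in-batch negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

# Toy data (hypothetical): two 2-D items, two quantization levels.
codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0]],
]
tokens_a, quantized_a = residual_quantize([1.05, 0.02], codebooks)
tokens_b, quantized_b = residual_quantize([0.03, 1.10], codebooks)

# Embeddings of the same two items from a second modality (hypothetical).
other_modality = [[0.9, 0.1], [0.1, 0.95]]
aligned_loss = info_nce([quantized_a, quantized_b], other_modality)
swapped_loss = info_nce([quantized_b, quantized_a], other_modality)
```

In this toy setup the two items receive distinct token sequences, and the loss is lower when each quantized item is paired with its own second-modality embedding than when the pairing is swapped. That is the sense in which a contrastive objective rewards distinguishing items rather than reconstructing each embedding independently.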
Related papers
- From Sparse Decisions to Dense Reasoning: A Multi-attribute Trajectory Paradigm for Multimodal Moderation [59.27094165576015]
We propose a novel learning paradigm (UniMod) that transitions from sparse decision-making to dense reasoning traces.
By constructing structured trajectories encompassing evidence grounding, modality assessment, risk mapping, policy decision, and response generation, we reformulate monolithic decision tasks into a multi-dimensional boundary learning process.
We introduce specialized optimization strategies to decouple task-specific parameters and rebalance training dynamics, effectively resolving interference between diverse objectives in multi-task learning.
arXiv Detail & Related papers (2026-01-28T09:29:40Z) - Multi-hop Reasoning via Early Knowledge Alignment [68.28168992785896]
Early Knowledge Alignment (EKA) aims to align Large Language Models with contextually relevant retrieved knowledge.
EKA significantly improves retrieval precision, reduces cascading errors, and enhances both performance and efficiency.
EKA proves effective as a versatile, training-free inference strategy that scales seamlessly to large models.
arXiv Detail & Related papers (2025-12-23T08:14:44Z) - Multi-Aspect Cross-modal Quantization for Generative Recommendation [27.92632297542123]
We propose Multi-Aspect Cross-modal quantization for generative Recommendation (MACRec).
We first introduce cross-modal quantization during the ID learning process, which effectively reduces conflict rates.
We also incorporate multi-aspect cross-modal alignments, including the implicit and explicit alignments.
arXiv Detail & Related papers (2025-11-19T04:55:14Z) - Revisiting scalable sequential recommendation with Multi-Embedding Approach and Mixture-of-Experts [15.976682531132676]
We propose Fuxi-MME, a framework that integrates a multi-embedding strategy with a Mixture-of-Experts (MoE) architecture.
Specifically, to efficiently capture diverse item characteristics in a decoupled manner, we decompose the conventional single embedding matrix into several lower-dimensional embedding matrices.
arXiv Detail & Related papers (2025-10-29T08:42:15Z) - Towards Context-aware Reasoning-enhanced Generative Searching in E-commerce [61.03081096959132]
We propose a context-aware reasoning-enhanced generative search framework for better understanding the complicated context.
Our approach achieves superior performance compared with strong baselines, validating its effectiveness for search-based recommendation.
arXiv Detail & Related papers (2025-10-19T16:46:11Z) - Multimodal Representation-disentangled Information Bottleneck for Multimodal Recommendation [36.338586087343806]
We propose a novel framework, the Multimodal Representation-disentangled Information Bottleneck (MRdIB).
Concretely, we first employ a Multimodal Information Bottleneck to compress the input representations.
Then, we decompose the information based on its relationship with the recommendation target into unique, redundant, and synergistic components.
arXiv Detail & Related papers (2025-09-24T15:18:32Z) - Breaking the Clusters: Uniformity-Optimization for Text-Based Sequential Recommendation [17.042627742322427]
Traditional sequential recommendation methods rely on explicit item IDs to capture user preferences over time.
Recent studies have shifted towards leveraging text-only information for recommendation.
We propose UniT, a framework that employs three pairwise item sampling strategies.
arXiv Detail & Related papers (2025-02-19T08:35:28Z) - Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation [51.06031200728449]
We propose a novel framework called mccHRL to provide different levels of temporal abstraction on listwise recommendation.
Within the hierarchical framework, the high-level agent studies the evolution of user perception, while the low-level agent produces the item selection policy.
Results show a significant performance improvement for our method over several well-known baselines.
arXiv Detail & Related papers (2024-09-11T17:01:06Z) - Learning Multi-Aspect Item Palette: A Semantic Tokenization Framework for Generative Recommendation [55.99632509895994]
We introduce LAMIA, a novel approach for multi-aspect semantic tokenization.
Unlike RQ-VAE, which uses a single embedding, LAMIA learns an "item palette", a collection of independent and semantically parallel embeddings.
Our results demonstrate significant improvements in recommendation accuracy over existing methods.
arXiv Detail & Related papers (2024-09-11T13:49:48Z) - X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation [47.96737683498274]
Large Language Models (LLMs) and Large Multimodal Models (LMMs) have been shown to enhance the effectiveness of enriching item descriptions.
This paper introduces a novel framework, Cross-Reflection Prompting, termed X-Reflect, to address limitations by prompting LMMs to explicitly identify and reconcile supportive and conflicting information between text and images.
arXiv Detail & Related papers (2024-08-27T16:10:21Z) - MMREC: LLM Based Multi-Modal Recommender System [2.3113916776957635]
This paper presents a novel approach to enhancing recommender systems by leveraging Large Language Models (LLMs) and deep learning techniques.
The proposed framework aims to improve the accuracy and relevance of recommendations by incorporating multi-modal information processing and by the use of unified latent space representation.
arXiv Detail & Related papers (2024-08-08T04:31:29Z) - LLM4Rerank: LLM-based Auto-Reranking Framework for Recommendations [51.76373105981212]
Reranking is a critical component in recommender systems, playing an essential role in refining the output of recommendation algorithms.
We introduce a comprehensive reranking framework, designed to seamlessly integrate various reranking criteria.
A customizable input mechanism is also integrated, enabling the tuning of the language model's focus to meet specific reranking needs.
arXiv Detail & Related papers (2024-06-18T09:29:18Z) - TokenRec: Learning to Tokenize ID for LLM-based Generative Recommendation [16.93374578679005]
TokenRec is a novel tokenization and retrieval framework for large language model (LLM)-based Recommender Systems (RecSys).
Our strategy, Masked Vector-Quantized (MQ) Tokenizer, quantizes the masked user/item representations learned from collaborative filtering into discrete tokens.
Our generative retrieval paradigm is designed to efficiently recommend top-$K$ items for users to eliminate the need for auto-regressive decoding and beam search processes.
arXiv Detail & Related papers (2024-06-15T00:07:44Z) - Learnable Item Tokenization for Generative Recommendation [78.30417863309061]
We propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity.
LETTER incorporates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias.
arXiv Detail & Related papers (2024-05-12T15:49:38Z) - Active Refinement for Multi-Label Learning: A Pseudo-Label Approach [84.52793080276048]
Multi-label learning (MLL) aims to associate a given instance with its relevant labels from a set of concepts.
Previous works of MLL mainly focused on the setting where the concept set is assumed to be fixed.
Many real-world applications require introducing new concepts into the set to meet new demands.
arXiv Detail & Related papers (2021-09-29T19:17:05Z)