Related papers: UNGER: Generative Recommendation with A Unified Code via Semantic and Collaborative Integration

URL: http://arxiv.org/abs/2502.06269v2
Date: Wed, 23 Jul 2025 08:15:44 GMT
Title: UNGER: Generative Recommendation with A Unified Code via Semantic and Collaborative Integration
Authors: Longtao Xiao, Haozhao Wang, Cheng Wang, Linfei Ji, Yifan Wang, Jieming Zhu, Zhenhua Dong, Rui Zhang, Ruixuan Li
Abstract summary: We propose a novel method, named UNGER, which integrates semantic and collaborative knowledge into a unified code for generative recommendation. To mitigate the information loss caused by the quantization process, we introduce an intra-modality knowledge distillation task.
Score: 36.48113842751375
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the rise of generative paradigms, generative recommendation has garnered increasing attention. The core component is the item code, generally derived by quantizing collaborative or semantic representations to serve as candidate item identifiers in the context. However, existing methods typically construct separate codes for each modality, leading to higher computational and storage costs and hindering the integration of their complementary strengths. Considering this limitation, we seek to integrate two different modalities into a unified code, fully unleashing the potential of the complementary nature of the modalities. Nevertheless, the integration remains challenging: the integrated embedding obtained by the common concatenation method would lead to underutilization of collaborative knowledge, thereby resulting in limited effectiveness. To address this, we propose a novel method, named UNGER, which integrates semantic and collaborative knowledge into a unified code for generative recommendation. Specifically, we propose to adaptively learn an integrated embedding through the joint optimization of cross-modality knowledge alignment and next item prediction tasks. Subsequently, to mitigate the information loss caused by the quantization process, we introduce an intra-modality knowledge distillation task, using the integrated embeddings as supervised signals to compensate. Extensive experiments on three widely used benchmarks demonstrate the superiority of our approach compared to existing methods.
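To make the idea of an "item code" concrete, the following is a minimal sketch of one common way to quantize an item embedding into a short tuple of discrete indices: residual quantization over a stack of codebooks. This is an illustrative assumption, not a description of UNGER itself; the abstract does not state which quantizer is used, and the function names (nearest, residual_quantize, reconstruct) and the tiny fixed codebooks are hypothetical. In practice the codebooks would be learned rather than written by hand.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def residual_quantize(vec, codebooks):
    """Map one embedding to a tuple of code indices, one per codebook level.

    Also returns the final residual, i.e. the information lost by quantizing.
    """
    residual = list(vec)
    code = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        code.append(idx)
        # Subtract the chosen entry so the next level encodes what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(code), residual


def reconstruct(code, codebooks):
    """Sum the selected entries to approximate the original embedding."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(code, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


if __name__ == "__main__":
    # Hypothetical hand-written codebooks (2 levels, 2 entries each, dim 2).
    codebooks = [
        [[0.0, 0.0], [1.0, 1.0]],
        [[0.0, 0.0], [0.5, 0.0]],
    ]
    item_embedding = [1.4, 1.0]
    code, residual = residual_quantize(item_embedding, codebooks)
    print("item code:", code)
    print("approximation:", reconstruct(code, codebooks))
    print("quantization residual:", residual)
```

The residual returned at the end is the part of the embedding that the code cannot represent, which is the kind of quantization loss the abstract refers to.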

Related papers

Synergistic Integration and Discrepancy Resolution of Contextualized Knowledge for Personalized Recommendation [16.83733237411492]
We introduce CoCo, an end-to-end framework that dynamically constructs user-specific contextual knowledge embeddings. Our method realizes profound integration of semantic and behavioral latent dimensions via adaptive knowledge fusion and contradiction resolution modules. With its modular design and model-agnostic architecture, CoCo provides a versatile solution for next-generation recommendation systems.
arXiv Detail & Related papers (2025-10-16T03:16:21Z)
A Simple Contrastive Framework Of Item Tokenization For Generative Recommendation [19.848402658341985]
We propose a novel unsupervised deep quantization based on contrastive learning, named SimCIT. SimCIT combines multi-modal knowledge alignment and semantic tokenization in a mutually beneficial contrastive learning framework.
arXiv Detail & Related papers (2025-06-20T01:54:32Z)
COHESION: Composite Graph Convolutional Network with Dual-Stage Fusion for Multimodal Recommendation [26.169114011402232]
Two key processes in multimodal recommendations are modality fusion and representation learning. We introduce a COmposite grapH convolutional nEtwork with dual-stage fuSION for the multimodal recommendation, named COHESION.
arXiv Detail & Related papers (2025-04-06T11:42:49Z)
Universal Item Tokenization for Transferable Generative Recommendation [89.42584009980676]
We propose UTGRec, a universal item tokenization approach for transferable Generative Recommendation. By devising tree-structured codebooks, we discretize content representations into corresponding codes for item tokenization. For raw content reconstruction, we employ dual lightweight decoders to reconstruct item text and images from discrete representations. For collaborative knowledge integration, we assume that co-occurring items are similar and integrate collaborative signals through co-occurrence alignment and reconstruction.
arXiv Detail & Related papers (2025-04-06T08:07:49Z)
Continual Cross-Modal Generalization [48.56694158680082]
Cross-modal generalization aims to learn a shared representation space from multimodal pairs. We propose a continual learning approach that incrementally maps new modalities into a shared codebook via a mediator modality. Experiments on image-text, audio-text, video-text, and speech-text show that our method achieves strong performance on various cross-modal generalization tasks.
arXiv Detail & Related papers (2025-04-01T09:16:20Z)
Bridging Textual-Collaborative Gap through Semantic Codes for Sequential Recommendation [91.13055384151897]
CoCoRec is a novel Code-based textual and Collaborative semantic fusion method for sequential Recommendation. We generate fine-grained semantic codes from multi-view text embeddings through vector quantization techniques. In order to further enhance the fusion of textual and collaborative semantics, we introduce an optimization strategy.
arXiv Detail & Related papers (2025-03-15T15:54:44Z)
Unity in Diversity: Multi-expert Knowledge Confrontation and Collaboration for Generalizable Vehicle Re-identification [60.20318058777603]
Generalizable vehicle re-identification (ReID) seeks to develop models that can adapt to unknown target domains without the need for fine-tuning or retraining. Previous works have mainly focused on extracting domain-invariant features by aligning data distributions between source domains. We propose a two-stage Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo) method to solve this unique problem.
arXiv Detail & Related papers (2024-07-10T04:06:39Z)
EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration [63.112790050749695]
We introduce EAGER, a novel generative recommendation framework that seamlessly integrates both behavioral and semantic information. We validate the effectiveness of EAGER on four public benchmarks, demonstrating its superior performance compared to existing methods.
arXiv Detail & Related papers (2024-06-20T06:21:56Z)
Concept Matching with Agent for Out-of-Distribution Detection [19.407364109506904]
We propose a new method that integrates the agent paradigm into the out-of-distribution (OOD) detection task. Our proposed method, Concept Matching with Agent (CMA), employs neutral prompts as agents to augment the CLIP-based OOD detection process. Our extensive experimental results showcase the superior performance of CMA over both zero-shot and training-required methods.
arXiv Detail & Related papers (2024-05-27T02:27:28Z)
Learnable Item Tokenization for Generative Recommendation [78.30417863309061]
We propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity. LETTER incorporates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias.
arXiv Detail & Related papers (2024-05-12T15:49:38Z)
LEARN: Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application [54.984348122105516]
We propose an Llm-driven knowlEdge Adaptive RecommeNdation (LEARN) framework that synergizes open-world knowledge with collaborative knowledge.
arXiv Detail & Related papers (2024-05-07T04:00:30Z)
Co-guiding for Multi-intent Spoken Language Understanding [53.30511968323911]
We propose a novel model termed Co-guiding Net, which implements a two-stage framework achieving the mutual guidances between the two tasks. For the first stage, we propose single-task supervised contrastive learning, and for the second stage, we propose co-guiding supervised contrastive learning. Experiment results on multi-intent SLU show that our model outperforms existing models by a large margin.
arXiv Detail & Related papers (2023-11-22T08:06:22Z)
WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation [37.53063869243558]
We build Win-win Cooperation (WiCo) to exploit the complementary nature of the two types of methods in both interaction and integration. With our WiCo, several prominent top-down and bottom-up combinations achieve remarkable improvements on three common datasets with reasonable extra costs.
arXiv Detail & Related papers (2023-06-19T07:49:29Z)
REKnow: Enhanced Knowledge for Joint Entity and Relation Extraction [30.829001748700637]
Relation extraction is a challenging task that aims to extract all hidden relational facts from the text. There is no unified framework that works well under various relation extraction settings. We propose a knowledge-enhanced generative model to mitigate these two issues. Our model achieves superior performance on multiple benchmarks and settings, including WebNLG, NYT10, and TACRED.
arXiv Detail & Related papers (2022-06-10T13:59:38Z)
Human-Centered Prior-Guided and Task-Dependent Multi-Task Representation Learning for Action Recognition Pre-Training [8.571437792425417]
We propose a novel action recognition pre-training framework, which exploits human-centered prior knowledge that generates more informative representation. Specifically, we distill knowledge from a human parsing model to enrich the semantic capability of representation. In addition, we combine knowledge distillation with contrastive learning to constitute a task-dependent multi-task framework.
arXiv Detail & Related papers (2022-04-27T06:51:31Z)
PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval [87.68667887072324]
We propose a novel approach that leverages query-centric and PAssage-centric sImilarity Relations (called PAIR) for dense passage retrieval. To implement our approach, we make three major technical contributions by introducing formal formulations of the two kinds of similarity relations. Our approach significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions datasets.
arXiv Detail & Related papers (2021-08-13T02:07:43Z)
