Related papers: EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration

URL: http://arxiv.org/abs/2406.14017v2
Date: Wed, 03 Jul 2024 10:00:26 GMT
Title: EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration
Authors: Ye Wang, Jiahao Xun, Minjie Hong, Jieming Zhu, Tao Jin, Wang Lin, Haoyuan Li, Linjun Li, Yan Xia, Zhou Zhao, Zhenhua Dong
Abstract summary: We introduce EAGER, a novel generative recommendation framework that seamlessly integrates both behavioral and semantic information. We validate the effectiveness of EAGER on four public benchmarks, demonstrating its superior performance compared to existing methods.
Score: 63.112790050749695
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative retrieval has recently emerged as a promising approach to sequential recommendation, framing candidate item retrieval as an autoregressive sequence generation problem. However, existing generative methods typically focus solely on either behavioral or semantic aspects of item information, neglecting their complementary nature and thus resulting in limited effectiveness. To address this limitation, we introduce EAGER, a novel generative recommendation framework that seamlessly integrates both behavioral and semantic information. Specifically, we identify three key challenges in combining these two types of information: a unified generative architecture capable of handling two feature types, ensuring sufficient and independent learning for each type, and fostering subtle interactions that enhance collaborative information utilization. To achieve these goals, we propose (1) a two-stream generation architecture leveraging a shared encoder and two separate decoders to decode behavior tokens and semantic tokens with a confidence-based ranking strategy; (2) a global contrastive task with summary tokens to achieve discriminative decoding for each type of information; and (3) a semantic-guided transfer task designed to implicitly promote cross-interactions through reconstruction and estimation objectives. We validate the effectiveness of EAGER on four public benchmarks, demonstrating its superior performance compared to existing methods.
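The abstract mentions a confidence-based ranking strategy over the outputs of the behavior and semantic decoders but does not spell out its details. The following is only a minimal sketch of one plausible reading, not the authors' implementation: it assumes each decoder returns beam-search candidates as (item ID, log-probability) pairs and that the log-probability serves as the confidence. The function name `merge_by_confidence`, the item IDs, and the scores are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: a candidate is an (item_id, log-probability) pair from beam search.
Candidate = Tuple[str, float]


def merge_by_confidence(
    behavior: List[Candidate], semantic: List[Candidate], k: int
) -> List[Candidate]:
    """Merge two candidate lists, keeping each item's highest confidence."""
    best: Dict[str, float] = {}
    for item_id, score in behavior + semantic:
        # An item proposed by both streams keeps its more confident score.
        if item_id not in best or score > best[item_id]:
            best[item_id] = score
    # Sort by descending confidence; break ties by item ID for determinism.
    ranked = sorted(best.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


# Hypothetical outputs of the two decoders for one user sequence.
behavior_beam = [("item_a", -0.2), ("item_b", -1.5)]
semantic_beam = [("item_b", -0.4), ("item_c", -0.9)]
top = merge_by_confidence(behavior_beam, semantic_beam, k=2)
print(top)
```

Taking the maximum score per item is one simple design choice; averaging or summing the two streams' scores would be equally easy to substitute.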

Related papers

LLaDA-Rec: Discrete Diffusion for Parallel Semantic ID Generation in Generative Recommendation [32.284624021041004]
We propose LLaDA-Rec, a discrete diffusion framework that reformulates recommendation as parallel semantic ID generation. Experiments on three real-world datasets show that LLaDA-Rec consistently outperforms both ID-based and state-of-the-art generative recommenders.
arXiv Detail & Related papers (2025-11-09T07:12:15Z)
DiscRec: Disentangled Semantic-Collaborative Modeling for Generative Recommendation [33.152693125551785]
Generative recommendation is emerging as a powerful paradigm that directly generates item predictions. Current methods face two key challenges: token-item misalignment and semantic-collaborative signal entanglement. We propose DiscRec, a novel framework that enables Disentangled Semantic-Collaborative signal modeling.
arXiv Detail & Related papers (2025-06-18T15:53:47Z)
Unifying Search and Recommendation: A Generative Paradigm Inspired by Information Theory [25.70711328738117]
GenSR is a novel generative paradigm for unifying search and recommendation. Our work introduces a new generative paradigm compared with previous discriminative methods.
arXiv Detail & Related papers (2025-04-09T09:15:37Z)
Unified Generative Search and Recommendation [14.317849340141919]
We introduce GenSAR, a unified generative framework for balanced search and recommendation. Our approach designs dual-purpose identifiers and tailored training strategies to incorporate complementary signals and align with task-specific objectives. Experiments on both public and commercial datasets demonstrate that GenSAR effectively reduces the trade-off and achieves state-of-the-art performance on both tasks.
arXiv Detail & Related papers (2025-04-08T07:03:08Z)
Universal Item Tokenization for Transferable Generative Recommendation [89.42584009980676]
We propose UTGRec, a universal item tokenization approach for transferable Generative Recommendation. By devising tree-structured codebooks, we discretize content representations into corresponding codes for item tokenization. For raw content reconstruction, we employ dual lightweight decoders to reconstruct item text and images from discrete representations. For collaborative knowledge integration, we assume that co-occurring items are similar and integrate collaborative signals through co-occurrence alignment and reconstruction.
arXiv Detail & Related papers (2025-04-06T08:07:49Z)
Progressive Collaborative and Semantic Knowledge Fusion for Generative Recommendation [36.48113842751375]
We propose a progressive collaborative and semantic knowledge fusion model for generative recommendation, named PRORec. In the first stage, we propose a cross-modality knowledge alignment task, which integrates semantic knowledge into collaborative embeddings. In the second stage, we propose an in-modality knowledge distillation task, designed to effectively capture and integrate knowledge from both semantic and collaborative modalities.
arXiv Detail & Related papers (2025-02-10T09:08:37Z)
Generative Retrieval Meets Multi-Graded Relevance [104.75244721442756]
We introduce a framework called GRaded Generative Retrieval (GR$^2$). GR$^2$ focuses on two key components: ensuring relevant and distinct identifiers, and implementing multi-graded constrained contrastive training. Experiments on datasets with both multi-graded and binary relevance demonstrate the effectiveness of GR$^2$.
arXiv Detail & Related papers (2024-09-27T02:55:53Z)
CART: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
Cross-modal retrieval aims to search for instances that are semantically related to the query through the interaction of different modal data. Traditional solutions utilize a single-tower or dual-tower framework to explicitly compute the score between queries and candidates. We propose a generative cross-modal retrieval framework (CART) based on coarse-to-fine semantic modeling.
arXiv Detail & Related papers (2024-06-25T12:47:04Z)
Learnable Item Tokenization for Generative Recommendation [78.30417863309061]
We propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity. LETTER incorporates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias.
arXiv Detail & Related papers (2024-05-12T15:49:38Z)
Feature Decoupling-Recycling Network for Fast Interactive Segmentation [79.22497777645806]
Recent interactive segmentation methods iteratively take the source image, user guidance and the previously predicted mask as input. We propose the Feature Decoupling-Recycling Network (FDRN), which decouples the modeling components based on their intrinsic discrepancies.
arXiv Detail & Related papers (2023-08-07T12:26:34Z)
RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection [32.20132357830726]
Relational Language-Image Pre-training (RLIP) is a strategy for contrastive pre-training that leverages both entity and relation descriptions. We show the benefits of these contributions, collectively termed RLIP-ParSe, for improved zero-shot, few-shot and fine-tuning HOI detection as well as increased robustness from noisy annotations.
arXiv Detail & Related papers (2022-09-05T07:50:54Z)
CTRN: Class-Temporal Relational Network for Action Detection [7.616556723260849]
We introduce an end-to-end network: Class-Temporal Relational Network (CTRN). CTRN contains three key components: the Transform Representation Module, the Class-Temporal Module and the G-classifier. We evaluate CTRN on three densely labelled datasets and achieve state-of-the-art performance.
arXiv Detail & Related papers (2021-10-26T08:15:47Z)
Joint Inductive and Transductive Learning for Video Object Segmentation [107.32760625159301]
Semi-supervised video object segmentation is the task of segmenting the target object in a video sequence given only a mask in the first frame. Most previous best-performing methods adopt matching-based transductive reasoning or online inductive learning. We propose to integrate transductive and inductive learning into a unified framework to exploit the complementarity between them for accurate and robust video object segmentation.
arXiv Detail & Related papers (2021-08-08T16:25:48Z)
Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for multi-stage, coarse-to-fine HOI understanding. At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network. With our carefully designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.