Generative Recommendation with Semantic IDs: A Practitioner's Handbook
- URL: http://arxiv.org/abs/2507.22224v1
- Date: Tue, 29 Jul 2025 20:41:51 GMT
- Title: Generative Recommendation with Semantic IDs: A Practitioner's Handbook
- Authors: Clark Mingxuan Ju, Liam Collins, Leonardo Neves, Bhuvesh Kumar, Louis Yufeng Wang, Tong Zhao, Neil Shah,
- Abstract summary: Generative Recommendation (GR) has gained increasing attention for its promising performance compared to traditional models.<n>A key factor contributing to the success of GR is the semantic ID (SID), which converts continuous semantic representations into discrete ID sequences.<n>Our work introduces and open-sources a framework for Generative Recommendation with semantic ID, namely GRID, specifically designed for modularity.
- Score: 34.25784373770595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative recommendation (GR) has gained increasing attention for its promising performance compared to traditional models. A key factor contributing to the success of GR is the semantic ID (SID), which converts continuous semantic representations (e.g., from large language models) into discrete ID sequences. This enables GR models with SIDs to both incorporate semantic information and learn collaborative filtering signals, while retaining the benefits of discrete decoding. However, varied modeling techniques, hyper-parameters, and experimental setups in existing literature make direct comparisons between GR proposals challenging. Furthermore, the absence of an open-source, unified framework hinders systematic benchmarking and extension, slowing model iteration. To address this challenge, our work introduces and open-sources a framework for Generative Recommendation with semantic ID, namely GRID, specifically designed for modularity to facilitate easy component swapping and accelerate idea iteration. Using GRID, we systematically experiment with and ablate different components of GR models with SIDs on public benchmarks. Our comprehensive experiments with GRID reveal that many overlooked architectural components in GR models with SIDs substantially impact performance. This offers both novel insights and validates the utility of an open-source platform for robust benchmarking and GR research advancement. GRID is open-sourced at https://github.com/snap-research/GRID.
Related papers
- ImpRAG: Retrieval-Augmented Generation with Implicit Queries [49.510101132093396]
ImpRAG is a query-free RAG system that integrates retrieval and generation into a unified model.<n>We show that ImpRAG achieves 3.6-11.5 improvements in exact match scores on unseen tasks with diverse formats.
arXiv Detail & Related papers (2025-06-02T21:38:21Z) - UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities [53.76854299076118]
UniversalRAG is a novel RAG framework designed to retrieve and integrate knowledge from heterogeneous sources with diverse modalities and granularities.<n>We propose a modality-aware routing mechanism that dynamically identifies the most appropriate modality-specific corpus and performs targeted retrieval within it.<n>We validate UniversalRAG on 8 benchmarks spanning multiple modalities, showing its superiority over various modality-specific and unified baselines.
arXiv Detail & Related papers (2025-04-29T13:18:58Z) - Replication and Exploration of Generative Retrieval over Dynamic Corpora [87.09185685594105]
Generative retrieval (GR) has emerged as a promising paradigm in information retrieval (IR)<n>We show that existing GR models with numericittext-based docids show superior generalization to unseen documents.<n>We propose a novel multi-docid design that leverages both the efficiency of numeric-based docids and the effectiveness of text-based docids.
arXiv Detail & Related papers (2025-04-24T13:01:23Z) - SemCORE: A Semantic-Enhanced Generative Cross-Modal Retrieval Framework with MLLMs [70.79124435220695]
We propose a novel unified Semantic-enhanced generative Cross-mOdal REtrieval framework (SemCORE)<n>We first construct a Structured natural language IDentifier (SID) that effectively aligns target identifiers with generative models optimized for natural language comprehension and generation.<n>We then introduce a Generative Semantic Verification (GSV) strategy enabling fine-grained target discrimination.
arXiv Detail & Related papers (2025-04-17T17:59:27Z) - Beyond Unimodal Boundaries: Generative Recommendation with Multimodal Semantics [46.79459036259515]
We argue that this is a significant limitation given the rich, multimodal nature of real-world data.<n>We reveal that GR models are particularly sensitive to different modalities and examine the challenges in achieving effective GR.<n>We introduce MGR-LF++, an enhanced late fusion framework that employs contrastive modality alignment and special tokens to denote different modalities.
arXiv Detail & Related papers (2025-03-30T06:24:43Z) - TrustRAG: An Information Assistant with Retrieval Augmented Generation [73.84864898280719]
TrustRAG is a novel framework that enhances acRAG from three perspectives: indexing, retrieval, and generation.<n>We open-source the TrustRAG framework and provide a demonstration studio designed for excerpt-based question answering tasks.
arXiv Detail & Related papers (2025-02-19T13:45:27Z) - Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models [23.68266151581951]
Retrieval-Augmented Generation (RAG) has been shown to enhance the factual accuracy of Large Language Models (LLMs)
Existing methods often suffer from limited reasoning capabilities in effectively using the retrieved evidence.
We introduce a novel framework, Open-RAG, designed to enhance reasoning capabilities in RAG with open-source LLMs.
arXiv Detail & Related papers (2024-10-02T17:37:18Z) - FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research [70.6584488911715]
retrieval-augmented generation (RAG) has attracted considerable research attention.<n>Existing RAG toolkits are often heavy and inflexibly, failing to meet the customization needs of researchers.<n>Our toolkit has implemented 16 advanced RAG methods and gathered and organized 38 benchmark datasets.
arXiv Detail & Related papers (2024-05-22T12:12:40Z) - CorpusLM: Towards a Unified Language Model on Corpus for Knowledge-Intensive Tasks [20.390672895839757]
Retrieval-augmented generation (RAG) has emerged as a popular solution to enhance factual accuracy.
Traditional retrieval modules often rely on large document index and disconnect with generative tasks.
We propose textbfCorpusLM, a unified language model that integrates generative retrieval, closed-book generation, and RAG.
arXiv Detail & Related papers (2024-02-02T06:44:22Z) - Gait Recognition in the Wild: A Large-scale Benchmark and NAS-based
Baseline [95.88825497452716]
Gait benchmarks empower the research community to train and evaluate high-performance gait recognition systems.
GREW is the first large-scale dataset for gait recognition in the wild.
SPOSGait is the first NAS-based gait recognition model.
arXiv Detail & Related papers (2022-05-05T14:57:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.