DiffGRM: Diffusion-based Generative Recommendation Model
- URL: http://arxiv.org/abs/2510.21805v1
- Date: Tue, 21 Oct 2025 03:23:32 GMT
- Title: DiffGRM: Diffusion-based Generative Recommendation Model
- Authors: Zhao Liu, Yichen Zhu, Yiqing Yang, Guoping Tang, Rui Huang, Qiang Luo, Xiao Lv, Ruiming Tang, Kun Gai, Guorui Zhou
- Abstract summary: Generative recommendation (GR) is an emerging paradigm that represents each item via a tokenizer as an n-digit semantic ID (SID). We propose DiffGRM, a diffusion-based GR model that replaces the autoregressive decoder with a masked discrete diffusion model (MDM). Experiments show consistent gains over strong generative and discriminative recommendation baselines on multiple datasets.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative recommendation (GR) is an emerging paradigm that represents each item via a tokenizer as an n-digit semantic ID (SID) and predicts the next item by autoregressively generating its SID conditioned on the user's history. However, two structural properties of SIDs make autoregressive models (ARMs) ill-suited. First, intra-item consistency: the n digits jointly specify one item, yet the left-to-right causality trains each digit only under its prefix and blocks bidirectional cross-digit evidence, collapsing supervision to a single causal path. Second, inter-digit heterogeneity: digits differ in semantic granularity and predictability, while the uniform next-token objective assigns equal weight to all digits, overtraining easy digits and undertraining hard digits. To address these two issues, we propose DiffGRM, a diffusion-based GR model that replaces the autoregressive decoder with a masked discrete diffusion model (MDM), thereby enabling bidirectional context and any-order parallel generation of SID digits for recommendation. Specifically, we tailor DiffGRM in three aspects: (1) tokenization with Parallel Semantic Encoding (PSE) to decouple digits and balance per-digit information; (2) training with On-policy Coherent Noising (OCN) that prioritizes uncertain digits via coherent masking to concentrate supervision on high-value signals; and (3) inference with Confidence-guided Parallel Denoising (CPD) that fills higher-confidence digits first and generates diverse Top-K candidates. Experiments show consistent gains over strong generative and discriminative recommendation baselines on multiple datasets, improving NDCG@10 by 6.9%-15.5%. Code is available at https://github.com/liuzhao09/DiffGRM.
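The digit-level mechanics in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the scorer, the masking schedule, and all function names below are illustrative assumptions. It only shows the generic pattern of masked-diffusion training input (mask a subset of SID digits and supervise the masked positions) and of confidence-guided parallel denoising (repeatedly fill the most confident still-masked digits first).

```python
import numpy as np

MASK = -1  # sentinel for a masked SID digit

def make_training_example(sid, rng):
    """Generic masked-diffusion corruption (illustrative; not the paper's OCN):
    sample a mask ratio t, mask each digit independently with probability t,
    and supervise the model on the masked positions only."""
    t = rng.uniform(0.1, 1.0)
    mask = rng.random(len(sid)) < t
    corrupted = np.where(mask, MASK, sid)
    return corrupted, mask  # model predicts sid[mask] from corrupted

def cpd_decode(score_fn, n_digits, digits_per_step=2):
    """Confidence-guided parallel denoising (sketch): at each step, fill the
    still-masked digits whose top predicted probability is highest."""
    sid = np.full(n_digits, MASK, dtype=int)
    while (sid == MASK).any():
        probs = score_fn(sid)            # (n_digits, vocab) probabilities
        conf = probs.max(axis=1)         # per-digit confidence
        conf[sid != MASK] = -np.inf      # only consider masked digits
        k = min(digits_per_step, int((sid == MASK).sum()))
        pick = np.argsort(conf)[-k:]     # most confident masked positions
        sid[pick] = probs[pick].argmax(axis=1)
    return sid

# Toy stand-in for a trained MDM: per-digit distributions that vary with
# how many digits are already filled in (purely illustrative).
def toy_scorer(sid, n_digits=4, vocab=8):
    rng = np.random.default_rng(int((sid != MASK).sum()))
    logits = rng.standard_normal((n_digits, vocab))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```

With n_digits=4 and digits_per_step=2, decoding finishes in two parallel steps instead of four sequential ones; diverse Top-K candidates would be obtained by branching on the top digit values rather than taking only the argmax at each step, as the abstract describes for CPD.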
Related papers
- Adaptation to Intrinsic Dependence in Diffusion Language Models [5.185131234265025]
Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) approaches. We introduce a distribution-agnostic unmasking schedule for DLMs that adapts to the (unknown) dependence structure of the target data distribution. Our results significantly improve upon prior convergence theories and yield substantial sampling acceleration for low-complexity distributions.
arXiv Detail & Related papers (2026-02-23T18:41:34Z) - Masked Diffusion Generative Recommendation [14.679550929790151]
Generative recommendation (GR) typically first quantizes continuous item embeddings into multi-level semantic IDs (SIDs). We propose MDGR, a Masked Diffusion Generative Recommendation framework that reshapes the GR pipeline from three perspectives: codebook, training, and inference.
arXiv Detail & Related papers (2026-01-27T11:39:02Z) - The Best of the Two Worlds: Harmonizing Semantic and Hash IDs for Sequential Recommendation [51.62815306481903]
We propose a novel framework that harmonizes the SID and HID. Specifically, we devise a dual-branch modeling architecture that enables the model to capture the multi-granular semantics within SIDs while preserving the unique collaborative identity of HIDs. Experiments on three real-world datasets show that the proposed framework balances recommendation quality for both head and tail items while surpassing existing baselines.
arXiv Detail & Related papers (2025-12-11T07:50:53Z) - Masked Diffusion for Generative Recommendation [30.8737219110446]
Generative recommendation (GR) with semantic IDs (SIDs) has emerged as a promising alternative to traditional recommendation approaches. We propose to instead model and learn the probability of a user's sequence of SIDs using masked diffusion. We demonstrate through thorough experiments that our proposed method consistently outperforms autoregressive modeling.
arXiv Detail & Related papers (2025-11-28T09:36:26Z) - DiffuGR: Generative Document Retrieval with Diffusion Language Models [80.78126312115087]
We propose generative document retrieval with diffusion language models, dubbed DiffuGR. For inference, DiffuGR attempts to generate DocID tokens in parallel and refine them through a controllable number of denoising steps. In contrast to conventional left-to-right autoregressive decoding, DiffuGR provides a novel mechanism to first generate more confident DocID tokens.
arXiv Detail & Related papers (2025-11-11T12:00:09Z) - LLaDA-Rec: Discrete Diffusion for Parallel Semantic ID Generation in Generative Recommendation [32.284624021041004]
We propose LLaDA-Rec, a discrete diffusion framework that reformulates recommendation as parallel semantic ID generation. Experiments on three real-world datasets show that LLaDA-Rec consistently outperforms both ID-based and state-of-the-art generative recommenders.
arXiv Detail & Related papers (2025-11-09T07:12:15Z) - Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing [4.707859580472452]
Masked diffusion models (MDMs) offer a compelling alternative to autoregressive models (ARMs) for discrete text generation. They enable parallel token sampling rather than sequential, left-to-right generation. We present PUNT, a model-agnostic sampler that reconciles this trade-off.
arXiv Detail & Related papers (2025-10-24T18:41:26Z) - Exploiting Discriminative Codebook Prior for Autoregressive Image Generation [54.14166700058777]
Token-based autoregressive image generation systems first tokenize images into sequences of token indices with a codebook, and then model these sequences in an autoregressive paradigm. While autoregressive generative models are trained only on index values, the prior encoded in the codebook, which contains rich token-similarity information, is not exploited. Recent studies have attempted to incorporate this prior by performing naive k-means clustering on the tokens, helping to facilitate the training of generative models with a reduced codebook. We propose the Discriminative Codebook Prior Extractor (DCPE) as an alternative to k-means.
arXiv Detail & Related papers (2025-08-14T15:00:00Z) - Generating Long Semantic IDs in Parallel for Recommendation [29.97624755406803]
We propose RPG, a lightweight framework for semantic ID-based recommendation. We train the model to predict each token independently using a multi-token prediction loss. Experiments show that scaling up the semantic ID length to 64 enables RPG to outperform generative baselines.
arXiv Detail & Related papers (2025-06-06T06:20:37Z) - Adaptive Multi-Order Graph Regularized NMF with Dual Sparsity for Hyperspectral Unmixing [14.732511023726715]
We propose a novel adaptive multi-order graph regularized NMF method (MOGNMF) with three key features. Experiments on simulated and real hyperspectral data indicate that the proposed method delivers better unmixing results.
arXiv Detail & Related papers (2025-03-25T01:44:02Z) - Distinguished Quantized Guidance for Diffusion-based Sequence Recommendation [7.6572888950554905]
We propose Distinguished Quantized Guidance for Diffusion-based Sequence Recommendation (DiQDiff). DiQDiff aims to extract robust guidance to understand user interests and generate distinguished items for personalized user interests within DMs. The superior recommendation performance of DiQDiff against leading approaches demonstrates its effectiveness in sequential recommendation tasks.
arXiv Detail & Related papers (2025-01-29T14:20:42Z) - GEC-DePenD: Non-Autoregressive Grammatical Error Correction with
Decoupled Permutation and Decoding [52.14832976759585]
Grammatical error correction (GEC) is an important NLP task that is usually solved with autoregressive sequence-to-sequence models.
We propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network and a decoding network.
We show that the resulting network improves over previously known non-autoregressive methods for GEC.
arXiv Detail & Related papers (2023-11-14T14:24:36Z) - Enhancing Few-shot NER with Prompt Ordering based Data Augmentation [59.69108119752584]
We propose a Prompt Ordering based Data Augmentation (PODA) method to improve the training of unified autoregressive generation frameworks.
Experimental results on three public NER datasets and further analyses demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-05-19T16:25:43Z) - Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.