Related papers: SIDE: Semantic ID Embedding for effective learning from sequences

URL: http://arxiv.org/abs/2506.16698v1
Date: Fri, 20 Jun 2025 02:40:38 GMT
Title: SIDE: Semantic ID Embedding for effective learning from sequences
Authors: Dinesh Ramasamy, Shakti Kumar, Chris Cadonic, Jiaxin Yang, Sohini Roychowdhury, Esam Abdel Rhman, Srihari Reddy
Abstract summary: Sequence-based recommendation systems are driving the state of the art for industrial ad-recommendation systems. We propose a novel approach that leverages vector quantization (VQ) to inject a compact Semantic ID (SID) as input to the recommendation models. The proposed enhancements, when applied to a large-scale industrial ads-recommendation system, achieve a 2.4X improvement in normalized entropy (NE) gain and a 3X reduction in data footprint.
Score: 1.2145532233226686
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Sequence-based recommendation models are driving the state of the art for industrial ad-recommendation systems. Such systems typically deal with user histories or sequence lengths on the order of O(10^3) to O(10^4) events. While adding embeddings at this scale is manageable in pre-trained models, incorporating them into real-time prediction models is challenging due to both storage and inference costs. To address this scaling challenge, we propose a novel approach that leverages vector quantization (VQ) to inject a compact Semantic ID (SID) as input to the recommendation models instead of a collection of embeddings. Our method builds on recent work on SIDs by introducing three key innovations: (i) a multi-task VQ-VAE framework, called VQ fusion, that fuses multiple content embeddings and categorical predictions into a single Semantic ID; (ii) a parameter-free, highly granular SID-to-embedding conversion technique, called SIDE, that is validated with two content embedding collections, thereby eliminating the need for a large parameterized lookup table; and (iii) a novel quantization method called Discrete-PCA (DPCA), which generalizes and enhances residual quantization techniques. The proposed enhancements, when applied to a large-scale industrial ads-recommendation system, achieve a 2.4X improvement in normalized entropy (NE) gain and a 3X reduction in data footprint compared to traditional SID methods.
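
As an illustration of the residual quantization idea that the abstract says DPCA generalizes, the following is a minimal sketch of how an item embedding can be turned into a short tuple of codebook indices (a Semantic ID). It is not the paper's VQ fusion, SIDE, or DPCA method. The function name `residual_quantize`, the dimensions, and the randomly generated codebooks are all assumptions made for this example; a real system would learn the codebooks from data.

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Plain residual quantization: one code index per level.

    Returns the tuple of indices (the Semantic ID) and the
    reconstruction obtained by summing the chosen codewords.
    """
    residual = np.asarray(x, dtype=float).copy()
    recon = np.zeros_like(residual)
    codes = []
    for cb in codebooks:  # cb has shape (codebook_size, dim)
        # Pick the codeword closest to what is still unexplained.
        dists = np.sum((cb - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        codes.append(idx)
        recon += cb[idx]
        residual -= cb[idx]
    return tuple(codes), recon

# Hypothetical setup: 3 levels, 16 codewords per level, 8-dim embeddings.
rng = np.random.default_rng(0)
dim, levels, size = 8, 3, 16
# Random codebooks with shrinking scale stand in for learned ones.
codebooks = [rng.normal(scale=1.0 / 2**level, size=(size, dim))
             for level in range(levels)]
item_embedding = rng.normal(size=dim)

sid, approx = residual_quantize(item_embedding, codebooks)
print("Semantic ID:", sid)
print("Reconstruction error:", float(np.linalg.norm(item_embedding - approx)))
```

In this sketch the Semantic ID is just the tuple `sid`, which can be stored in place of the full embedding. Recovering an approximate embedding here means summing the selected codewords, which still requires keeping the codebooks; the abstract describes SIDE as a parameter-free conversion, which this example does not attempt to reproduce.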

Related papers

MoToRec: Sparse-Regularized Multimodal Tokenization for Cold-Start Recommendation [6.78317230214304]
We present Sparse-Regularized Multimodal Tokenization for Cold-Start Recommendation (MoToRec). MoToRec is a framework centered on a sparsely-regularized Residual Quantized Variational Autoencoder (RQ-VAE) that generates a compositional semantic code of discrete, interpretable tokens. Extensive experiments on three large-scale datasets demonstrate MoToRec's superiority over state-of-the-art methods in both overall and cold-start scenarios.
arXiv Detail & Related papers (2026-02-11T17:31:14Z)
GLASS: A Generative Recommender for Long-sequence Modeling via SID-Tier and Semantic Search [51.44490997013772]
GLASS is a novel framework that integrates long-term user interests into the generative process via SID-Tier and Semantic Search. We show that GLASS outperforms state-of-the-art baselines in experiments on two large-scale real-world datasets.
arXiv Detail & Related papers (2026-02-05T13:48:33Z)
Rethinking Generative Recommender Tokenizer: Recsys-Native Encoding and Semantic Quantization Beyond LLMs [17.944727019161878]
ReSID is a principled SID framework that approaches recommendation learning from the perspective of information preservation and sequential predictability. It consistently outperforms strong sequential and SID-based generative baselines by an average of over 10%, while reducing tokenization cost by up to 122x.
arXiv Detail & Related papers (2026-02-02T17:00:04Z)
Masked Diffusion Generative Recommendation [14.679550929790151]
Generative recommendation (GR) typically first quantizes continuous item embeddings into multi-level semantic IDs (SIDs). We propose MDGR, a Masked Diffusion Generative Recommendation framework that reshapes the GR pipeline from three perspectives: codebook, training, and inference.
arXiv Detail & Related papers (2026-01-27T11:39:02Z)
LLaDA-Rec: Discrete Diffusion for Parallel Semantic ID Generation in Generative Recommendation [32.284624021041004]
We propose LLaDA-Rec, a discrete diffusion framework that reformulates recommendation as parallel semantic ID generation. Experiments on three real-world datasets show that LLaDA-Rec consistently outperforms both ID-based and state-of-the-art generative recommenders.
arXiv Detail & Related papers (2025-11-09T07:12:15Z)
Representation Quantization for Collaborative Filtering Augmentation [49.14087936092634]
We propose a novel two-stage collaborative recommendation algorithm, DQRec. It augments features and homogeneous linkages by extracting behavior characteristics jointly from interaction sequences and attributes. By integrating these semantic ID patterns into the recommendation process through feature and linkage augmentation, the system enriches both latent and explicit user and item features.
arXiv Detail & Related papers (2025-08-15T04:00:50Z)
HiD-VAE: Interpretable Generative Recommendation via Hierarchical and Disentangled Semantic IDs [33.51075655987504]
HiD-VAE is a novel framework that learns hierarchically disentangled item representations through two core innovations. First, HiD-VAE pioneers a hierarchically-supervised quantization process that aligns discrete codes with multi-level item tags. Second, to combat representation entanglement, HiD-VAE incorporates a novel uniqueness loss that directly penalizes latent space overlap.
arXiv Detail & Related papers (2025-08-06T16:45:05Z)
SPARE: Single-Pass Annotation with Reference-Guided Evaluation for Automatic Process Supervision and Reward Modelling [70.01883340129204]
Single-Pass Annotation with Reference-Guided Evaluation (SPARE) is a novel structured framework that enables single-pass, per-step annotation by aligning each solution step to one or multiple steps in a reference solution, accompanied by explicit reasoning for evaluation. SPARE achieves competitive performance on challenging mathematical datasets while offering 2.6 times greater efficiency, requiring only 38% of the runtime.
arXiv Detail & Related papers (2025-06-18T14:37:59Z)
SPARKE: Scalable Prompt-Aware Diversity Guidance in Diffusion Models via RKE Score [16.00815718886712]
Diffusion models have demonstrated remarkable success in high-fidelity image synthesis and prompt-guided generative modeling. We propose the Scalable Prompt-Aware Rényi Kernel Entropy Diversity Guidance (SPARKE) method for prompt-aware diversity guidance. We numerically test the SPARKE method on several text-to-image diffusion models, demonstrating that the proposed method improves the prompt-aware diversity of the generated data without incurring significant computational costs.
arXiv Detail & Related papers (2025-06-11T20:53:45Z)
A Novel Mamba-based Sequential Recommendation Method [4.941272356564765]
Sequential recommendation (SR) encodes user activity to predict the next action. Transformer-based models have proven effective for sequential recommendation, but the complexity of the self-attention module in Transformers scales quadratically with the sequence length. We propose a novel multi-head latent Mamba architecture, which employs multiple low-dimensional Mamba layers and fully connected layers.
arXiv Detail & Related papers (2025-04-10T02:43:19Z)
BBQRec: Behavior-Bind Quantization for Multi-Modal Sequential Recommendation [15.818669767036592]
We propose a Behavior-Bind multi-modal Quantization for Sequential Recommendation (BBQRec) featuring dual-aligned quantization and semantics-aware sequence modeling. BBQRec disentangles modality-agnostic behavioral patterns from noisy modality-specific features through contrastive codebook learning. We design a discretized similarity reweighting mechanism that dynamically adjusts self-attention scores using quantized semantic relationships.
arXiv Detail & Related papers (2025-04-09T07:19:48Z)
Pre-train, Align, and Disentangle: Empowering Sequential Recommendation with Large Language Models [26.331324261505486]
Sequential Recommendation (SR) aims to leverage the sequential patterns in users' historical interactions to accurately track their preferences. Despite the proven effectiveness of large language models (LLMs), their integration into commercial recommender systems is impeded. We introduce a novel Pre-train, Align, and Disentangle (PAD) framework to enhance SR models with LLMs.
arXiv Detail & Related papers (2024-12-05T12:17:56Z)
Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior. We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture randomness and unpredictability of user behavior. Inspired by fuzzy information processing theory, this paper introduces the fuzzy sets of interaction sequences to overcome the limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z)
Learning Multi-Aspect Item Palette: A Semantic Tokenization Framework for Generative Recommendation [55.99632509895994]
We introduce LAMIA, a novel approach for multi-aspect semantic tokenization. Unlike RQ-VAE, which uses a single embedding, LAMIA learns an "item palette": a collection of independent and semantically parallel embeddings. Our results demonstrate significant improvements in recommendation accuracy over existing methods.
arXiv Detail & Related papers (2024-09-11T13:49:48Z)
Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes. Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models. Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models. We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples. We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings. We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data. We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.