Enhancing Embedding Representation Stability in Recommendation Systems with Semantic ID
- URL: http://arxiv.org/abs/2504.02137v1
- Date: Wed, 02 Apr 2025 21:28:38 GMT
- Title: Enhancing Embedding Representation Stability in Recommendation Systems with Semantic ID
- Authors: Carolina Zheng, Minhui Huang, Dmitrii Pedchenko, Kaushik Rangadurai, Siyu Wang, Gaby Nahum, Jie Lei, Yang Yang, Tao Liu, Zutian Luo, Xiaohan Wei, Dinesh Ramasamy, Jiyan Yang, Yiping Han, Lin Yang, Hangjun Xu, Rong Jin, Shuang Yang,
- Abstract summary: This paper introduces Semantic ID prefix ngram, a novel token parameterization technique that significantly improves the performance of the original Semantic ID.<n>We show that Semantic ID prefix ngram not only addresses embedding instability but also significantly improves tail id modeling, reduces overfitting, and mitigates representation shifts.<n>We also report our experience of integrating Semantic ID into Meta production Ads Ranking system, leading to notable performance gains and enhanced prediction stability in live deployments.
- Score: 29.560370412849874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The exponential growth of online content has posed significant challenges to ID-based models in industrial recommendation systems, ranging from extremely high cardinality and dynamically growing ID space, to highly skewed engagement distributions, to prediction instability as a result of natural id life cycles (e.g, the birth of new IDs and retirement of old IDs). To address these issues, many systems rely on random hashing to handle the id space and control the corresponding model parameters (i.e embedding table). However, this approach introduces data pollution from multiple ids sharing the same embedding, leading to degraded model performance and embedding representation instability. This paper examines these challenges and introduces Semantic ID prefix ngram, a novel token parameterization technique that significantly improves the performance of the original Semantic ID. Semantic ID prefix ngram creates semantically meaningful collisions by hierarchically clustering items based on their content embeddings, as opposed to random assignments. Through extensive experimentation, we demonstrate that Semantic ID prefix ngram not only addresses embedding instability but also significantly improves tail id modeling, reduces overfitting, and mitigates representation shifts. We further highlight the advantages of Semantic ID prefix ngram in attention-based models that contextualize user histories, showing substantial performance improvements. We also report our experience of integrating Semantic ID into Meta production Ads Ranking system, leading to notable performance gains and enhanced prediction stability in live deployments.
Related papers
- ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models [49.09606704563898]
Person re-identification is a crucial task in computer vision, aiming to recognize individuals across non-overlapping camera views.
We propose a novel framework ChatReID, that shifts the focus towards a text-side-dominated retrieval paradigm, enabling flexible and interactive re-identification.
We introduce a hierarchical progressive tuning strategy, which endows Re-ID ability through three stages of tuning, i.e., from person attribute understanding to fine-grained image retrieval and to multi-modal task reasoning.
arXiv Detail & Related papers (2025-02-27T10:34:14Z) - Order-agnostic Identifier for Large Language Model-based Generative Recommendation [94.37662915542603]
Items are assigned identifiers for Large Language Models (LLMs) to encode user history and generate the next item.<n>Existing approaches leverage either token-sequence identifiers, representing items as discrete token sequences, or single-token identifiers, using ID or semantic embeddings.<n>We propose SETRec, which leverages semantic tokenizers to obtain order-agnostic multi-dimensional tokens.
arXiv Detail & Related papers (2025-02-15T15:25:38Z) - Break the ID-Language Barrier: An Adaption Framework for Sequential Recommendation [10.305878081909743]
We propose IDLE-Adapter, a framework that integrates pre-trained ID embeddings, rich in domain-specific knowledge, into large language models.<n>IDLE-Adapter acts as a bridge, transforming sparse user-item interaction data into dense, LLM-compatible representations.
arXiv Detail & Related papers (2024-11-27T11:59:44Z) - Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior.
We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture randomness and unpredictability of user behavior.
Inspired by fuzzy information processing theory, this paper introduces the fuzzy sets of interaction sequences to overcome the limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z) - Synthesizing Efficient Data with Diffusion Models for Person Re-Identification Pre-Training [51.87027943520492]
We present a novel paradigm Diffusion-ReID to efficiently augment and generate diverse images based on known identities.
Benefiting from our proposed paradigm, we first create a new large-scale person Re-ID dataset Diff-Person, which consists of over 777K images from 5,183 identities.
arXiv Detail & Related papers (2024-06-10T06:26:03Z) - ASI++: Towards Distributionally Balanced End-to-End Generative Retrieval [29.65717446547002]
ASI++ is a novel fully end-to-end generative retrieval method.
It aims to simultaneously learn balanced ID assignments and improve retrieval performance.
arXiv Detail & Related papers (2024-05-23T07:54:57Z) - VILLS -- Video-Image Learning to Learn Semantics for Person Re-Identification [51.89551385538251]
We propose VILLS (Video-Image Learning to Learn Semantics), a self-supervised method that jointly learns spatial and temporal features from images and videos.
VILLS first designs a local semantic extraction module that adaptively extracts semantically consistent and robust spatial features.
Then, VILLS designs a unified feature learning and adaptation module to represent image and video modalities in a consistent feature space.
arXiv Detail & Related papers (2023-11-27T19:30:30Z) - Language Models As Semantic Indexers [78.83425357657026]
We introduce LMIndexer, a self-supervised framework to learn semantic IDs with a generative language model.
We show the high quality of the learned IDs and demonstrate their effectiveness on three tasks including recommendation, product search, and document retrieval.
arXiv Detail & Related papers (2023-10-11T18:56:15Z) - Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations [24.952222114424146]
We propose using content-derived features as a replacement for random ids.
We show that simply replacing ID features with content-based embeddings can cause a drop in quality due to reduced memorization capability.
Similar to content embeddings, the compactness of Semantic IDs poses a problem of easy adaption in recommendation models.
arXiv Detail & Related papers (2023-06-13T20:34:15Z) - Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create semantically meaningful of codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z) - ESA-ReID: Entropy-Based Semantic Feature Alignment for Person re-ID [7.978877859859102]
Person re-identification (re-ID) is a challenging task in real-world. Besides the typical application in surveillance system, re-ID also has significant values to improve the recall rate of people identification in content video (TV or Movies)
In this paper we propose an entropy based semantic feature alignment model, which takes advantages of the detailed information of the human semantic feature.
Considering the uncertainty of semantic segmentation, we introduce a semantic alignment with an entropy-based mask which can reduce the negative effects of mask segmentation errors.
arXiv Detail & Related papers (2020-07-09T08:56:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.