Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations
- URL: http://arxiv.org/abs/2306.08121v2
- Date: Thu, 30 May 2024 05:53:39 GMT
- Title: Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations
- Authors: Anima Singh, Trung Vu, Nikhil Mehta, Raghunandan Keshavan, Maheswaran Sathiamoorthy, Yilin Zheng, Lichan Hong, Lukasz Heldt, Li Wei, Devansh Tandon, Ed H. Chi, Xinyang Yi,
- Abstract summary: We propose using content-derived features as a replacement for random ids.
We show that simply replacing ID features with content-based embeddings can cause a drop in quality due to reduced memorization capability.
Similar to content embeddings, the compactness of Semantic IDs poses a problem of easy adaption in recommendation models.
- Score: 24.952222114424146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Randomly-hashed item ids are used ubiquitously in recommendation models. However, the learned representations from random hashing prevents generalization across similar items, causing problems of learning unseen and long-tail items, especially when item corpus is large, power-law distributed, and evolving dynamically. In this paper, we propose using content-derived features as a replacement for random ids. We show that simply replacing ID features with content-based embeddings can cause a drop in quality due to reduced memorization capability. To strike a good balance of memorization and generalization, we propose to use Semantic IDs -- a compact discrete item representation learned from frozen content embeddings using RQ-VAE that captures the hierarchy of concepts in items -- as a replacement for random item ids. Similar to content embeddings, the compactness of Semantic IDs poses a problem of easy adaption in recommendation models. We propose novel methods for adapting Semantic IDs in industry-scale ranking models, through hashing sub-pieces of of the Semantic-ID sequences. In particular, we find that the SentencePiece model that is commonly used in LLM tokenization outperforms manually crafted pieces such as N-grams. To the end, we evaluate our approaches in a real-world ranking model for YouTube recommendations. Our experiments demonstrate that Semantic IDs can replace the direct use of video IDs by improving the generalization ability on new and long-tail item slices without sacrificing overall model quality.
Related papers
- AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection [72.41427550339296]
We introduce AnyMaker, a framework capable of generating general objects with high ID fidelity and flexible text editability.
The efficacy of AnyMaker stems from its novel general ID extraction, dual-level ID injection, and ID-aware decoupling.
To validate our approach and boost the research of general object customization, we create the first large-scale general ID dataset.
arXiv Detail & Related papers (2024-06-17T15:26:22Z) - Multi-Prompts Learning with Cross-Modal Alignment for Attribute-based
Person Re-Identification [18.01407937934588]
We present a new framework called Multi-Prompts ReID (MP-ReID) based on prompt learning and language models.
MP-ReID learns to hallucinate diverse, informative, and promptable sentences for describing the query images.
Explicit prompts are obtained by ensembling generation models, such as ChatGPT and VQA models.
arXiv Detail & Related papers (2023-12-28T03:00:19Z) - Summarization-Based Document IDs for Generative Retrieval with Language Models [65.11811787587403]
We introduce summarization-based document IDs, in which each document's ID is composed of an extractive summary or abstractive keyphrases.
We show that using ACID improves top-10 and top-20 recall by 15.6% and 14.4% (relative) respectively.
We also observed that extractive IDs outperformed abstractive IDs on Wikipedia articles in NQ but not the snippets in MSMARCO.
arXiv Detail & Related papers (2023-11-14T23:28:36Z) - ID Embedding as Subtle Features of Content and Structure for Multimodal Recommendation [13.338363107777438]
We propose a novel recommendation model by incorporating ID embeddings to enhance the salient features of both content and structure.
Our method is superior to state-of-the-art multimodal recommendation methods and the effectiveness of fine-grained ID embeddings.
arXiv Detail & Related papers (2023-11-10T09:41:28Z) - Language Models As Semantic Indexers [78.83425357657026]
We introduce LMIndexer, a self-supervised framework to learn semantic IDs with a generative language model.
We show the high quality of the learned IDs and demonstrate their effectiveness on three tasks including recommendation, product search, and document retrieval.
arXiv Detail & Related papers (2023-10-11T18:56:15Z) - Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification [78.52704557647438]
We propose a novel FIne-grained Representation and Recomposition (FIRe$2$) framework to tackle both limitations without any auxiliary annotation or data.
Experiments demonstrate that FIRe$2$ can achieve state-of-the-art performance on five widely-used cloth-changing person Re-ID benchmarks.
arXiv Detail & Related papers (2023-08-21T12:59:48Z) - Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create semantically meaningful of codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z) - CycAs: Self-supervised Cycle Association for Learning Re-identifiable
Descriptions [61.724894233252414]
This paper proposes a self-supervised learning method for the person re-identification (re-ID) problem.
Existing unsupervised methods usually rely on pseudo labels, such as those from video tracklets or clustering.
We introduce a different unsupervised method that allows us to learn pedestrian embeddings from raw videos, without resorting to pseudo labels.
arXiv Detail & Related papers (2020-07-15T09:52:35Z) - ESA-ReID: Entropy-Based Semantic Feature Alignment for Person re-ID [7.978877859859102]
Person re-identification (re-ID) is a challenging task in real-world. Besides the typical application in surveillance system, re-ID also has significant values to improve the recall rate of people identification in content video (TV or Movies)
In this paper we propose an entropy based semantic feature alignment model, which takes advantages of the detailed information of the human semantic feature.
Considering the uncertainty of semantic segmentation, we introduce a semantic alignment with an entropy-based mask which can reduce the negative effects of mask segmentation errors.
arXiv Detail & Related papers (2020-07-09T08:56:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.