SMILE: SeMantic Ids Enhanced CoLd Item Representation for Click-through Rate Prediction in E-commerce SEarch
- URL: http://arxiv.org/abs/2510.12604v1
- Date: Tue, 14 Oct 2025 14:58:50 GMT
- Title: SMILE: SeMantic Ids Enhanced CoLd Item Representation for Click-through Rate Prediction in E-commerce SEarch
- Authors: Qihang Zhao, Zhongbo Sun, Xiaoyang Zheng, Xian Guo, Siyuan Wang, Zihan Liang, Mingcan Peng, Ben Chen, Chenyi Lei,
- Abstract summary: We propose SMILE, an item representation enhancement approach based on fused alignment of semantic IDs. Specifically, we use RQ-OPQ encoding to quantize item content and collaborative information, followed by a two-step alignment.
- Score: 23.064077881704105
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: With the rise of modern search and recommendation platforms, insufficient collaborative information of cold-start items exacerbates the Matthew effect of existing platform items, challenging platform diversity and becoming a longstanding issue. Existing methods align items' side content with collaborative information to transfer collaborative signals from high-popularity items to cold-start items. However, these methods fail to account for either the asymmetry between collaboration and content or the fine-grained differences among items. To address these issues, we propose SMILE, an item representation enhancement approach based on fused alignment of semantic IDs. Specifically, we use RQ-OPQ encoding to quantize item content and collaborative information, followed by a two-step alignment: RQ encoding transfers shared collaborative signals across items, while OPQ encoding learns differentiated information of items. Comprehensive offline experiments on large-scale industrial datasets demonstrate the superiority of SMILE, and rigorous online A/B tests confirm statistically significant improvements: item CTR +1.66%, buyers +1.57%, and order volume +2.17%.
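The abstract names RQ-OPQ encoding without giving details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes residual quantization (RQ) runs first and that product quantization is then applied to the residual RQ leaves behind; that ordering, the toy k-means, the codebook sizes, and every function name here are illustrative assumptions. The learned rotation that distinguishes OPQ from plain PQ is omitted for brevity.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Toy k-means; returns a (k, dim) codebook."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    return centroids

def nearest(x, centroids):
    """Index of the closest centroid for every row of x."""
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return dist.argmin(1)

def rq_encode(x, levels=2, k=8):
    """Residual quantization: each level quantizes what the previous one missed."""
    residual = x.copy()
    codes, codebooks = [], []
    for level in range(levels):
        book = kmeans(residual, k, seed=level)
        idx = nearest(residual, book)
        residual = residual - book[idx]
        codes.append(idx)
        codebooks.append(book)
    return np.stack(codes, axis=1), codebooks, residual

def pq_encode(x, n_sub=2, k=8):
    """Plain product quantization: split dimensions, quantize each part separately.
    (OPQ would first apply a learned rotation to x; omitted in this sketch.)"""
    codes, codebooks = [], []
    for s, part in enumerate(np.split(x, n_sub, axis=1)):
        book = kmeans(part, k, seed=100 + s)
        codes.append(nearest(part, book))
        codebooks.append(book)
    return np.stack(codes, axis=1), codebooks

# Hypothetical item embeddings: 200 items, 8 dimensions.
items = np.random.default_rng(42).normal(size=(200, 8))

# Coarse codes: items with the same RQ prefix share a codebook entry.
rq_codes, rq_books, residual = rq_encode(items)
# Fine codes (assumption): quantize the per-item residual left after RQ.
pq_codes, pq_books = pq_encode(residual)

# One semantic ID per item: [rq_level_0, rq_level_1, pq_sub_0, pq_sub_1].
semantic_ids = np.concatenate([rq_codes, pq_codes], axis=1)
print(semantic_ids[:3])
```

In this toy setup, items that land on the same RQ codes look up the same coarse codebook vectors, which is one way shared signals could reach a cold-start item, while the PQ codes on the residual keep a per-item distinction. How SMILE actually aligns and fuses these codes is described in the paper itself, not here.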
Related papers
- Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation [1.0839192829439435]
Hi-SAM is a Hierarchical Structure-Aware Multi-modal framework with two designs. It unifies modalities via geometry-aware alignment and quantizes them via a coarse-to-fine strategy. Deployed on a large-scale social platform, Hi-SAM achieved a 6.55% gain in the core online metric.
arXiv Detail & Related papers (2026-02-12T10:26:15Z)
- Bridging Textual-Collaborative Gap through Semantic Codes for Sequential Recommendation [91.13055384151897]
CCFRec is a novel Code-based textual and Collaborative semantic Fusion method for sequential Recommendation. We generate fine-grained semantic codes from multi-view text embeddings through vector quantization techniques. In order to further enhance the fusion of textual and collaborative semantics, we introduce an optimization strategy.
arXiv Detail & Related papers (2025-03-15T15:54:44Z)
- Language-Model Prior Overcomes Cold-Start Items [14.370472820496802]
The growth of RecSys is driven by digitization and the need for personalized content in areas such as e-commerce and video streaming.
Existing solutions for the cold-start problem, such as content-based recommenders and hybrid methods, leverage item metadata to determine item similarities.
This paper introduces a novel approach for cold-start item recommendation that utilizes the language model (LM) to estimate item similarities.
arXiv Detail & Related papers (2024-11-13T22:45:52Z)
- CROSS-JEM: Accurate and Efficient Cross-encoders for Short-text Ranking Tasks [12.045202648316678]
Transformer-based ranking models are the state-of-the-art approaches for such tasks.
We propose Cross-encoders with Joint Efficient Modeling (CROSS-JEM).
CROSS-JEM enables transformer-based models to jointly score multiple items for a query.
It achieves state-of-the-art accuracy and over 4x lower ranking latency than standard cross-encoders.
arXiv Detail & Related papers (2024-09-15T17:05:35Z)
- Learning Multi-Aspect Item Palette: A Semantic Tokenization Framework for Generative Recommendation [55.99632509895994]
We introduce LAMIA, a novel approach for multi-aspect semantic tokenization. Unlike RQ-VAE, which uses a single embedding, LAMIA learns an "item palette": a collection of independent and semantically parallel embeddings. Our results demonstrate significant improvements in recommendation accuracy over existing methods.
arXiv Detail & Related papers (2024-09-11T13:49:48Z)
- Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation [50.19602159938368]
Large language models (LLMs) are revolutionizing conversational recommender systems.
We propose a Reindex-Then-Adapt (RTA) framework, which converts multi-token item titles into single tokens within LLMs.
Our framework demonstrates improved accuracy metrics across three different conversational recommendation datasets.
arXiv Detail & Related papers (2024-05-20T15:37:55Z)
- What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception [52.41695608928129]
Multi-agent perception (MAP) allows autonomous systems to understand complex environments by interpreting data from multiple sources.
This paper investigates intermediate collaboration for MAP with a specific focus on exploring "good" properties of collaborative view.
We propose a novel framework named CMiMC for intermediate collaboration.
arXiv Detail & Related papers (2024-03-15T07:18:55Z)
- Hypergraph Enhanced Knowledge Tree Prompt Learning for Next-Basket Recommendation [50.55786122323965]
Next-basket recommendation (NBR) aims to infer the items in the next basket given the corresponding basket sequence.
HEKP4NBR transforms the knowledge graph (KG) into prompts, namely Knowledge Tree Prompt (KTP), to help PLM encode the Out-Of-Vocabulary (OOV) item IDs.
A hypergraph convolutional module is designed to build a hypergraph based on item similarities measured by an MoE model from multiple aspects.
arXiv Detail & Related papers (2023-12-26T02:12:21Z)
- MM-GEF: Multi-modal representation meet collaborative filtering [43.88159639990081]
We propose a graph-based item structure enhancement method MM-GEF: Multi-Modal recommendation with Graph Early-Fusion.
MM-GEF learns refined item representations by injecting structural information obtained from both multi-modal and collaborative signals.
arXiv Detail & Related papers (2023-08-14T15:47:36Z)
- Multi-task Item-attribute Graph Pre-training for Strict Cold-start Item Recommendation [71.5871100348448]
ColdGPT models item-attribute correlations into an item-attribute graph by extracting fine-grained attributes from item contents.
ColdGPT transfers knowledge into the item-attribute graph from various available data sources, i.e., item contents, historical purchase sequences, and review texts of the existing items.
Extensive experiments show that ColdGPT consistently outperforms the existing SCS recommenders by large margins.
arXiv Detail & Related papers (2023-06-26T07:04:47Z)
- Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations [24.952222114424146]
We propose using content-derived features as a replacement for random ids.
We show that simply replacing ID features with content-based embeddings can cause a drop in quality due to reduced memorization capability.
Similar to content embeddings, the compactness of Semantic IDs poses a problem of easy adaptation in recommendation models.
arXiv Detail & Related papers (2023-06-13T20:34:15Z)
- Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation [22.701371886522494]
We argue that the latent semantic item-item structures underlying multimodal contents could be beneficial for learning better item representations.
We devise a novel modality-aware structure learning module, which learns item-item relationships for each modality.
arXiv Detail & Related papers (2021-11-01T03:37:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.