Scalable Dynamic Embedding Size Search for Streaming Recommendation
- URL: http://arxiv.org/abs/2407.15411v2
- Date: Wed, 31 Jul 2024 05:46:09 GMT
- Title: Scalable Dynamic Embedding Size Search for Streaming Recommendation
- Authors: Yunke Qu, Liang Qu, Tong Chen, Xiangyu Zhao, Quoc Viet Hung Nguyen, Hongzhi Yin
- Abstract summary: Real-world recommender systems often operate in streaming recommendation scenarios.
The number of users and items continues to grow, leading to substantial storage resource consumption.
We learn Scalable Lightweight Embeddings for streaming recommendation, called SCALL, which can adaptively adjust the embedding sizes of users/items.
- Score: 54.28404337601801
- Abstract: Recommender systems typically represent users and items by learning their embeddings, which are usually set to uniform dimensions and dominate the model parameters. However, real-world recommender systems often operate in streaming recommendation scenarios, where the number of users and items continues to grow, leading to substantial storage resource consumption for these embeddings. Although a few methods attempt to mitigate this by employing embedding size search strategies to assign different embedding dimensions in streaming recommendations, they assume that the embedding size grows with the frequency of users/items, which eventually still exceeds the predefined memory budget over time. To address this issue, this paper proposes to learn Scalable Lightweight Embeddings for streaming recommendation, called SCALL, which can adaptively adjust the embedding sizes of users/items within a given memory budget over time. Specifically, we propose to sample embedding sizes from a probabilistic distribution, with a guarantee of meeting any predefined memory budget. By fixing the memory budget, the proposed embedding size sampling strategy can increase and decrease the embedding sizes in accordance with the frequency of the corresponding users or items. Furthermore, we develop a reinforcement learning-based search paradigm that models each state with mean pooling to keep the length of the state vectors fixed, invariant to the changing number of users and items. As a result, the proposed method can provide embedding sizes to unseen users and items. Comprehensive empirical evaluations on two public datasets confirm the effectiveness of our proposed method.
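The budget guarantee is the key mechanism here. Below is a minimal Python sketch of the sampling idea, assuming (hypothetically) that mean embedding sizes are set proportional to user/item frequency and integer sizes are drawn around those means; the function name and the Poisson choice are illustrative, and the paper's actual distribution and RL-based controller are more involved.

```python
import numpy as np

def sample_embedding_sizes(frequencies, budget, d_max=128, rng=None):
    """Sample one embedding size per user/item so the expected total
    number of embedding parameters meets a fixed memory budget.

    frequencies : interaction counts per user/item
    budget      : total number of embedding parameters allowed
    d_max       : largest embedding size considered
    """
    rng = rng or np.random.default_rng(0)
    freq = np.asarray(frequencies, dtype=float)
    # Allocate probability mass in proportion to frequency ...
    probs = freq / freq.sum()
    # ... then scale mean sizes so they sum to at most the budget.
    mean_sizes = np.clip(probs * budget, 1, d_max)
    mean_sizes *= min(1.0, budget / mean_sizes.sum())
    # Sample integer sizes around the means (Poisson keeps them non-negative);
    # this meets the budget in expectation, whereas the paper guarantees it.
    sizes = np.clip(rng.poisson(mean_sizes), 1, d_max)
    return sizes

sizes = sample_embedding_sizes([500, 120, 30, 3], budget=200)
print(sizes, sizes.sum())  # total stays near the 200-parameter budget
```

Because the mean sizes are renormalized against a fixed budget before sampling, frequent users/items can only grow at the expense of infrequent ones, which is exactly the increase-and-decrease behavior the abstract describes.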
Related papers
- Quantifying User Coherence: A Unified Framework for Cross-Domain Recommendation Analysis [69.37718774071793]
This paper introduces novel information-theoretic measures for understanding recommender systems.
We evaluate 7 recommendation algorithms across 9 datasets, revealing the relationships between our measures and standard performance metrics.
arXiv Detail & Related papers (2024-10-03T13:02:07Z)
- Mixed-Precision Embeddings for Large-Scale Recommendation Models [19.93156309493436]
Mixed-Precision Embeddings (MPE) is a novel embedding compression method.
MPE achieves about 200x compression on the Criteo dataset without compromising prediction accuracy.
arXiv Detail & Related papers (2024-09-30T14:04:27Z)
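As a rough sketch of the mixed-precision idea behind this entry, assuming a hypothetical frequency-based bit-width assignment and plain uniform quantization (MPE itself learns the precision assignment):

```python
import numpy as np

def quantize_row(row, bits):
    """Uniformly quantize one embedding row to the given bit width."""
    levels = 2 ** bits - 1
    lo, hi = row.min(), row.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((row - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize_row(codes, lo, scale):
    return codes.astype(np.float32) * scale + lo

# Frequent features keep more bits; rare ones are compressed harder.
table = np.random.randn(4, 8).astype(np.float32)
freq = np.array([9000, 400, 25, 3])
bits = np.where(freq > 1000, 8, np.where(freq > 100, 4, 2))
compressed = [quantize_row(r, b) for r, b in zip(table, bits)]
approx = np.stack([dequantize_row(*c) for c in compressed])
print(np.abs(table - approx).max())  # reconstruction error per bit budget
```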
- Dynamic Embedding Size Search with Minimum Regret for Streaming Recommender System [39.78277554870799]
We show that setting an identical and static embedding size is sub-optimal in terms of recommendation performance and memory cost.
We propose a method to minimize the embedding size selection regret on both user and item sides in a non-stationary manner.
arXiv Detail & Related papers (2023-08-15T13:27:18Z)
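The entry does not spell out the algorithm; as an illustrative sketch only, embedding size selection under regret can be framed as an adversarial bandit over candidate sizes. The EXP3-style update and the forgetting factor below are hypothetical stand-ins for the paper's non-stationary treatment.

```python
import numpy as np

rng = np.random.default_rng(0)
candidate_sizes = [16, 32, 64, 128]            # arms = candidate embedding sizes
K = len(candidate_sizes)
weights = np.ones(K)
eta, gamma, decay = 0.1, 0.1, 0.99             # step size, exploration, forgetting

for t in range(2000):
    # Mix in uniform exploration so no arm's probability collapses to zero.
    probs = (1 - gamma) * weights / weights.sum() + gamma / K
    arm = rng.choice(K, p=probs)
    # Hypothetical reward: e.g. validation recall observed with this size.
    reward = np.clip(rng.normal(0.5 + 0.1 * (arm == 2), 0.1), 0, 1)
    weights **= decay                          # forget stale evidence (non-stationarity)
    weights[arm] *= np.exp(eta * reward / probs[arm])
    weights /= weights.max()                   # keep weights numerically stable

print("selected size:", candidate_sizes[int(np.argmax(weights))])  # likely 64
```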
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
The second strategy leverages a novel Re-Ranking technique, which has a lower time-complexity upper bound and reduces the memory complexity from O(n^2) to O(kn) with k << n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
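The O(n^2) to O(kn) reduction is easy to picture: store only each sample's k nearest neighbors instead of the full pairwise-distance matrix. A minimal sketch of that storage pattern (not the paper's re-ranking algorithm itself):

```python
import numpy as np

def knn_lists(features, k):
    """Keep only each sample's k nearest neighbors: O(kn) memory
    instead of the full O(n^2) pairwise-distance matrix."""
    n = len(features)
    neighbors = np.empty((n, k), dtype=np.int64)
    for i in range(n):                     # one row of distances at a time
        d = np.linalg.norm(features - features[i], axis=1)
        d[i] = np.inf                      # exclude the sample itself
        neighbors[i] = np.argpartition(d, k)[:k]
    return neighbors

x = np.random.default_rng(0).normal(size=(1000, 32))
print(knn_lists(x, k=10).shape)            # (1000, 10): k << n entries stored
```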
- A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model that accommodates new classes over time within a limited memory size.
We show that when counting the model size into the total budget and comparing methods with aligned memory sizes, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
arXiv Detail & Related papers (2022-05-26T08:24:01Z)
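The "aligned memory" comparison boils down to simple accounting: a saved model copy displaces exemplars of equal byte size. A toy calculation with hypothetical numbers (a ~0.46M-parameter float32 backbone versus raw 3x32x32 uint8 exemplars), which also explains the title:

```python
PARAM_BYTES = 4                    # float32 weights
EXEMPLAR_BYTES = 3 * 32 * 32       # one raw 3x32x32 uint8 image

def exemplars_displaced(n_params):
    """Exemplars that must be dropped to store one extra model copy
    when methods are compared under an aligned total memory budget."""
    return n_params * PARAM_BYTES // EXEMPLAR_BYTES

print(exemplars_displaced(463_000))   # ~602, i.e. roughly "a model or 603 exemplars"
```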
- Modeling Dynamic User Preference via Dictionary Learning for Sequential Recommendation [133.8758914874593]
Capturing the dynamics in user preference is crucial to better predict user future behaviors because user preferences often drift over time.
Many existing recommendation algorithms -- including both shallow and deep ones -- often model such dynamics independently.
This paper considers the problem of embedding a user's sequential behavior into the latent space of user preferences.
arXiv Detail & Related papers (2022-04-02T03:23:46Z)
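As a generic illustration of dictionary learning (not this paper's exact model), a user's current state can be encoded as a sparse combination of shared preference atoms, here via ISTA; the dictionary and data below are synthetic.

```python
import numpy as np

def sparse_code(x, D, lam=0.1, steps=100):
    """Encode a user state x as a sparse combination of dictionary atoms D
    via ISTA (iterative soft-thresholding)."""
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = D.T @ (D @ a - x)
        a = a - grad / L
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft threshold
    return a

rng = np.random.default_rng(0)
D = rng.normal(size=(16, 8))                # 8 shared preference atoms
x = D @ np.array([0.9, 0, 0, 0.4, 0, 0, 0, 0]) + 0.01 * rng.normal(size=16)
print(np.round(sparse_code(x, D), 2))       # mostly zeros, mass on atoms 0 and 3
```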
- Bayesian Non-stationary Linear Bandits for Large-Scale Recommender Systems [6.009759445555003]
We build upon the linear contextual multi-armed bandit framework to address non-stationarity in large-scale recommender systems.
We develop a decision-making policy for a linear bandit problem with high-dimensional feature vectors.
Our proposed recommender system employs this policy to learn the users' item preferences online while minimizing runtime.
arXiv Detail & Related papers (2022-02-07T13:51:19Z)
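For reference, the vanilla linear contextual bandit this paper builds on looks like the following LinUCB-style sketch; the paper's Bayesian, non-stationary policy adds a prior and drift handling on top, and the simulation values here are hypothetical.

```python
import numpy as np

class LinUCB:
    """Minimal linear contextual bandit with an upper-confidence bonus."""
    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)        # running Gram matrix X^T X + I
        self.b = np.zeros(dim)      # running X^T rewards
        self.alpha = alpha          # width of the confidence bonus

    def choose(self, contexts):     # contexts: one feature vector per item
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b      # ridge estimate of the preference vector
        scores = [x @ theta + self.alpha * np.sqrt(x @ A_inv @ x) for x in contexts]
        return int(np.argmax(scores))

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

rng = np.random.default_rng(0)
true_theta = rng.normal(size=8)
bandit = LinUCB(dim=8)
for t in range(500):
    items = rng.normal(size=(10, 8))                   # 10 candidate items
    i = bandit.choose(items)
    bandit.update(items[i], items[i] @ true_theta + 0.1 * rng.normal())

est = np.linalg.inv(bandit.A) @ bandit.b
print(np.corrcoef(est, true_theta)[0, 1])  # estimate tracks the true preferences
```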
- Binary Code based Hash Embedding for Web-scale Applications [12.851057275052506]
Deep learning models are widely adopted in web-scale applications such as recommender systems and online advertising.
In these applications, embedding learning of categorical features is crucial to the success of deep learning models.
We propose a binary code based hash embedding method that allows the size of the embedding table to be reduced at arbitrary scale without compromising too much performance.
arXiv Detail & Related papers (2021-08-24T11:51:15Z)
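A minimal sketch of the binary-code idea, under the assumption that the embedding is composed from per-bit sub-embeddings of the hashed ID; the class and parameter names are hypothetical, not the paper's API.

```python
import hashlib
import numpy as np

class BinaryCodeHashEmbedding:
    """Sketch: hash each raw ID to a B-bit code and compose its embedding
    from one small sub-table per bit position, so the parameter count is
    2*B*dim regardless of vocabulary size."""
    def __init__(self, n_bits=20, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        # blocks[b, v] is the vector contributed when bit b takes value v.
        self.blocks = rng.normal(0, 0.1, size=(n_bits, 2, dim))
        self.n_bits = n_bits

    def embed(self, raw_id):
        digest = hashlib.md5(str(raw_id).encode()).hexdigest()
        code = int(digest, 16) % (2 ** self.n_bits)      # hash, then binarize
        bits = [(code >> b) & 1 for b in range(self.n_bits)]
        return sum(self.blocks[b, v] for b, v in enumerate(bits))

emb = BinaryCodeHashEmbedding()
print(emb.embed("user_42").shape)   # (16,): 2*20 rows serve an unbounded vocabulary
```

Shrinking or growing n_bits rescales the table "at arbitrary scale", which is the knob the abstract refers to.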
- Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems [27.419109620575313]
A key challenge for deep learning models is to work with millions of categorical classes or tokens.
We propose a novel formulation of memory shared embedding, where memory is shared in proportion to the overlap in semantic information.
We demonstrate a significant reduction in the memory footprint while maintaining performance.
arXiv Detail & Related papers (2021-02-24T19:55:49Z)
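A rough stand-in for the shared-memory idea: assign each token memory slots by hashing its semantic features, so overlapping features imply a proportionally overlapping share of memory. SCMA's actual LSH-based formulation differs; the feature strings and class below are illustrative.

```python
import hashlib
import numpy as np

class SharedMemoryEmbedding:
    """Sketch: each token reads m slots chosen by hashing its semantic
    features, so tokens that share features share memory in proportion
    to their overlap."""
    def __init__(self, n_slots=1000, dim=16, seed=0):
        self.memory = np.random.default_rng(seed).normal(0, 0.1, (n_slots, dim))
        self.n_slots = n_slots

    def _slot(self, feature):
        h = hashlib.md5(feature.encode()).hexdigest()
        return int(h, 16) % self.n_slots

    def embed(self, features):
        # Overlapping feature sets -> overlapping slots -> shared memory.
        slots = [self._slot(f) for f in features]
        return self.memory[slots].mean(axis=0)

emb = SharedMemoryEmbedding()
a = emb.embed(["genre:thriller", "director:nolan", "year:2010"])
b = emb.embed(["genre:thriller", "director:nolan", "year:2017"])
print(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))))  # high: 2/3 slots shared
```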
- Learning over no-Preferred and Preferred Sequence of items for Robust Recommendation [66.8722561224499]
We propose a theoretically founded sequential strategy for training large-scale Recommender Systems (RS) over implicit feedback.
We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach.
arXiv Detail & Related papers (2020-12-12T22:10:15Z)
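The two update variants this entry mentions differ only in the parameter step. A toy sketch on a generic pairwise preference loss (not the paper's objective; the data is synthetic):

```python
import numpy as np

def momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    """Momentum variant: accumulate a velocity, then move along it."""
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

def sgd_step(w, grad, lr=0.01):
    """Plain gradient variant: move directly against the gradient."""
    return w - lr * grad

# Toy pairwise objective: score a preferred item above a non-preferred one.
rng = np.random.default_rng(0)
w, v = np.zeros(8), np.zeros(8)
for _ in range(200):
    pos, neg = rng.normal(size=8), rng.normal(size=8) - 0.5
    margin = w @ (pos - neg)
    grad = -(pos - neg) / (1.0 + np.exp(margin))   # gradient of softplus(-margin)
    w, v = momentum_step(w, grad, v)               # or: w = sgd_step(w, grad)

print("aligned with preference direction:", bool(w @ np.ones(8) > 0))
```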