Binary Code based Hash Embedding for Web-scale Applications
- URL: http://arxiv.org/abs/2109.02471v1
- Date: Tue, 24 Aug 2021 11:51:15 GMT
- Title: Binary Code based Hash Embedding for Web-scale Applications
- Authors: Bencheng Yan, Pengjie Wang, Jinquan Liu, Wei Lin, Kuang-Chih Lee, Jian
Xu and Bo Zheng
- Abstract summary: Deep learning models are widely adopted in web-scale applications such as recommender systems and online advertising.
In these applications, embedding learning of categorical features is crucial to the success of deep learning models.
We propose a binary code based hash embedding method that allows the size of the embedding table to be reduced at an arbitrary scale without compromising performance too much.
- Score: 12.851057275052506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, deep learning models are widely adopted in web-scale applications
such as recommender systems and online advertising. In these applications,
embedding learning of categorical features is crucial to the success of deep
learning models. In these models, a standard method is that each categorical
feature value is assigned a unique embedding vector which can be learned and
optimized. Although this method can well capture the characteristics of the
categorical features and promise good performance, it can incur a huge memory
cost to store the embedding table, especially for those web-scale applications.
Such a huge memory cost significantly holds back the effectiveness and
usability of embedding-based deep recommendation models (EDRMs). In this paper,
we propose a binary code based hash embedding method which allows the size of
the embedding table to be reduced at an arbitrary scale without compromising
performance too much. Experimental evaluation results show that one can still
achieve 99% of the original performance even if the embedding table is made
1000× smaller than the original one with our proposed method.
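The abstract does not spell out the mechanism, so the following is only a minimal sketch of one plausible reading of a binary code based hash embedding: hash each id, read off the binary code of the hash value, split the bits into fixed-size blocks, and compose the embedding from one small table per block. The bit width, block size, hash function, and sum aggregation below are illustrative assumptions, not the authors' exact design.

```python
import hashlib
import numpy as np

# Hedged sketch: hash an id, split the binary hash code into blocks, and
# compose the embedding from one small table per block. All sizes below are
# illustrative assumptions, not values from the paper.
NUM_BITS = 24            # length of the binary hash code
BLOCK_BITS = 8           # bits per block -> each block table has 2**8 = 256 rows
NUM_BLOCKS = NUM_BITS // BLOCK_BITS
EMB_DIM = 16

rng = np.random.default_rng(0)
# NUM_BLOCKS small tables (3 * 256 rows here) replace one row per unique id,
# so the memory cost no longer grows with the feature cardinality.
block_tables = rng.normal(size=(NUM_BLOCKS, 2 ** BLOCK_BITS, EMB_DIM)).astype(np.float32)

def binary_hash_embedding(feature_id: str) -> np.ndarray:
    """Compose an embedding from the binary code of a hashed id."""
    digest = hashlib.md5(feature_id.encode()).digest()
    code = int.from_bytes(digest[:4], "little") & ((1 << NUM_BITS) - 1)
    emb = np.zeros(EMB_DIM, dtype=np.float32)
    for b in range(NUM_BLOCKS):
        block_value = (code >> (b * BLOCK_BITS)) & ((1 << BLOCK_BITS) - 1)
        emb += block_tables[b, block_value]   # small per-block lookup
    return emb

print(binary_hash_embedding("user_123456789").shape)   # (16,)
```

Changing BLOCK_BITS or the number of blocks rescales the tables geometrically, which is one way to read the claim that the table can be shrunk at an arbitrary scale.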
Related papers
- Scalable Dynamic Embedding Size Search for Streaming Recommendation [54.28404337601801]
Real-world recommender systems often operate in streaming recommendation scenarios.
The number of users and items continues to grow, leading to substantial storage resource consumption.
We learn Lightweight Embeddings for streaming recommendation, called SCALL, which can adaptively adjust the embedding sizes of users/items.
arXiv Detail & Related papers (2024-07-22T06:37:24Z)
- Fine-Grained Embedding Dimension Optimization During Training for Recommender Systems [17.602059421895856]
FIITED is a system to automatically reduce the memory footprint via FIne-grained In-Training Embedding Dimension pruning.
We show that FIITED can reduce DLRM embedding size by more than 65% while preserving model quality.
On public datasets, FIITED can reduce the size of embedding tables by 2.1x to 800x with negligible accuracy drop.
arXiv Detail & Related papers (2024-01-09T08:04:11Z)
- On the Embedding Collapse when Scaling up Recommendation Models [53.66285358088788]
We identify the embedding collapse phenomenon as the inhibition of scalability, wherein the embedding matrix tends to occupy a low-dimensional subspace.
We propose a simple yet effective multi-embedding design incorporating embedding-set-specific interaction modules to learn embedding sets with large diversity.
arXiv Detail & Related papers (2023-10-06T17:50:38Z)
- Adaptive Cross Batch Normalization for Metric Learning [75.91093210956116]
Metric learning is a fundamental problem in computer vision.
We show that it is equally important to ensure that the accumulated embeddings are up to date.
In particular, it is necessary to circumvent the representational drift between the accumulated embeddings and the feature embeddings at the current training iteration.
arXiv Detail & Related papers (2023-03-30T03:22:52Z)
- A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model within a limited memory budget.
We show that when counting the model size into the total budget and comparing methods with aligned memory size, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
arXiv Detail & Related papers (2022-05-26T08:24:01Z)
- Learning to Collide: Recommendation System Model Compression with Learned Hash Functions [4.6994057182972595]
A key characteristic of deep recommendation models is the immense memory requirements of their embedding tables.
A common technique to reduce model size is to hash all of the categorical variable identifiers (ids) into a smaller space (a minimal sketch of this hashing trick appears after this list).
This hashing reduces the number of unique representations that must be stored in the embedding table, thus decreasing its size.
We introduce an alternative approach, Learned Hash Functions, which instead learns a new mapping function that encourages collisions between semantically similar ids.
arXiv Detail & Related papers (2022-03-28T06:07:30Z)
- Learning Effective and Efficient Embedding via an Adaptively-Masked Twins-based Layer [15.403616481651383]
We propose an Adaptively-Masked Twins-based Layer (AMTL) behind the standard embedding layer.
AMTL generates a mask vector to mask the undesired dimensions for each embedding vector.
The mask vector brings flexibility in selecting the dimensions and the proposed layer can be easily added to either untrained or trained DLRMs.
arXiv Detail & Related papers (2021-08-24T11:50:49Z)
- Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems [27.419109620575313]
A key challenge for deep learning models is to work with millions of categorical classes or tokens.
We propose a novel formulation of memory shared embedding, where memory is shared in proportion to the overlap in semantic information.
We demonstrate a significant reduction in the memory footprint while maintaining performance.
arXiv Detail & Related papers (2021-02-24T19:55:49Z)
- Learnable Embedding Sizes for Recommender Systems [34.98757041815557]
We propose PEP (short for Plug-in Embedding Pruning) to reduce the size of the embedding table while avoiding the drop of recommendation accuracy.
PEP achieves strong recommendation performance while reducing 97-99% of the parameters.
PEP only brings an additional 20-30% time cost compared with base models.
arXiv Detail & Related papers (2021-01-19T11:50:33Z)
- MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation [153.56211546576978]
In this work, we propose that better soft targets with higher compatibility can be generated by using a label generator.
We can employ the meta-learning technique to optimize this label generator.
The experiments are conducted on two standard classification benchmarks, namely CIFAR-100 and ILSVRC2012.
arXiv Detail & Related papers (2020-08-27T13:04:27Z)
- ReMarNet: Conjoint Relation and Margin Learning for Small-Sample Image Classification [49.87503122462432]
We introduce a novel neural network termed Relation-and-Margin learning Network (ReMarNet).
Our method assembles two networks with different backbones to learn features that perform well under both of the aforementioned classification mechanisms.
Experiments on four image datasets demonstrate that our approach is effective in learning discriminative features from a small set of labeled samples.
arXiv Detail & Related papers (2020-06-27T13:50:20Z)
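As promised in the Learning to Collide entry above, here is a minimal sketch of the standard hashing trick that paper builds on: every id is mapped into a fixed-size table with a modulo hash, so colliding ids share a row. The table size, embedding dimension, and CRC32 hash are illustrative choices; the paper's learned hash functions would replace this random mapping with a trained one that steers collisions toward semantically similar ids.

```python
import zlib
import numpy as np

# Minimal sketch of the standard hashing trick (the baseline the Learned Hash
# Functions paper improves on): map every id into a fixed-size embedding table
# with a modulo hash, trading collisions for a bounded memory cost.
TABLE_SIZE = 100_000     # fixed row count, independent of the number of ids
EMB_DIM = 16

rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(TABLE_SIZE, EMB_DIM)).astype(np.float32)

def hashed_lookup(feature_id: str) -> np.ndarray:
    """Look up an embedding for an id drawn from an unbounded vocabulary."""
    row = zlib.crc32(feature_id.encode()) % TABLE_SIZE   # colliding ids share a row
    return embedding_table[row]

print(hashed_lookup("item_42").shape)   # (16,)
```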
This list is automatically generated from the titles and abstracts of the papers on this site.