Tuning Word2vec for Large Scale Recommendation Systems
- URL: http://arxiv.org/abs/2009.12192v1
- Date: Thu, 24 Sep 2020 10:50:19 GMT
- Title: Tuning Word2vec for Large Scale Recommendation Systems
- Authors: Benjamin P. Chamberlain, Emanuele Rossi, Dan Shiebler, Suvash Sedhain,
Michael M. Bronstein
- Abstract summary: Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP).
We show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters.
We demonstrate a 138% average improvement in hit rate with a runtime budget-constrained hyperparameter optimization.
- Score: 14.074296985040704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP) and is now applied in multiple domains, including recommender systems, forecasting, and network analysis. As Word2vec is often used off the shelf, we address the question of whether the default hyperparameters are suitable for recommender systems. The answer is emphatically no. In this paper, we first elucidate the importance of hyperparameter optimization and show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters. However, unconstrained optimization leads to hyperparameter settings that are very expensive and not feasible for large scale recommendation tasks. To this end, we demonstrate a 138% average improvement in hit rate with a runtime budget-constrained hyperparameter optimization. Furthermore, to make hyperparameter optimization applicable for large scale recommendation problems where the target dataset is too large to search over, we investigate generalizing hyperparameter settings from samples. We show that applying constrained hyperparameter optimization using only a 10% sample of the data still yields a 91% average improvement in hit rate over the default parameters when applied to the full datasets. Finally, we apply hyperparameters learned using our method of constrained optimization on a sample to the Who To Follow recommendation service at Twitter and are able to increase follow rates by 15%.
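
A minimal, hypothetical sketch of the budget-constrained setup described in the abstract: a random search over Word2vec hyperparameters (here using gensim) that stops when a wall-clock budget is exhausted and scores each configuration by hit rate on held-out (query, target) item pairs. The parameter ranges, the choice of gensim, and the evaluation protocol are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch only: budget-constrained random search over Word2vec
# hyperparameters, scored by hit rate@k on held-out (query, target) pairs.
import random
import time

from gensim.models import Word2Vec


def hit_rate_at_k(model, test_pairs, k=10):
    """Fraction of (query, target) pairs whose target appears among the
    top-k nearest neighbours of the query embedding."""
    hits, evaluated = 0, 0
    for query, target in test_pairs:
        if query not in model.wv or target not in model.wv:
            continue
        neighbours = {item for item, _ in model.wv.most_similar(query, topn=k)}
        hits += target in neighbours
        evaluated += 1
    return hits / max(evaluated, 1)


def constrained_search(sequences, test_pairs, budget_seconds=3600, k=10):
    """Sample and evaluate configurations until the wall-clock budget runs
    out; return the best configuration found. Ranges below are assumptions."""
    deadline = time.monotonic() + budget_seconds
    best_score, best_params = -1.0, None
    while time.monotonic() < deadline:
        params = {
            "vector_size": random.choice([32, 64, 128, 256]),
            "window": random.randint(2, 15),
            "negative": random.randint(1, 20),
            "ns_exponent": random.uniform(-1.0, 1.0),
            "alpha": 10 ** random.uniform(-3, -1),
            "sample": 10 ** random.uniform(-5, -2),
            "epochs": random.randint(1, 10),
        }
        model = Word2Vec(sentences=sequences, sg=1, workers=4, **params)
        score = hit_rate_at_k(model, test_pairs, k=k)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

In the workflow the paper describes, a configuration found this way on a roughly 10% sample would then be reused to train on the full dataset.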
Related papers
- Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of three optimizers, four parameterizations, and a wide range of learning rates and model sizes.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z) - Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO)
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z) - Fine-Tuning Adaptive Stochastic Optimizers: Determining the Optimal Hyperparameter $ε$ via Gradient Magnitude Histogram Analysis [0.7366405857677226]
We introduce a new framework based on the empirical probability density function of the gradient magnitude, termed the "gradient magnitude histogram".
We propose a novel algorithm using gradient magnitude histograms to automatically estimate a refined and accurate search space for the optimal safeguard $ε$ (see the sketch after this list).
arXiv Detail & Related papers (2023-11-20T04:34:19Z) - DP-HyPO: An Adaptive Private Hyperparameter Optimization Framework [31.628466186344582]
We introduce DP-HyPO, a pioneering framework for "adaptive" private hyperparameter optimization.
We provide a comprehensive differential privacy analysis of our framework.
We empirically demonstrate the effectiveness of DP-HyPO on a diverse set of real-world datasets.
arXiv Detail & Related papers (2023-06-09T07:55:46Z) - Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z) - Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG).
Specifically, we first formulate hyperparameter optimization as an A-based constrained optimization problem, where A is a black-box training algorithm.
Then, we use the average zeroth-order hyper-gradients to update the hyperparameters (see the sketch after this list).
arXiv Detail & Related papers (2021-02-17T21:03:05Z) - Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian
Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z) - Hyperparameter Selection for Subsampling Bootstraps [0.0]
A subsampling method like BLB serves as a powerful tool for assessing the quality of estimators for massive data.
The performance of subsampling methods is highly influenced by the selection of tuning parameters.
We develop a hyperparameter selection methodology, which can be used to select tuning parameters for subsampling methods.
Both simulation studies and real data analysis demonstrate the advantages of our method.
arXiv Detail & Related papers (2020-06-02T17:10:45Z) - Implicit differentiation of Lasso-type models for hyperparameter
optimization [82.73138686390514]
We introduce an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems.
Our approach scales to high-dimensional data by leveraging the sparsity of the solutions (see the sketch after this list).
arXiv Detail & Related papers (2020-02-20T18:43:42Z)
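
For the gradient magnitude histogram entry above, a minimal sketch of the general idea under stated assumptions: gradient magnitudes collected over a few batches are summarized as a histogram, and low percentiles of that distribution bound the search range for the safeguard hyperparameter $ε$. The percentile choices and batch count are illustrative, not the paper's algorithm.

```python
# Illustrative sketch only: use the empirical distribution of gradient
# magnitudes to propose a search range for an adaptive optimizer's epsilon.
import numpy as np
import torch


def gradient_magnitude_histogram(model, loss_fn, data_loader, num_batches=50):
    """Collect |gradient| entries over a few batches and histogram their log10."""
    magnitudes = []
    for i, (x, y) in enumerate(data_loader):
        if i >= num_batches:
            break
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for p in model.parameters():
            if p.grad is not None:
                magnitudes.append(p.grad.detach().abs().flatten())
    mags = torch.cat(magnitudes).cpu().numpy()
    counts, edges = np.histogram(np.log10(mags + 1e-45), bins=100)
    return mags, counts, edges


def epsilon_search_space(mags, low_pct=1.0, high_pct=10.0):
    # Bound epsilon by low percentiles of the observed gradient magnitudes
    # (an assumption about how the histogram is turned into a search range).
    return np.percentile(mags, low_pct), np.percentile(mags, high_pct)
```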
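
For the HOZOG entry above, a minimal sketch of an averaged zeroth-order hyper-gradient update, treating a full training-plus-validation run as a black box `f` from hyperparameters to validation loss. The smoothing parameter, sample count, and step size are illustrative assumptions, not the paper's settings.

```python
# Illustrative sketch only: averaged zeroth-order (finite-difference)
# hyper-gradients for updating continuous hyperparameters.
import numpy as np


def zeroth_order_hypergrad(f, hyperparams, mu=1e-2, num_samples=8):
    """Average (f(h + mu*u) - f(h)) / mu * u over random directions u."""
    h = np.asarray(hyperparams, dtype=float)
    base = f(h)
    grad = np.zeros_like(h)
    for _ in range(num_samples):
        u = np.random.randn(*h.shape)
        grad += (f(h + mu * u) - base) / mu * u
    return grad / num_samples


def tune(f, hyperparams, lr=0.1, steps=20):
    """Plain gradient descent on the hyperparameters using the estimate above."""
    h = np.asarray(hyperparams, dtype=float)
    for _ in range(steps):
        h -= lr * zeroth_order_hypergrad(f, h)
    return h
```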
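
For the Lasso implicit differentiation entry above, a sketch of the underlying identity: on the active set of the Lasso solution, the KKT conditions give a closed form for the derivative of the coefficients with respect to the regularization strength, which in turn gives a hyper-gradient of the validation loss. This illustrates the general technique with scikit-learn and a small linear solve; it is not the paper's matrix-inversion-free algorithm.

```python
# Illustrative sketch only: hyper-gradient of a validation MSE with respect to
# the Lasso regularization strength via implicit differentiation on the
# active set (objective: (1/(2n))||y - Xw||^2 + lam * ||w||_1).
import numpy as np
from sklearn.linear_model import Lasso


def lasso_hypergradient(X_tr, y_tr, X_val, y_val, lam):
    n = X_tr.shape[0]
    beta = Lasso(alpha=lam, fit_intercept=False).fit(X_tr, y_tr).coef_
    S = np.flatnonzero(beta)              # active set of the solution
    if S.size == 0:
        return 0.0
    Xs = X_tr[:, S]
    sign_s = np.sign(beta[S])
    # KKT on the support: (1/n) Xs^T Xs * d(beta_S)/d(lam) + sign_s = 0,
    # assuming Xs^T Xs is invertible and the active set is locally stable.
    dbeta_S = -n * np.linalg.solve(Xs.T @ Xs, sign_s)
    # Chain rule through the validation loss (1/(2m))||y_val - X_val beta||^2.
    residual = y_val - X_val[:, S] @ beta[S]
    return float(-(residual @ X_val[:, S] @ dbeta_S) / X_val.shape[0])
```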