Tuning Word2vec for Large Scale Recommendation Systems
- URL: http://arxiv.org/abs/2009.12192v1
- Date: Thu, 24 Sep 2020 10:50:19 GMT
- Title: Tuning Word2vec for Large Scale Recommendation Systems
- Authors: Benjamin P. Chamberlain, Emanuele Rossi, Dan Shiebler, Suvash Sedhain,
Michael M. Bronstein
- Abstract summary: Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP).
We show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters.
We demonstrate a 138% average improvement in hit rate with a runtime budget-constrained hyperparameter optimization.
- Score: 14.074296985040704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP) and is now applied in multiple domains, including recommender systems, forecasting, and network analysis. As Word2vec is often used off the shelf, we address the question of whether the default hyperparameters are suitable for recommender systems. The answer is emphatically no. In this paper, we first elucidate the importance of hyperparameter optimization and show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters. However, unconstrained optimization leads to hyperparameter settings that are very expensive and not feasible for large scale recommendation tasks. To this end, we demonstrate a 138% average improvement in hit rate with a runtime budget-constrained hyperparameter optimization. Furthermore, to make hyperparameter optimization applicable for large scale recommendation problems where the target dataset is too large to search over, we investigate generalizing hyperparameter settings from samples. We show that applying constrained hyperparameter optimization using only a 10% sample of the data still yields a 91% average improvement in hit rate over the default parameters when applied to the full datasets. Finally, we apply hyperparameters learned using our method of constrained optimization on a sample to the Who To Follow recommendation service at Twitter and are able to increase follow rates by 15%.
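
A minimal, hypothetical sketch of the budget-constrained setup described in the abstract: a random search over Word2vec hyperparameters (here using gensim) that stops when a wall-clock budget is exhausted and scores each configuration by hit rate on held-out (query, target) item pairs. The parameter ranges, the choice of gensim, and the evaluation protocol are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch only: budget-constrained random search over Word2vec
# hyperparameters, scored by hit rate@k on held-out (query, target) pairs.
import random
import time

from gensim.models import Word2Vec


def hit_rate_at_k(model, test_pairs, k=10):
    """Fraction of (query, target) pairs whose target appears among the
    top-k nearest neighbours of the query embedding."""
    hits, evaluated = 0, 0
    for query, target in test_pairs:
        if query not in model.wv or target not in model.wv:
            continue
        neighbours = {item for item, _ in model.wv.most_similar(query, topn=k)}
        hits += target in neighbours
        evaluated += 1
    return hits / max(evaluated, 1)


def constrained_search(sequences, test_pairs, budget_seconds=3600, k=10):
    """Sample and evaluate configurations until the wall-clock budget runs
    out; return the best configuration found. Ranges below are assumptions."""
    deadline = time.monotonic() + budget_seconds
    best_score, best_params = -1.0, None
    while time.monotonic() < deadline:
        params = {
            "vector_size": random.choice([32, 64, 128, 256]),
            "window": random.randint(2, 15),
            "negative": random.randint(1, 20),
            "ns_exponent": random.uniform(-1.0, 1.0),
            "alpha": 10 ** random.uniform(-3, -1),
            "sample": 10 ** random.uniform(-5, -2),
            "epochs": random.randint(1, 10),
        }
        model = Word2Vec(sentences=sequences, sg=1, workers=4, **params)
        score = hit_rate_at_k(model, test_pairs, k=k)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

In the workflow the paper describes, a configuration found this way on a roughly 10% sample would then be reused to train on the full dataset.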
Related papers
- Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of three optimizers, four parameterizations, and a wide range of learning rates and model sizes.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z) - Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO)
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z) - Fine-Tuning Adaptive Stochastic Optimizers: Determining the Optimal Hyperparameter $ε$ via Gradient Magnitude Histogram Analysis [0.7366405857677226]
We introduce a new framework based on the empirical probability density function of the gradient magnitude, termed the "gradient magnitude histogram".
We propose a novel algorithm using gradient magnitude histograms to automatically estimate a refined and accurate search space for the optimal safeguard $ε$ (see the sketch after this list).
arXiv Detail & Related papers (2023-11-20T04:34:19Z) - DP-HyPO: An Adaptive Private Hyperparameter Optimization Framework [31.628466186344582]
We introduce DP-HyPO, a pioneering framework for "adaptive" private hyperparameter optimization.
We provide a comprehensive differential privacy analysis of our framework.
We empirically demonstrate the effectiveness of DP-HyPO on a diverse set of real-world datasets.
arXiv Detail & Related papers (2023-06-09T07:55:46Z) - Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z) - Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG).
Specifically, we first formulate hyperparameter optimization as an A-based constrained optimization problem, where A is a black-box training algorithm.
Then, we use the average zeroth-order hyper-gradients to update the hyperparameters (see the sketch after this list).
arXiv Detail & Related papers (2021-02-17T21:03:05Z) - Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian
Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z) - Hyperparameter Selection for Subsampling Bootstraps [0.0]
A subsampling method like BLB serves as a powerful tool for assessing the quality of estimators for massive data.
The performance of subsampling methods is highly influenced by the selection of tuning parameters.
We develop a hyperparameter selection methodology, which can be used to select tuning parameters for subsampling methods.
Both simulation studies and real data analysis demonstrate the advantages of our method.
arXiv Detail & Related papers (2020-06-02T17:10:45Z) - Implicit differentiation of Lasso-type models for hyperparameter
optimization [82.73138686390514]
We introduce an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems.
Our approach scales to high-dimensional data by leveraging the sparsity of the solutions (see the sketch after this list).
arXiv Detail & Related papers (2020-02-20T18:43:42Z)
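
For the gradient magnitude histogram entry above, a minimal sketch of the general idea under stated assumptions: gradient magnitudes collected over a few batches are summarized as a histogram, and low percentiles of that distribution bound the search range for the safeguard hyperparameter $ε$. The percentile choices and batch count are illustrative, not the paper's algorithm.

```python
# Illustrative sketch only: use the empirical distribution of gradient
# magnitudes to propose a search range for an adaptive optimizer's epsilon.
import numpy as np
import torch


def gradient_magnitude_histogram(model, loss_fn, data_loader, num_batches=50):
    """Collect |gradient| entries over a few batches and histogram their log10."""
    magnitudes = []
    for i, (x, y) in enumerate(data_loader):
        if i >= num_batches:
            break
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for p in model.parameters():
            if p.grad is not None:
                magnitudes.append(p.grad.detach().abs().flatten())
    mags = torch.cat(magnitudes).cpu().numpy()
    counts, edges = np.histogram(np.log10(mags + 1e-45), bins=100)
    return mags, counts, edges


def epsilon_search_space(mags, low_pct=1.0, high_pct=10.0):
    # Bound epsilon by low percentiles of the observed gradient magnitudes
    # (an assumption about how the histogram is turned into a search range).
    return np.percentile(mags, low_pct), np.percentile(mags, high_pct)
```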
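
For the HOZOG entry above, a minimal sketch of an averaged zeroth-order hyper-gradient update, treating a full training-plus-validation run as a black box `f` from hyperparameters to validation loss. The smoothing parameter, sample count, and step size are illustrative assumptions, not the paper's settings.

```python
# Illustrative sketch only: averaged zeroth-order (finite-difference)
# hyper-gradients for updating continuous hyperparameters.
import numpy as np


def zeroth_order_hypergrad(f, hyperparams, mu=1e-2, num_samples=8):
    """Average (f(h + mu*u) - f(h)) / mu * u over random directions u."""
    h = np.asarray(hyperparams, dtype=float)
    base = f(h)
    grad = np.zeros_like(h)
    for _ in range(num_samples):
        u = np.random.randn(*h.shape)
        grad += (f(h + mu * u) - base) / mu * u
    return grad / num_samples


def tune(f, hyperparams, lr=0.1, steps=20):
    """Plain gradient descent on the hyperparameters using the estimate above."""
    h = np.asarray(hyperparams, dtype=float)
    for _ in range(steps):
        h -= lr * zeroth_order_hypergrad(f, h)
    return h
```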
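
For the Lasso implicit differentiation entry above, a sketch of the underlying identity: on the active set of the Lasso solution, the KKT conditions give a closed form for the derivative of the coefficients with respect to the regularization strength, which in turn gives a hyper-gradient of the validation loss. This illustrates the general technique with scikit-learn and a small linear solve; it is not the paper's matrix-inversion-free algorithm.

```python
# Illustrative sketch only: hyper-gradient of a validation MSE with respect to
# the Lasso regularization strength via implicit differentiation on the
# active set (objective: (1/(2n))||y - Xw||^2 + lam * ||w||_1).
import numpy as np
from sklearn.linear_model import Lasso


def lasso_hypergradient(X_tr, y_tr, X_val, y_val, lam):
    n = X_tr.shape[0]
    beta = Lasso(alpha=lam, fit_intercept=False).fit(X_tr, y_tr).coef_
    S = np.flatnonzero(beta)              # active set of the solution
    if S.size == 0:
        return 0.0
    Xs = X_tr[:, S]
    sign_s = np.sign(beta[S])
    # KKT on the support: (1/n) Xs^T Xs * d(beta_S)/d(lam) + sign_s = 0,
    # assuming Xs^T Xs is invertible and the active set is locally stable.
    dbeta_S = -n * np.linalg.solve(Xs.T @ Xs, sign_s)
    # Chain rule through the validation loss (1/(2m))||y_val - X_val beta||^2.
    residual = y_val - X_val[:, S] @ beta[S]
    return float(-(residual @ X_val[:, S] @ dbeta_S) / X_val.shape[0])
```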