Memory-Efficient Factorization Machines via Binarizing both Data and
Model Coefficients
- URL: http://arxiv.org/abs/2108.07421v1
- Date: Tue, 17 Aug 2021 03:30:52 GMT
- Title: Memory-Efficient Factorization Machines via Binarizing both Data and
Model Coefficients
- Authors: Yu Geng and Liang Lan
- Abstract summary: Subspace Encoding Factorization Machine (SEFM) has been proposed to overcome the expressiveness limitation of Factorization Machines (FM).
We propose a new method called Binarized FM, which constrains the model parameters to be binary values.
Our proposed method achieves accuracy comparable to SEFM at a much lower memory cost.
- Score: 9.692334398809457
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Factorization Machines (FM), a general predictor that can efficiently model
feature interactions in linear time, were primarily proposed for collaborative
recommendation and have been broadly used for regression, classification and
ranking tasks. Subspace Encoding Factorization Machine (SEFM) has been proposed
recently to overcome the expressiveness limitation of Factorization Machines
(FM) by applying explicit nonlinear feature mapping for both individual
features and feature interactions through one-hot encoding to each input
feature. Despite the effectiveness of SEFM, it increases the memory cost of FM
by $b$ times, where $b$ is the number of bins when applying one-hot encoding on
each input feature. To reduce the memory cost of SEFM, we propose a new method
called Binarized FM, which constrains the model parameters to be binary values
(i.e., 1 or $-1$). Each parameter value can then be stored efficiently in one
bit. Our proposed method can significantly reduce the memory cost of the SEFM
model. In addition, we propose a new algorithm to effectively and efficiently
learn the proposed FM with binary constraints using the Straight Through Estimator
(STE) with Adaptive Gradient Descent (Adagrad). Finally, we evaluate the
performance of our proposed method on eight different classification datasets.
Our experimental results have demonstrated that our proposed method achieves
comparable accuracy with SEFM but with much less memory cost.
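
The training idea described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes a logistic loss, full-batch updates, and real-valued latent weights clipped to [-1, 1] (a common STE convention). The function names (`binarize`, `fm_predict`, `train_binarized_fm`, `pack_signs`, `unpack_signs`) and all hyperparameters are hypothetical.

```python
import numpy as np

def binarize(a):
    """Map real values to {-1, +1}; zero maps to +1."""
    return np.where(a >= 0, 1.0, -1.0)

def fm_predict(X, w0, w, V):
    """Second-order FM score using the linear-time pairwise identity."""
    XV = X @ V
    pairwise = 0.5 * np.sum(XV ** 2 - (X ** 2) @ (V ** 2), axis=1)
    return w0 + X @ w + pairwise

def adagrad_step(param, grad, accum, lr, eps=1e-8):
    """In-place Adagrad update on a NumPy array."""
    accum += grad ** 2
    param -= lr * grad / (np.sqrt(accum) + eps)

def train_binarized_fm(X, y, k=4, lr=0.1, epochs=50, seed=0):
    """Assumed setup: labels y in {-1, +1}, logistic loss, full batch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w0 = np.zeros(1)                                 # bias stays real-valued
    w_real = rng.normal(scale=0.1, size=d)           # latent real weights
    V_real = rng.normal(scale=0.1, size=(d, k))
    acc0, acc_w, acc_V = np.zeros(1), np.zeros(d), np.zeros((d, k))
    X2 = X ** 2
    for _ in range(epochs):
        # Forward pass uses the binarized parameters.
        wb, Vb = binarize(w_real), binarize(V_real)
        score = fm_predict(X, w0, wb, Vb)
        # Derivative of mean log(1 + exp(-y * score)) w.r.t. score.
        g = -y / (1.0 + np.exp(np.clip(y * score, -30, 30))) / n
        grad_w = X.T @ g
        grad_V = X.T @ (g[:, None] * (X @ Vb)) - (X2.T @ g)[:, None] * Vb
        # STE: gradients w.r.t. binary values are applied to the latent reals.
        adagrad_step(w0, np.array([g.sum()]), acc0, lr)
        adagrad_step(w_real, grad_w, acc_w, lr)
        adagrad_step(V_real, grad_V, acc_V, lr)
        np.clip(w_real, -1.0, 1.0, out=w_real)
        np.clip(V_real, -1.0, 1.0, out=V_real)
    return float(w0[0]), binarize(w_real), binarize(V_real)

def pack_signs(B):
    """Store a {-1, +1} array at one bit per entry."""
    return np.packbits(B.ravel() > 0)

def unpack_signs(packed, shape):
    """Recover the {-1, +1} array from its packed bits."""
    bits = np.unpackbits(packed, count=int(np.prod(shape)))
    return (bits.astype(np.float64) * 2.0 - 1.0).reshape(shape)
```

Under these assumptions, a factor matrix with $d$ rows and $k$ columns packs into $\lceil dk/8 \rceil$ bytes, compared with $4dk$ bytes for 32-bit floats.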
Related papers
- Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
- Tuning Pre-trained Model via Moment Probing [62.445281364055795]
We propose a novel Moment Probing (MP) method to explore the potential of linear probing (LP).
MP performs a linear classification head based on the mean of final features.
Our MP significantly outperforms LP and is competitive with counterparts at less training cost.
arXiv Detail & Related papers (2023-07-21T04:15:02Z)
- Fine-Tuning Language Models with Just Forward Passes [92.04219196752007]
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but as LMs grow in size, backpropagation requires a large amount of memory.
We propose a memory-efficient zeroth-order optimizer (MeZO) that operates in place, thereby fine-tuning LMs with the same memory footprint as inference.
arXiv Detail & Related papers (2023-05-27T02:28:10Z)
- Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes [80.89852729380425]
We propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde O(d\sqrt{H^3K})$.
Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
arXiv Detail & Related papers (2022-12-12T18:58:59Z)
- Boosting Factorization Machines via Saliency-Guided Mixup [125.15872106335692]
We present MixFM, inspired by Mixup, to generate auxiliary training data to boost Factorization Machines (FMs).
We also put forward a novel Factorization Machine powered by Saliency-guided Mixup (denoted as SMFM).
arXiv Detail & Related papers (2022-06-17T09:49:00Z)
- On Computing the Hyperparameter of Extreme Learning Machines: Algorithm and Application to Computational PDEs, and Comparison with Classical and High-Order Finite Elements [0.0]
We consider the use of extreme learning machines (ELM) for computational partial differential equations (PDE).
In ELM the hidden-layer coefficients in the neural network are assigned to random values generated on $[-R_m,R_m]$ and fixed.
We present a method for computing the optimal value of $R_m$ based on the differential evolution algorithm.
arXiv Detail & Related papers (2021-10-27T02:05:26Z)
- Joint Majorization-Minimization for Nonnegative Matrix Factorization with the $\beta$-divergence [4.468952886990851]
This article proposes new multiplicative updates for nonnegative matrix factorization (NMF) with the $\beta$-divergence objective function.
We report experimental results using diverse datasets: face images, an audio spectrogram, hyperspectral data and song play counts.
arXiv Detail & Related papers (2021-06-29T09:58:21Z)
- Factorization Machines with Regularization for Sparse Feature Interactions [13.593781209611112]
Factorization machines (FMs) are machine learning predictive models based on second-order feature interactions.
We present a new regularization scheme for feature interaction selection in FMs.
For feature interaction selection, our proposed regularizer makes the feature interaction matrix sparse without a restriction on sparsity patterns imposed by the existing methods.
arXiv Detail & Related papers (2020-10-19T05:00:40Z)
- Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Model Coefficients [18.52747917850984]
Kernel approximation is widely used to scale up kernel SVM training and prediction.
Memory and computation costs of kernel approximation models are still too high if we want to deploy them on memory-limited devices.
We propose a novel memory and computation-efficient kernel SVM model by using both binary embedding and binary model coefficients.
arXiv Detail & Related papers (2020-10-06T09:41:54Z)
- Efficient Learning of Generative Models via Finite-Difference Score Matching [111.55998083406134]
We present a generic strategy to efficiently approximate any-order directional derivative with finite difference.
Our approximation only involves function evaluations, which can be executed in parallel, and no gradient computations.
arXiv Detail & Related papers (2020-07-07T10:05:01Z)
- DS-FACTO: Doubly Separable Factorization Machines [4.281959480566438]
Factorization Machines (FM) are a powerful class of models that incorporate higher-order interactions among features to add more expressive power to linear models.
Despite using a low-rank representation for the pairwise features, the memory overheads of using factorization machines on large-scale real-world datasets can be prohibitively high.
Traditional algorithms for FM, which work on a single machine, are not equipped to handle this scale; therefore, using a distributed algorithm to parallelize computation across a cluster is inevitable.
arXiv Detail & Related papers (2020-04-29T03:36:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.