Basis Function Encoding of Numerical Features in Factorization Machines for Improved Accuracy
- URL: http://arxiv.org/abs/2305.14528v1
- Date: Tue, 23 May 2023 21:10:17 GMT
- Title: Basis Function Encoding of Numerical Features in Factorization Machines for Improved Accuracy
- Authors: Alex Shtoff and Elie Abboud and Rotem Stram and Oren Somekh
- Abstract summary: We provide a systematic and theoretically-justified way to incorporate numerical features into FM variants.
We show that our technique yields a model that learns segmentized functions of the numerical feature spanned by the set of functions of one's choice.
Our technique preserves fast training and inference, and requires only a small modification of the computational graph of an FM model.
- Score: 2.3022070933226217
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Factorization machine (FM) variants are widely used for large scale real-time
content recommendation systems, since they offer an excellent balance between
model accuracy and low computational costs for training and inference. These
systems are trained on tabular data with both numerical and categorical
columns. Incorporating numerical columns poses a challenge; they are typically
handled using a scalar transformation or binning, which can be either learned
or chosen a priori. In this work, we provide a systematic and
theoretically-justified way to incorporate numerical features into FM variants
by encoding them into a vector of function values for a set of functions of
one's choice.
We view factorization machines as approximators of segmentized functions,
namely, functions from a field's value to the real numbers, assuming the
remaining fields are assigned some given constants, which we refer to as the
segment. From this perspective, we show that our technique yields a model that
learns segmentized functions of the numerical feature spanned by the set of
functions of one's choice; that is, the spanning coefficients vary between
segments. Hence, to improve model accuracy, we advocate using functions known
to have strong approximation power, and propose the B-spline basis due to its
well-known approximation properties, availability in software libraries, and
efficiency. Our technique preserves fast training and inference, and requires
only a small modification of the computational graph of an FM model. Therefore,
it is easy to incorporate into an existing system to improve its performance.
Finally, we back our claims with a set of experiments, including synthetic
experiments, performance evaluations on several datasets, and an A/B test on a
real online advertising system, which shows improved performance.
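To make the encoding concrete, below is a minimal Python sketch of the general idea, under illustrative assumptions (a feature normalized to [0, 1], a clamped uniform knot vector, cubic splines, and a random rather than learned embedding table); it is not the paper's exact configuration. A numerical feature value is replaced by its vector of B-spline basis function values, and the numerical field's latent vector in the FM becomes the basis-weighted combination of per-basis embedding vectors.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_encode(x, knots, degree=3):
    """Return B with B[j, i] = value of the i-th B-spline basis function
    at x[j]. Inside the knot range the rows sum to 1 (partition of unity)."""
    n_basis = len(knots) - degree - 1
    basis = np.empty((len(x), n_basis))
    for i in range(n_basis):
        coeffs = np.zeros(n_basis)
        coeffs[i] = 1.0  # isolate the i-th basis function
        basis[:, i] = BSpline(knots, coeffs, degree, extrapolate=False)(x)
    return np.nan_to_num(basis)  # zero outside the knot range

# Illustrative setup: a feature normalized to [0, 1], cubic splines,
# clamped uniform knot vector.
degree = 3
inner = np.linspace(0.0, 1.0, 8)
knots = np.r_[[0.0] * degree, inner, [1.0] * degree]
x = np.array([0.05, 0.50, 0.93])
B = bspline_encode(x, knots, degree)      # shape (3, 10); rows are sparse

# In an FM, the numerical field's latent vector becomes the basis-weighted
# sum of per-basis embeddings -- the small computational-graph change the
# abstract mentions (the table is random here; in practice it is learned).
emb_dim = 4
E = np.random.default_rng(0).normal(size=(B.shape[1], emb_dim))
field_latent = B @ E                      # one latent vector per example
```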
Related papers
- Fast and interpretable Support Vector Classification based on the truncated ANOVA decomposition [0.0]
Support Vector Machines (SVMs) are an important tool for performing classification on scattered data.
We propose solving SVMs in primal form using feature maps based on trigonometric functions or wavelets (a generic sketch of the trigonometric feature-map idea appears after this list).
arXiv Detail & Related papers (2024-02-04T10:27:42Z)
- Tuning Pre-trained Model via Moment Probing [62.445281364055795]
We propose a novel Moment Probing (MP) method to explore the potential of linear probing (LP).
MP applies a linear classification head to the mean of the final features.
Our MP significantly outperforms LP and is competitive with its counterparts at a lower training cost (see the mean-pooling probe sketch after this list).
arXiv Detail & Related papers (2023-07-21T04:15:02Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- FAStEN: An Efficient Adaptive Method for Feature Selection and Estimation in High-Dimensional Functional Regressions [7.674715791336311]
We propose a new, flexible, and ultra-efficient approach to feature selection in a sparse function-on-function regression problem.
We show how to extend it to the scalar-on-function framework.
We present an application to brain fMRI data from the AOMIC PIOP1 study.
arXiv Detail & Related papers (2023-03-26T19:41:17Z)
- FRANS: Automatic Feature Extraction for Time Series Forecasting [2.3226893628361682]
We develop FRANS, an autonomous Feature Retrieving Autoregressive Network for Static features that does not require domain knowledge.
Our results show that our features lead to accuracy improvements in most situations.
arXiv Detail & Related papers (2022-09-15T03:14:59Z)
- Learning Operators with Coupled Attention [9.715465024071333]
We propose a novel operator learning method, LOCA, motivated by the recent success of the attention mechanism.
In our architecture the input functions are mapped to a finite set of features which are then averaged with attention weights that depend on the output query locations.
By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions.
arXiv Detail & Related papers (2022-01-04T08:22:03Z)
- Feature Weighted Non-negative Matrix Factorization [92.45013716097753]
We propose the Feature weighted Non-negative Matrix Factorization (FNMF) in this paper.
FNMF learns the weights of features adaptively according to their importance.
It can be solved efficiently with the proposed optimization algorithm.
arXiv Detail & Related papers (2021-03-24T21:17:17Z)
- Towards Explainable Exploratory Landscape Analysis: Extreme Feature Selection for Classifying BBOB Functions [4.932130498861987]
We show that a surprisingly small number of features -- often fewer than four -- can suffice to achieve 98% accuracy.
We show that the classification accuracy transfers to settings in which several instances are involved in training and testing.
arXiv Detail & Related papers (2021-02-01T10:04:28Z)
- Learning Set Functions that are Sparse in Non-Orthogonal Fourier Bases [73.53227696624306]
We present a new family of algorithms for learning Fourier-sparse set functions.
In contrast to other work that focused on the Walsh-Hadamard transform, our novel algorithms operate with recently introduced non-orthogonal Fourier transforms.
We demonstrate effectiveness on several real-world applications.
arXiv Detail & Related papers (2020-10-01T14:31:59Z)
- Efficient Learning of Generative Models via Finite-Difference Score Matching [111.55998083406134]
We present a generic strategy to efficiently approximate any-order directional derivative with finite difference.
Our approximation involves only function evaluations, which can be executed in parallel, and no gradient computations (see the finite-difference sketch after this list).
arXiv Detail & Related papers (2020-07-07T10:05:01Z)
- From Sets to Multisets: Provable Variational Inference for Probabilistic Integer Submodular Models [82.95892656532696]
Submodular functions have been studied extensively in machine learning and data mining.
In this work, we propose a continuous DR-submodular extension for integer submodular functions.
We formulate a new probabilistic model which is defined through integer submodular functions.
arXiv Detail & Related papers (2020-06-01T22:20:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
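For "Fast and interpretable Support Vector Classification based on the truncated ANOVA decomposition", here is a generic sketch of solving an SVM in the primal on an explicit trigonometric feature map. The frequency grid, toy data, and use of scikit-learn's LinearSVC are assumptions for illustration; the paper's actual construction is based on the truncated ANOVA decomposition.

```python
import numpy as np
from sklearn.svm import LinearSVC

def trig_features(X, n_freq=3):
    """Map each coordinate x to cos(2*pi*k*x) and sin(2*pi*k*x), k = 1..n_freq."""
    feats = [np.cos(2 * np.pi * k * X) for k in range(1, n_freq + 1)]
    feats += [np.sin(2 * np.pi * k * X) for k in range(1, n_freq + 1)]
    return np.hstack(feats)

# Toy data on [0, 1]^2 whose label is linear in the trigonometric features.
rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = (np.sin(2 * np.pi * X[:, 0]) + 0.5 * np.cos(2 * np.pi * X[:, 1]) > 0).astype(int)

# A linear SVM solved in primal form on the explicit (interpretable) feature map.
clf = LinearSVC(C=1.0, max_iter=10_000).fit(trig_features(X), y)
print(clf.score(trig_features(X), y))  # near 1.0 on this toy set
```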
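For "Tuning Pre-trained Model via Moment Probing", a minimal sketch of the linear-head-on-mean-features idea stated in the summary, assuming a frozen backbone that outputs per-token features; the shapes and the random head are illustrative, not the paper's implementation.

```python
import numpy as np

def mean_pooled_probe(token_feats, W, b):
    """Apply a linear classification head to the mean of the final features.

    token_feats: (n_tokens, d) features from a frozen backbone.
    W, b: linear head mapping d-dim features to class logits.
    """
    z = token_feats.mean(axis=0)   # first moment of the final features
    return W @ z + b               # class logits

rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 64))               # 16 tokens, 64-dim features
W, b = rng.normal(size=(10, 64)), np.zeros(10)  # 10-class linear head
logits = mean_pooled_probe(feats, W, b)
```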
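For "Efficient Learning of Generative Models via Finite-Difference Score Matching", a generic illustration of the first-order case: a directional derivative approximated with two parallelizable function evaluations and no gradient computation. The test function and step size are illustrative; the paper generalizes this idea to any-order directional derivatives.

```python
import numpy as np

def directional_derivative(f, x, v, eps=1e-5):
    """Central finite-difference estimate of v . grad f(x).

    Uses two function evaluations (independently computable, so they can
    run in parallel) and no gradient computations.
    """
    v = v / np.linalg.norm(v)
    return (f(x + eps * v) - f(x - eps * v)) / (2 * eps)

# Example: f(x) = ||x||^2 has gradient 2x, so v . grad f(x) = 2 v . x.
f = lambda x: float(np.dot(x, x))
x = np.array([1.0, 2.0])
v = np.array([0.0, 1.0])
print(directional_derivative(f, x, v))  # approximately 4.0
```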