Function Basis Encoding of Numerical Features in Factorization Machines
- URL: http://arxiv.org/abs/2305.14528v3
- Date: Thu, 02 Jan 2025 08:49:12 GMT
- Title: Function Basis Encoding of Numerical Features in Factorization Machines
- Authors: Alex Shtoff, Elie Abboud, Rotem Stram, Oren Somekh,
- Abstract summary: We provide a systematic and theoretically-justified way to incorporate numerical features into FM variants.
We show that our technique yields a model that learns segmentized functions of the numerical feature spanned by the set of functions of one's choice.
Our technique preserves fast training and inference, and requires only a small modification of the computational graph of an FM model.
- Score: 1.1053162109905017
- License:
- Abstract: Factorization machine (FM) variants are widely used for large scale real-time content recommendation systems, since they offer an excellent balance between model accuracy and low computational costs for training and inference. These systems are trained on tabular data with both numerical and categorical columns. Incorporating numerical columns poses a challenge, and they are typically incorporated using a scalar transformation or binning, which can be either learned or chosen a-priori. In this work, we provide a systematic and theoretically-justified way to incorporate numerical features into FM variants by encoding them into a vector of function values for a set of functions of one's choice. We view factorization machines as approximators of segmentized functions, namely, functions from a field's value to the real numbers, assuming the remaining fields are assigned some given constants, which we refer to as the segment. From this perspective, we show that our technique yields a model that learns segmentized functions of the numerical feature spanned by the set of functions of one's choice, namely, the spanning coefficients vary between segments. Hence, to improve model accuracy we advocate the use of functions known to have strong approximation power, and offer the B-Spline basis due to its well-known approximation power, availability in software libraries, and efficiency. Our technique preserves fast training and inference, and requires only a small modification of the computational graph of an FM model. Therefore, it is easy to incorporate into an existing system to improve its performance. Finally, we back our claims with a set of experiments, including synthetic, performance evaluation on several data-sets, and an A/B test on a real online advertising system which shows improved performance.
Related papers
- Fast and interpretable Support Vector Classification based on the truncated ANOVA decomposition [0.0]
Support Vector Machines (SVMs) are an important tool for performing classification on scattered data.
We propose solving SVMs in primal form using feature maps based on trigonometric functions or wavelets.
arXiv Detail & Related papers (2024-02-04T10:27:42Z) - Tuning Pre-trained Model via Moment Probing [62.445281364055795]
We propose a novel Moment Probing (MP) method to explore the potential of LP.
MP performs a linear classification head based on the mean of final features.
Our MP significantly outperforms LP and is competitive with counterparts at less training cost.
arXiv Detail & Related papers (2023-07-21T04:15:02Z) - FAStEN: An Efficient Adaptive Method for Feature Selection and Estimation in High-Dimensional Functional Regressions [7.674715791336311]
We propose a new, flexible and ultra-efficient approach to perform feature selection in a sparse function-on-function regression problem.
We show how to extend it to the scalar-on-function framework.
We present an application to brain fMRI data from the AOMIC PIOP1 study.
arXiv Detail & Related papers (2023-03-26T19:41:17Z) - A Functional approach for Two Way Dimension Reduction in Time Series [13.767812547998735]
We propose a non-linear function-on-function approach, which consists of a functional encoder and a functional decoder.
Our approach gives a low dimension latent representation by reducing the number of functional features as well as the timepoints at which the functions are observed.
arXiv Detail & Related papers (2023-01-01T06:09:15Z) - FRANS: Automatic Feature Extraction for Time Series Forecasting [2.3226893628361682]
We develop an autonomous Feature Retrieving Autoregressive Network for Static features that does not require domain knowledge.
Our results show that our features lead to improvement in accuracy in most situations.
arXiv Detail & Related papers (2022-09-15T03:14:59Z) - Learning Operators with Coupled Attention [9.715465024071333]
We propose a novel operator learning method, LOCA, motivated from the recent success of the attention mechanism.
In our architecture the input functions are mapped to a finite set of features which are then averaged with attention weights that depend on the output query locations.
By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions.
arXiv Detail & Related papers (2022-01-04T08:22:03Z) - Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z) - Feature Weighted Non-negative Matrix Factorization [92.45013716097753]
We propose the Feature weighted Non-negative Matrix Factorization (FNMF) in this paper.
FNMF learns the weights of features adaptively according to their importances.
It can be solved efficiently with the suggested optimization algorithm.
arXiv Detail & Related papers (2021-03-24T21:17:17Z) - Learning Set Functions that are Sparse in Non-Orthogonal Fourier Bases [73.53227696624306]
We present a new family of algorithms for learning Fourier-sparse set functions.
In contrast to other work that focused on the Walsh-Hadamard transform, our novel algorithms operate with recently introduced non-orthogonal Fourier transforms.
We demonstrate effectiveness on several real-world applications.
arXiv Detail & Related papers (2020-10-01T14:31:59Z) - Efficient Learning of Generative Models via Finite-Difference Score
Matching [111.55998083406134]
We present a generic strategy to efficiently approximate any-order directional derivative with finite difference.
Our approximation only involves function evaluations, which can be executed in parallel, and no gradient computations.
arXiv Detail & Related papers (2020-07-07T10:05:01Z) - From Sets to Multisets: Provable Variational Inference for Probabilistic
Integer Submodular Models [82.95892656532696]
Submodular functions have been studied extensively in machine learning and data mining.
In this work, we propose a continuous DR-submodular extension for integer submodular functions.
We formulate a new probabilistic model which is defined through integer submodular functions.
arXiv Detail & Related papers (2020-06-01T22:20:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.