Additive Higher-Order Factorization Machines
- URL: http://arxiv.org/abs/2205.14515v1
- Date: Sat, 28 May 2022 19:50:52 GMT
- Title: Additive Higher-Order Factorization Machines
- Authors: David Rügamer
- Abstract summary: We derive a scalable high-order tensor product spline model using a factorization approach.
Our method allows the inclusion of all (higher-order) interactions of non-linear feature effects.
We prove both theoretically and empirically that our method scales notably better than existing approaches.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the age of big data and interpretable machine learning, approaches need to
work at scale and at the same time allow for a clear mathematical understanding
of the method's inner workings. While there exist inherently interpretable
semi-parametric regression techniques for large-scale applications to account
for non-linearity in the data, their model complexity is still often
restricted. One of the main limitations is the absence of interactions in these
models, which are excluded not only for the sake of better interpretability but
also because of untenable computational costs. To address this shortcoming, we
derive a scalable high-order tensor product spline model using a factorization
approach. Our method allows the inclusion of all (higher-order) interactions of
non-linear feature effects while having computational costs proportional to a
model without interactions. We prove both theoretically and empirically that
our method scales notably better than existing approaches, derive meaningful
penalization schemes and also discuss further theoretical aspects. We finally
investigate predictive and estimation performance both with synthetic and real
data.
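The scalability claim rests on the classical factorization-machine identity: a full set of pairwise interactions costs O(d^2) per example, but with rank-k factor vectors it can be evaluated in O(kd). A minimal sketch of that second-order trick (the names `fm_pairwise`, `w0`, `w`, `V` are illustrative, not from the paper, and the paper itself generalizes this to higher-order tensor product splines):

```python
import numpy as np

def fm_pairwise(x, w0, w, V):
    """Second-order factorization-machine score for one example.

    The pairwise term sum_{i<j} <V[i], V[j]> x_i x_j is computed in
    O(k*d) via the identity
        sum_{i<j} <v_i, v_j> x_i x_j
          = 0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ],
    instead of the naive O(d^2) double loop.
    """
    linear = w0 + x @ w          # bias + main effects
    s = V.T @ x                  # (k,): per-factor weighted sums
    s2 = (V ** 2).T @ (x ** 2)   # (k,): per-factor squared sums
    return linear + 0.5 * np.sum(s ** 2 - s2)
```

The same O(kd) evaluation is what makes "computational costs proportional to a model without interactions" plausible: the cost grows linearly, not combinatorially, in the number of features.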
Related papers
- Scalable Higher-Order Tensor Product Spline Models [0.0]
We propose a new approach using a factorization method to derive a highly scalable higher-order tensor product spline model.
Our method allows for the incorporation of all (higher-order) interactions of non-linear feature effects while having computational costs proportional to a model without interactions.
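To see why factorization matters for tensor product splines specifically: the full tensor product of D univariate bases with m functions each needs m^D coefficients, while a rank-r factorization stores roughly D*m*r parameters. A small illustration (the counts are generic bookkeeping, not figures from the paper):

```python
import numpy as np

def tensor_product_row(b1, b2):
    """Full tensor-product basis row for one observation.

    Given two univariate basis evaluations of lengths m1 and m2,
    the joint basis has m1 * m2 columns - multiplicative growth.
    """
    return np.outer(b1, b2).ravel()

# Illustrative sizes: m basis functions per feature, a D-way interaction,
# and a rank-r factorization of the coefficient tensor.
m, D, r = 10, 3, 5
full_coefs = m ** D        # unfactorized coefficient tensor: 10^3 = 1000
factorized_params = D * m * r   # factorized representation: 3*10*5 = 150
```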
arXiv Detail & Related papers (2024-02-02T01:18:48Z) - Accelerating Generalized Linear Models by Trading off Computation for Uncertainty [29.877181350448193]
Generalized Linear Models (GLMs) define a flexible probabilistic framework to model categorical, ordinal and continuous data.
The resulting approximation error adversely impacts the reliability of the model and is not accounted for in the uncertainty of the prediction.
We introduce a family of iterative methods that explicitly model this error.
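For context on where that approximation error enters: GLMs are classically fit by iteratively reweighted least squares (IRLS), where each step solves a weighted linear system, and it is such inner solves that approximate iterative methods truncate. A minimal IRLS sketch for logistic regression (this is the textbook algorithm, not the probabilistic-numerics method of the paper above):

```python
import numpy as np

def irls_logistic(X, y, n_iter=25):
    """Classical IRLS for logistic regression, a canonical GLM.

    Each iteration forms a working response and solves a weighted
    least-squares problem; truncating or approximating this solve is
    the source of the error the paper models explicitly.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))          # mean function
        W = np.maximum(mu * (1.0 - mu), 1e-12)   # IRLS weights
        z = eta + (y - mu) / W                   # working response
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta
```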
arXiv Detail & Related papers (2023-10-31T08:58:16Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST)
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - Variational Hierarchical Mixtures for Probabilistic Learning of Inverse Dynamics [20.953728061894044]
Well-calibrated probabilistic regression models are a crucial learning component in robotics applications as datasets grow rapidly and tasks become more complex.
We consider a probabilistic hierarchical modeling paradigm that combines the benefits of both worlds to deliver computationally efficient representations with inherent complexity regularization.
We derive two efficient variational inference techniques to learn these representations and highlight the advantages of hierarchical infinite local regression models.
arXiv Detail & Related papers (2022-11-02T13:54:07Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
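The column-wise iterative idea can be sketched in a few lines: initialize missing entries, then repeatedly re-predict each incomplete column from the others. This MICE-style skeleton with plain least squares is a simplification for illustration only; HyperImpute's contribution is choosing the per-column learner automatically, which is not shown here:

```python
import numpy as np

def iterative_impute(X, n_rounds=5):
    """Minimal column-wise iterative imputation (MICE-style sketch).

    NaN entries are initialized with column means, then each incomplete
    column is repeatedly re-predicted by linear regression on all
    other columns, using only the rows where it was observed.
    """
    X = X.astype(float).copy()
    mask = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):
        X[mask[:, j], j] = col_means[j]          # crude initialization
    for _ in range(n_rounds):
        for j in range(X.shape[1]):
            if not mask[:, j].any():
                continue
            A = np.c_[np.ones(len(X)), np.delete(X, j, axis=1)]
            obs = ~mask[:, j]
            coef, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            X[mask[:, j], j] = A[mask[:, j]] @ coef   # refresh imputations
    return X
```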
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - A Hypergradient Approach to Robust Regression without Correspondence [85.49775273716503]
We consider a variant of the regression problem in which the correspondence between input and output data is not available.
Most existing methods are only applicable when the sample size is small.
We propose a new computational framework -- ROBOT -- for the shuffled regression problem.
arXiv Detail & Related papers (2020-11-30T21:47:38Z) - Non-parametric Models for Non-negative Functions [48.7576911714538]
We provide the first model for non-negative functions that enjoys the same good properties as linear models.
We prove that it admits a representer theorem and provide an efficient dual formulation for convex problems.
arXiv Detail & Related papers (2020-07-08T07:17:28Z) - Causal Inference with Deep Causal Graphs [0.0]
Parametric causal modelling techniques rarely provide functionality for counterfactual estimation.
Deep Causal Graphs is an abstract specification of the required functionality for a neural network to model causal distributions.
We demonstrate its expressive power in modelling complex interactions and showcase applications to machine learning explainability and fairness.
arXiv Detail & Related papers (2020-06-15T13:03:33Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but its role in this success is still unclear.
We show that heavy tails commonly arise in the parameters as multiplicative noise due to gradient variance.
A detailed analysis shows that key factors, including step size and data, all exhibit similar effects on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z) - A Semiparametric Approach to Interpretable Machine Learning [9.87381939016363]
Black box models in machine learning have demonstrated excellent predictive performance in complex problems and high-dimensional settings.
Their lack of transparency and interpretability restrict the applicability of such models in critical decision-making processes.
We propose a novel approach to trading off interpretability and performance in prediction models using ideas from semiparametric statistics.
arXiv Detail & Related papers (2020-06-08T16:38:15Z) - Machine learning for causal inference: on the use of cross-fit
estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
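The structure of a doubly-robust cross-fit estimator can be sketched compactly: nuisance models (outcome regressions and propensity score) are fit on out-of-fold data and evaluated on the held-out fold, and the AIPW scores are averaged. The interface below (`fit_outcome`, `fit_propensity` returning prediction functions) is an assumed illustration, not the estimators benchmarked in the paper:

```python
import numpy as np

def cross_fit_ace(X, a, y, fit_outcome, fit_propensity, n_folds=2):
    """Doubly-robust (AIPW) ACE estimate with K-fold cross-fitting.

    `fit_outcome(X, y)` and `fit_propensity(X, a)` are user-supplied
    learners returning a predict(X_new) callable; they are always fit
    on data outside the fold they score, which is the cross-fitting idea.
    """
    n = len(y)
    folds = np.array_split(np.random.default_rng(0).permutation(n), n_folds)
    psi = np.empty(n)
    for held in folds:
        train = np.setdiff1d(np.arange(n), held)
        mu1 = fit_outcome(X[train][a[train] == 1], y[train][a[train] == 1])
        mu0 = fit_outcome(X[train][a[train] == 0], y[train][a[train] == 0])
        e = fit_propensity(X[train], a[train])
        Xh, ah, yh = X[held], a[held], y[held]
        m1, m0 = mu1(Xh), mu0(Xh)
        eh = np.clip(e(Xh), 0.01, 0.99)          # guard against extreme weights
        psi[held] = (m1 - m0
                     + ah * (yh - m1) / eh
                     - (1 - ah) * (yh - m0) / (1 - eh))
    return psi.mean()
```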
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.