A tree-based varying coefficient model
- URL: http://arxiv.org/abs/2401.05982v3
- Date: Mon, 15 Jan 2024 15:15:34 GMT
- Title: A tree-based varying coefficient model
- Authors: Henning Zakrisson and Mathias Lindholm
- Abstract summary: The paper introduces a tree-based varying coefficient model (VCM) where the varying coefficients are modelled using the cyclic gradient boosting machine (CGBM).
The dimension-wise early stopping not only reduces the risk of dimension-specific overfitting, but also reveals differences in model complexity across dimensions.
The model is evaluated on the same simulated and real data examples as those used in Richman and Wüthrich (2023), and it achieves out-of-sample losses comparable to those of their neural-network-based VCM, LocalGLMnet.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The paper introduces a tree-based varying coefficient model (VCM) where the
varying coefficients are modelled using the cyclic gradient boosting machine
(CGBM) from Delong et al. (2023). Modelling the coefficient functions using a
CGBM allows for dimension-wise early stopping and feature importance scores.
The dimension-wise early stopping not only reduces the risk of
dimension-specific overfitting, but also reveals differences in model
complexity across dimensions. The use of feature importance scores allows for
simple feature selection and easy model interpretation. The model is evaluated
on the same simulated and real data examples as those used in Richman and
Wüthrich (2023), and it achieves out-of-sample losses comparable to those of
their neural-network-based VCM, LocalGLMnet.
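To make the model structure concrete, here is a minimal, self-contained sketch of a tree-based varying coefficient model fitted with a cyclic boosting loop. It is not the authors' CGBM implementation: it assumes a squared-error loss, a toy two-dimensional dataset, and no intercept term, and it omits dimension-wise early stopping and feature importance scores; it only illustrates the idea of boosting one coefficient function per feature dimension in turn.

```python
# Hedged sketch of a tree-based varying coefficient model (VCM):
#   y_hat(x) = sum_j beta_j(x) * x_j, with each beta_j(x) built up by cyclically
# boosting shallow regression trees on the coefficient-wise gradient.
# Squared-error loss and toy data are assumptions; this is not the paper's CGBM.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Toy data with coefficients that themselves vary with the features.
n, p = 2000, 2
X = rng.uniform(-1, 1, size=(n, p))
beta_true = np.column_stack([1 + X[:, 1] ** 2, np.sin(np.pi * X[:, 0])])
y = (beta_true * X).sum(axis=1) + rng.normal(scale=0.1, size=n)

n_rounds, lr, max_depth = 100, 0.1, 2
trees = [[] for _ in range(p)]       # one sequence of trees per coefficient
beta_hat = np.zeros((n, p))          # current estimates of beta_j(x_i)

for _ in range(n_rounds):
    for j in range(p):               # cycle over coefficient dimensions
        y_hat = (beta_hat * X).sum(axis=1)
        # Negative gradient of 0.5*(y - y_hat)^2 w.r.t. beta_j(x) is (y - y_hat) * x_j.
        neg_grad_j = (y - y_hat) * X[:, j]
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, neg_grad_j)
        trees[j].append(tree)
        beta_hat[:, j] += lr * tree.predict(X)

def predict(X_new):
    """Evaluate the fitted coefficient functions and the VCM prediction."""
    beta = np.zeros((X_new.shape[0], p))
    for j in range(p):
        for tree in trees[j]:
            beta[:, j] += lr * tree.predict(X_new)
    return (beta * X_new).sum(axis=1), beta

y_fit, beta_fit = predict(X)
print("training MSE:", round(float(np.mean((y - y_fit) ** 2)), 4))
```

In the paper, the coefficient functions are instead fitted with the CGBM of Delong et al. (2023), which allows the number of boosting steps to be chosen per dimension on validation data (dimension-wise early stopping) and provides tree-based feature importance scores.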
Related papers
- Linear Mode Connectivity in Differentiable Tree Ensembles [13.704584231053675]
Linear Mode Connectivity (LMC) refers to the phenomenon that the loss remains consistent for linearly interpolated models in the parameter space.
We first achieve LMC for soft tree ensembles, which are tree-based differentiable models extensively used in practice.
arXiv Detail & Related papers (2024-05-23T14:11:26Z) - Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z) - Accurate deep learning sub-grid scale models for large eddy simulations [0.0]
We present two families of sub-grid scale (SGS) turbulence models developed for large-eddy simulation (LES) purposes.
Their development required the formulation of physics-informed, robust, and efficient Deep Learning (DL) algorithms.
Explicit filtering of data from direct simulations of canonical channel flow at two friction Reynolds numbers provided accurate data for training and testing.
arXiv Detail & Related papers (2023-07-19T15:30:06Z) - On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - MoEfication: Conditional Computation of Transformer Models for Efficient
Inference [66.56994436947441]
Transformer-based pre-trained language models can achieve superior performance on most NLP tasks due to their large parameter capacity, but this also leads to huge computation costs.
We explore accelerating large-model inference through conditional computation based on the sparse activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
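As a rough, hedged illustration of that idea (not the paper's implementation), the sketch below splits the hidden neurons of a dense feed-forward layer into expert groups by clustering their input weight vectors, then evaluates only the top-scoring experts for a given input. The clustering heuristic, the naive activation-sum router, and all layer sizes are assumptions made for illustration.

```python
# Hedged sketch of turning a dense feed-forward layer into a mixture of experts
# by grouping hidden neurons; expert construction and routing are illustrative
# choices, not the method from the MoEfication paper.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
d_model, d_ff, n_experts, top_k = 64, 256, 8, 2

# A "pretrained" FFN block: h = relu(x @ W1 + b1); out = h @ W2 + b2
W1 = rng.normal(size=(d_model, d_ff)) / np.sqrt(d_model)
b1 = np.zeros(d_ff)
W2 = rng.normal(size=(d_ff, d_model)) / np.sqrt(d_ff)
b2 = np.zeros(d_model)

# Group hidden neurons into experts by clustering their input weight vectors.
labels = KMeans(n_clusters=n_experts, n_init=10, random_state=0).fit_predict(W1.T)
experts = [np.where(labels == e)[0] for e in range(n_experts)]

def moe_forward(x):
    """Evaluate only the top_k experts; this toy router recomputes all
    pre-activations purely for illustration, which a real router would avoid."""
    pre = x @ W1 + b1
    scores = [np.maximum(pre[idx], 0.0).sum() for idx in experts]
    chosen = np.argsort(scores)[-top_k:]
    out = b2.copy()
    for e in chosen:
        idx = experts[e]
        out += np.maximum(pre[idx], 0.0) @ W2[idx]
    return out

x = rng.normal(size=d_model)
dense_out = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2
print("||dense - MoE||:", np.linalg.norm(dense_out - moe_forward(x)))
```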
arXiv Detail & Related papers (2021-10-05T02:14:38Z) - A Data-driven feature selection and machine-learning model benchmark for
the prediction of longitudinal dispersion coefficient [29.58577229101903]
Accurate prediction of the longitudinal dispersion (LD) coefficient can produce a performance leap in related simulations.
In this study, a globally optimal feature set was proposed through numerical comparison of the distilled local optima in performance across representative ML models.
Results show that the support vector machine has significantly better performance than other models.
arXiv Detail & Related papers (2021-07-16T09:50:38Z) - T-LoHo: A Bayesian Regularization Model for Structured Sparsity and
Smoothness on Graphs [0.0]
In graph-structured data, structured sparsity and smoothness mean that zero and non-zero parameters tend to cluster together.
We propose a new prior for high dimensional parameters with graphical relations.
We use it to detect structured sparsity and smoothness simultaneously.
arXiv Detail & Related papers (2021-07-06T10:10:03Z) - Anomaly Detection of Time Series with Smoothness-Inducing Sequential
Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z) - Shaping Deep Feature Space towards Gaussian Mixture for Visual
Classification [74.48695037007306]
We propose a Gaussian mixture (GM) loss function for deep neural networks for visual classification.
With a classification margin and a likelihood regularization, the GM loss facilitates both high classification performance and accurate modeling of the feature distribution.
The proposed model can be implemented easily and efficiently without using extra trainable parameters.
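A minimal sketch, under my own assumptions about the setup, of what such a loss could look like: class scores are negative squared distances to per-class feature means, the true class gets an extra margin factor, and a likelihood term pulls features toward their class mean. The exact functional form, the hyperparameters, and the source of the class means are assumptions rather than the authors' specification.

```python
# Hedged sketch of a Gaussian-mixture style classification loss; the exact form
# is an assumption, not the authors' definition.
import torch
import torch.nn.functional as F

def gm_loss(features, labels, means, margin=0.1, lam=0.1):
    """features: (batch, d); labels: (batch,); means: (n_classes, d)."""
    dist_sq = torch.cdist(features, means) ** 2             # (batch, n_classes)
    onehot = F.one_hot(labels, num_classes=means.shape[0]).to(dist_sq.dtype)
    # Negative squared distance as the class score; the true-class distance is
    # inflated by a margin factor so the classifier must separate with slack.
    logits = -0.5 * (1.0 + margin * onehot) * dist_sq
    classification = F.cross_entropy(logits, labels)
    # Likelihood regularization: pull each feature toward its own class mean.
    likelihood = 0.5 * (dist_sq * onehot).sum(dim=1).mean()
    return classification + lam * likelihood

# Toy usage with random features and fixed (non-trainable) class means.
torch.manual_seed(0)
feats, labels, means = torch.randn(8, 16), torch.randint(0, 3, (8,)), torch.randn(3, 16)
print(gm_loss(feats, labels, means))
```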
arXiv Detail & Related papers (2020-11-18T03:32:27Z) - Supervised Autoencoders Learn Robust Joint Factor Models of Neural
Activity [2.8402080392117752]
Neuroscience applications collect high-dimensional predictors corresponding to brain activity in different regions, along with behavioral outcomes.
Joint factor models for the predictors and outcomes are natural, but maximum likelihood estimates of these models can struggle in practice when there is model misspecification.
We propose an alternative inference strategy based on supervised autoencoders; rather than placing a probability distribution on the latent factors, we define them as an unknown function of the high-dimensional predictors.
arXiv Detail & Related papers (2020-04-10T19:31:57Z)
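To illustrate the general supervised-autoencoder idea in that last entry, the sketch below treats the latent factors as a deterministic learned function of the predictors and trains them jointly to reconstruct the predictors and to predict the outcome. The network sizes, toy data, and equal loss weighting are assumptions for illustration, not details from the paper.

```python
# Hedged sketch of a supervised autoencoder: latent factors are a deterministic
# learned function of high-dimensional predictors, trained to reconstruct the
# predictors and to predict the outcome jointly. Sizes and data are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, p, k = 500, 50, 5                                  # samples, predictors, factors
Z = torch.randn(n, k)                                 # toy ground-truth factors
X = Z @ torch.randn(k, p) + 0.1 * torch.randn(n, p)   # high-dimensional predictors
y = Z[:, 0] + 0.1 * torch.randn(n)                    # outcome driven by one factor

encoder = nn.Sequential(nn.Linear(p, 32), nn.ReLU(), nn.Linear(32, k))
decoder = nn.Linear(k, p)        # reconstructs the predictors from the factors
outcome_head = nn.Linear(k, 1)   # predicts the behavioural outcome from the factors

params = [*encoder.parameters(), *decoder.parameters(), *outcome_head.parameters()]
opt = torch.optim.Adam(params, lr=1e-2)

for epoch in range(200):
    opt.zero_grad()
    factors = encoder(X)                              # factors as a function of X
    recon_loss = ((decoder(factors) - X) ** 2).mean()
    pred_loss = ((outcome_head(factors).squeeze(-1) - y) ** 2).mean()
    loss = recon_loss + pred_loss                     # equal weighting is a choice
    loss.backward()
    opt.step()

print(f"reconstruction MSE: {recon_loss.item():.3f}, outcome MSE: {pred_loss.item():.3f}")
```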
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.