Detection of Interacting Variables for Generalized Linear Models via
Neural Networks
- URL: http://arxiv.org/abs/2209.08030v2
- Date: Sun, 21 May 2023 12:10:54 GMT
- Title: Detection of Interacting Variables for Generalized Linear Models via
Neural Networks
- Authors: Yevhen Havrylenko and Julia Heger
- Abstract summary: We present an approach to automating the process of finding interactions that should be added to generalized linear models (GLMs).
Our approach relies on neural networks and a model-specific interaction detection method that is computationally faster than traditional methods such as the Friedman H-statistic or SHAP values.
In numerical studies, we report the results of our approach on artificially generated data as well as open-source data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The quality of generalized linear models (GLMs), frequently used by insurance
companies, depends on the choice of interacting variables. The search for
interactions is time-consuming, especially for data sets with a large number of
variables, depends heavily on the expert judgement of actuaries, and often
relies on visual performance indicators. Therefore, we present an approach to
automating the process of finding interactions that should be added to GLMs to
improve their predictive power. Our approach relies on neural networks and a
model-specific interaction detection method that is computationally faster than
traditional methods such as the Friedman H-statistic or SHAP values. In
numerical studies, we report the results of our approach on artificially
generated data as well as open-source data.
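To illustrate the kind of model-specific detector the abstract refers to, here is a minimal sketch in the spirit of weight-based interaction detection (e.g. Neural Interaction Detection): a feature pair can only interact at a hidden unit if both features feed into that unit, so pairs are scored by aggregating first-layer weight magnitudes. This is not the authors' exact algorithm; the function name, toy weights, and scoring rule below are illustrative assumptions.

```python
import numpy as np

def interaction_strengths(W1, w_out):
    """Score every feature pair from a trained MLP's first layer.

    W1    : (hidden, features) first-layer weight matrix
    w_out : (hidden,) aggregated influence of each hidden unit on the output
    Returns {(i, j): strength} -- higher means a stronger candidate
    interaction to add to the GLM.
    """
    A = np.abs(W1)
    z = np.abs(w_out)
    n_features = A.shape[1]
    scores = {}
    for i in range(n_features):
        for j in range(i + 1, n_features):
            # A pair interacts at a hidden unit only if BOTH features
            # reach it; min() captures that, weighted by unit influence.
            scores[(i, j)] = float(np.sum(z * np.minimum(A[:, i], A[:, j])))
    return scores

# Toy weights: hidden units 0 and 1 mix features (0, 1); unit 2 uses only feature 2.
W1 = np.array([[1.0, 1.0, 0.0],
               [0.8, 0.9, 0.0],
               [0.0, 0.0, 1.0]])
w_out = np.array([1.0, 1.0, 1.0])
ranked = sorted(interaction_strengths(W1, w_out).items(), key=lambda kv: -kv[1])
print(ranked[0][0])  # pair (0, 1) ranks first
```

The top-ranked pairs would then be added as interaction terms to the GLM and validated on held-out data.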
Related papers
- Targeted Cause Discovery with Data-Driven Learning [66.86881771339145]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.
We employ a neural network trained to identify causality through supervised learning on simulated data.
Empirical results demonstrate the effectiveness of our method in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem of learning with interdependent data.
We derive a new learning objective through causal inference, which guides the model to learn generalizable patterns of interdependence that are insensitive to domain shift.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff [2.4578723416255754]
We present a benchmark study on four insurance data sets with frequency and severity targets in the presence of multiple types of input features.
We compare in detail the performance of a generalized linear model on binned input data, a gradient-boosted tree model, a feed-forward neural network (FFNN), and the combined actuarial neural network (CANN).
arXiv Detail & Related papers (2023-10-19T12:00:33Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views of data by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
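The classical CCA objective summarized above has a closed-form solution; as a point of reference (plain CCA, not the paper's input-dependent, dynamically-scaled variant), one can whiten each view and read the canonical correlations off a singular value decomposition. The function name and synthetic data below are illustrative assumptions.

```python
import numpy as np

def cca_first_correlation(X, Y, eps=1e-8):
    """First canonical correlation between views X (n, p) and Y (n, q).

    Whiten each view; the singular values of the whitened
    cross-covariance are the canonical correlations.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric covariance matrix.
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, eps))) @ vecs.T

    n = X.shape[0]
    Cxx = Xc.T @ Xc / n
    Cyy = Yc.T @ Yc / n
    Cxy = Xc.T @ Yc / n
    M = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    return float(np.linalg.svd(M, compute_uv=False)[0])

# Two views sharing a latent signal should be highly correlated.
rng = np.random.default_rng(0)
s = rng.normal(size=(500, 1))
X = np.hstack([s + 0.1 * rng.normal(size=(500, 1)), rng.normal(size=(500, 1))])
Y = np.hstack([rng.normal(size=(500, 1)), s + 0.1 * rng.normal(size=(500, 1))])
print(cca_first_correlation(X, Y))  # close to 1 for this shared-signal example
```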
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z)
- Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized linear latent variable models (GLLVMs) extend factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
- Visual Neural Decomposition to Explain Multivariate Data Sets [13.117139248511783]
Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers.
We propose a novel approach to visualize correlations between input variables and a target output variable that scales to hundreds of variables.
arXiv Detail & Related papers (2020-09-11T15:53:37Z)
- Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks [91.65637773358347]
We propose a general graph neural network framework designed specifically for multivariate time series data.
Our approach automatically extracts the uni-directed relations among variables through a graph learning module.
Our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets.
arXiv Detail & Related papers (2020-05-24T04:02:18Z)
- Modeling Rare Interactions in Time Series Data Through Qualitative Change: Application to Outcome Prediction in Intensive Care Units [1.0349800230036503]
We present a model for uncovering the interactions most likely to have generated the outcomes observed in high-dimensional time series data.
Using the assumption that similar templates of small interactions are responsible for the outcomes, we reformulate the discovery task to retrieve the most-likely templates from the data.
arXiv Detail & Related papers (2020-04-03T08:49:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.