Detection of Interacting Variables for Generalized Linear Models via
Neural Networks
- URL: http://arxiv.org/abs/2209.08030v2
- Date: Sun, 21 May 2023 12:10:54 GMT
- Title: Detection of Interacting Variables for Generalized Linear Models via
Neural Networks
- Authors: Yevhen Havrylenko and Julia Heger
- Abstract summary: We present an approach to automating the process of finding interactions that should be added to generalized linear models (GLMs).
Our approach relies on neural networks and a model-specific interaction detection method that is computationally faster than traditional methods such as the Friedman H-statistic or SHAP values.
In numerical studies, we report the results of our approach on artificially generated data as well as open-source data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The quality of generalized linear models (GLMs), frequently used by insurance
companies, depends on the choice of interacting variables. The search for
interactions is time-consuming, especially for data sets with a large number of
variables, depends heavily on the expert judgement of actuaries, and often
relies on visual performance indicators. Therefore, we present an approach to
automating the process of finding interactions that should be added to GLMs to
improve their predictive power. Our approach relies on neural networks and a
model-specific interaction detection method that is computationally faster than
traditional methods such as the Friedman H-statistic or SHAP values. In
numerical studies, we report the results of our approach on artificially
generated data as well as open-source data.
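To illustrate the kind of model-specific detector the abstract refers to, here is a minimal sketch in the spirit of weight-based interaction detection (e.g. Neural Interaction Detection): a feature pair can only interact at a hidden unit if both features feed into that unit, so pairs are scored by aggregating first-layer weight magnitudes. This is not the authors' exact algorithm; the function name, toy weights, and scoring rule below are illustrative assumptions.

```python
import numpy as np

def interaction_strengths(W1, w_out):
    """Score every feature pair from a trained MLP's first layer.

    W1    : (hidden, features) first-layer weight matrix
    w_out : (hidden,) aggregated influence of each hidden unit on the output
    Returns {(i, j): strength} -- higher means a stronger candidate
    interaction to add to the GLM.
    """
    A = np.abs(W1)
    z = np.abs(w_out)
    n_features = A.shape[1]
    scores = {}
    for i in range(n_features):
        for j in range(i + 1, n_features):
            # A pair interacts at a hidden unit only if BOTH features
            # reach it; min() captures that, weighted by unit influence.
            scores[(i, j)] = float(np.sum(z * np.minimum(A[:, i], A[:, j])))
    return scores

# Toy weights: hidden units 0 and 1 mix features (0, 1); unit 2 uses only feature 2.
W1 = np.array([[1.0, 1.0, 0.0],
               [0.8, 0.9, 0.0],
               [0.0, 0.0, 1.0]])
w_out = np.array([1.0, 1.0, 1.0])
ranked = sorted(interaction_strengths(W1, w_out).items(), key=lambda kv: -kv[1])
print(ranked[0][0])  # pair (0, 1) ranks first
```

The top-ranked pairs would then be added as interaction terms to the GLM and validated on held-out data.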
Related papers
- Targeted Cause Discovery with Data-Driven Learning [66.86881771339145]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.
We employ a neural network trained to identify causality through supervised learning on simulated data.
Empirical results demonstrate the effectiveness of our method in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem of learning with interdependent data.
We derive a new learning objective through causal inference, which guides the model to learn generalizable patterns of interdependence that are insensitive to domain shift.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff [2.4578723416255754]
We present a benchmark study on four insurance data sets with frequency and severity targets in the presence of multiple types of input features.
We compare in detail the performance of a generalized linear model on binned input data, a gradient-boosted tree model, a feed-forward neural network (FFNN), and the combined actuarial neural network (CANN).
arXiv Detail & Related papers (2023-10-19T12:00:33Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views of data by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
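The classical CCA objective summarized above has a closed-form solution; as a point of reference (plain CCA, not the paper's input-dependent, dynamically-scaled variant), one can whiten each view and read the canonical correlations off a singular value decomposition. The function name and synthetic data below are illustrative assumptions.

```python
import numpy as np

def cca_first_correlation(X, Y, eps=1e-8):
    """First canonical correlation between views X (n, p) and Y (n, q).

    Whiten each view; the singular values of the whitened
    cross-covariance are the canonical correlations.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric covariance matrix.
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, eps))) @ vecs.T

    n = X.shape[0]
    Cxx = Xc.T @ Xc / n
    Cyy = Yc.T @ Yc / n
    Cxy = Xc.T @ Yc / n
    M = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    return float(np.linalg.svd(M, compute_uv=False)[0])

# Two views sharing a latent signal should be highly correlated.
rng = np.random.default_rng(0)
s = rng.normal(size=(500, 1))
X = np.hstack([s + 0.1 * rng.normal(size=(500, 1)), rng.normal(size=(500, 1))])
Y = np.hstack([rng.normal(size=(500, 1)), s + 0.1 * rng.normal(size=(500, 1))])
print(cca_first_correlation(X, Y))  # close to 1 for this shared-signal example
```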
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z)
- Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized linear latent variable models (GLLVMs) extend factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
- Visual Neural Decomposition to Explain Multivariate Data Sets [13.117139248511783]
Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers.
We propose a novel approach to visualize correlations between input variables and a target output variable that scales to hundreds of variables.
arXiv Detail & Related papers (2020-09-11T15:53:37Z)
- Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks [91.65637773358347]
We propose a general graph neural network framework designed specifically for multivariate time series data.
Our approach automatically extracts the uni-directed relations among variables through a graph learning module.
Our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets.
arXiv Detail & Related papers (2020-05-24T04:02:18Z)
- Modeling Rare Interactions in Time Series Data Through Qualitative Change: Application to Outcome Prediction in Intensive Care Units [1.0349800230036503]
We present a model for uncovering the interactions most likely to have generated the outcomes observed in high-dimensional time series data.
Using the assumption that similar templates of small interactions are responsible for the outcomes, we reformulate the discovery task to retrieve the most-likely templates from the data.
arXiv Detail & Related papers (2020-04-03T08:49:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.