Random Sampling High Dimensional Model Representation Gaussian Process
Regression (RS-HDMR-GPR) for representing multidimensional functions with
machine-learned lower-dimensional terms allowing insight with a general
method
- URL: http://arxiv.org/abs/2012.02704v5
- Date: Tue, 16 Nov 2021 09:47:50 GMT
- Title: Random Sampling High Dimensional Model Representation Gaussian Process
Regression (RS-HDMR-GPR) for representing multidimensional functions with
machine-learned lower-dimensional terms allowing insight with a general
method
- Authors: Owen Ren, Mohamed Ali Boussaidi, Dmitry Voytsekhovsky, Manabu Ihara,
and Sergei Manzhos
- Abstract summary: Python implementation for RS-HDMR-GPR (Random Sampling High Dimensional Model Representation Gaussian Process Regression)
Code allows for imputation of missing values of the variables and for a significant pruning of the useful number of HDMR terms.
The capabilities of this regression tool are demonstrated on test cases involving synthetic analytic functions, the potential energy surface of the water molecule, kinetic energy densities of materials, and financial market data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a Python implementation for RS-HDMR-GPR (Random Sampling High
Dimensional Model Representation Gaussian Process Regression). The method
builds representations of multivariate functions with lower-dimensional terms,
either as an expansion over orders of coupling or using terms of only a given
dimensionality. This facilitates, in particular, recovering functional
dependence from sparse data. The code also allows for imputation of missing
values of the variables and for a significant pruning of the useful number of
HDMR terms. The code can also be used for estimating relative importance of
different combinations of input variables, thereby adding an element of insight
to a general machine learning method. The capabilities of this regression tool
are demonstrated on test cases involving synthetic analytic functions, the
potential energy surface of the water molecule, kinetic energy densities of
materials (crystalline magnesium, aluminum, and silicon), and financial market
data.
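The additive, lower-dimensional structure described in the abstract can be sketched with off-the-shelf tools. The following is a minimal toy illustration, not the RS-HDMR-GPR package itself: it assumes scikit-learn and fits only second-order HDMR terms, refitting a Gaussian process for each pair of inputs to the residual of the other terms (a simple backfitting loop; the actual code's self-consistent scheme and pruning are not reproduced here).

```python
# Toy second-order HDMR fit with GP component terms (illustrative only;
# the real RS-HDMR-GPR package differs in its iteration scheme and options).
import numpy as np
from itertools import combinations
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(X[:, 0] * X[:, 1]) + X[:, 2] ** 2        # toy target, sum of 2D terms

pairs = list(combinations(range(3), 2))              # the 2D HDMR terms
models = {p: GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-4,
                                      optimizer=None) for p in pairs}
terms = {p: np.zeros_like(y) for p in pairs}

# Backfitting: each 2D term is repeatedly refit to the residual of the others.
for _ in range(10):
    for p in pairs:
        residual = y - sum(terms[q] for q in pairs if q != p)
        models[p].fit(X[:, p], residual)
        terms[p] = models[p].predict(X[:, p])

pred = sum(models[p].predict(X[:, p]) for p in pairs)
rmse = np.sqrt(np.mean((pred - y) ** 2))
```

Each component model only ever sees two input columns, which is what makes the fit tractable from sparse data; comparing the magnitudes of the fitted terms also gives the kind of variable-importance insight mentioned above.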
Related papers
- Combining additivity and active subspaces for high-dimensional Gaussian
process modeling [2.7140711924836816]
We show how to combine high-dimensional Gaussian process modeling with a multi-fidelity strategy, showcasing the advantages through experiments on synthetic functions and datasets.
arXiv Detail & Related papers (2024-02-06T08:49:27Z)
- Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We demonstrate the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A multi-output Gaussian process (MOGP) prior over the parameters of the dedicated likelihoods for classification, regression, and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
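The inverse-free idea in this entry can be illustrated with a plain conjugate-gradient solve. This sketch shows only the iterative solver that replaces an explicit matrix inversion; the paper's Monte Carlo sampling and backpropagation through the unrolled iterations are not reproduced here.

```python
# Sketch: solve K x = b by conjugate-gradient iterations, using only
# matrix-vector products and never forming K^{-1} explicitly.
import numpy as np

def cg_solve(K, b, iters=50, tol=1e-10):
    x = np.zeros_like(b)
    r = b - K @ x            # initial residual
    p = r.copy()             # initial search direction
    rs = r @ r
    for _ in range(iters):
        Kp = K @ p
        alpha = rs / (p @ Kp)
        x = x + alpha * p
        r = r - alpha * Kp
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 100))
K = A @ A.T + 100.0 * np.eye(100)   # SPD stand-in for a kernel/covariance matrix
b = rng.standard_normal(100)
x = cg_solve(K, b)
```

Because each iteration is just a matrix-vector product, the loop is differentiable, which is what makes "unrolling and backpropagation through the iterations of the solver" possible in the first place.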
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- Scalable Gaussian Processes for Data-Driven Design using Big Data with Categorical Factors [14.337297795182181]
Gaussian processes (GP) have difficulties in accommodating big datasets, categorical inputs, and multiple responses.
We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously.
Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
arXiv Detail & Related papers (2021-06-26T02:17:23Z)
- Gaussian Function On Response Surface Estimation [12.35564140065216]
We propose a new framework for interpreting (features and samples) black-box machine learning models via a metamodeling technique.
The metamodel can be estimated from data generated via a trained complex model by running the computer experiment on samples of data in the region of interest.
arXiv Detail & Related papers (2021-01-04T04:47:00Z)
- Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable models (GLLVMs) generalize such factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
- Deep Learning with Functional Inputs [0.0]
We present a methodology for integrating functional data into feed-forward neural networks.
A by-product of the method is a set of dynamic functional weights that can be visualized during the optimization process.
The model is shown to perform well in a number of contexts including prediction of new data and recovery of the true underlying functional weights.
arXiv Detail & Related papers (2020-06-17T01:23:00Z)
- Projection Pursuit Gaussian Process Regression [5.837881923712394]
A primary goal of computer experiments is to reconstruct the function given by the computer code via scattered evaluations.
Traditional isotropic Gaussian process models suffer from the curse of dimensionality when the input dimension is high relative to the number of data points.
We consider a projection pursuit model, in which the nonparametric part is driven by an additive Gaussian process regression.
arXiv Detail & Related papers (2020-04-01T19:12:01Z)
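The projection idea in the last entry can be illustrated with a toy one-term example, assuming a known projection direction (the actual projection pursuit model would learn the directions and combine several such terms): a 1D GP on the projected coordinate sidesteps the curse of dimensionality noted above.

```python
# Toy single-term projection pursuit: fit g(w . x) with a 1D GP
# (illustrative only; the direction w is assumed known here).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(100, 5))          # 5D inputs, modest sample size
w = np.ones(5) / np.sqrt(5)                    # assumed projection direction
y = np.tanh(3 * X @ w)                         # target varies along w only

z = (X @ w).reshape(-1, 1)                     # project to 1D before the GP
gp = GaussianProcessRegressor(kernel=RBF(0.3), alpha=1e-6, optimizer=None)
gp.fit(z, y)
rmse = np.sqrt(np.mean((gp.predict(z) - y) ** 2))
```

One hundred points that would be hopelessly sparse in five dimensions densely cover the single projected coordinate, which is why the 1D GP recovers the ridge function accurately.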
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.