Random forests for binary geospatial data
- URL: http://arxiv.org/abs/2302.13828v1
- Date: Mon, 27 Feb 2023 14:34:33 GMT
- Title: Random forests for binary geospatial data
- Authors: Arkajyoti Saha and Abhirup Datta
- Abstract summary: We propose RF-GP, using Random Forests for estimating the non-linear covariate effect and Gaussian Processes for modeling the spatial random effects.
RF-GP outperforms existing RF methods for estimation and prediction in both simulated and real-world data.
We establish consistency of RF-GP for a general class of $\beta$-mixing binary processes that includes common choices like the spatial Mat\'ern GP and autoregressive processes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary geospatial data is commonly analyzed with generalized linear mixed
models, specified with a linear fixed covariate effect and a Gaussian Process
(GP)-distributed spatial random effect, relating to the response via a link
function. The assumption of linear covariate effects is severely restrictive.
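Concretely, the mixed model described above takes the following form; the notation here is a standard gloss (response $y(s_i)$, covariates $x(s_i)$, link $g$, spatial surface $w$) rather than the paper's own display:

```latex
% Binary spatial GLMM in standard notation (a gloss, not the paper's display):
% y(s_i) is the binary response at location s_i, x(s_i) the covariates,
% g a link function (e.g., logit or probit), and w a spatial GP surface.
\[
  y(s_i) \mid w(s_i) \;\sim\; \mathrm{Bernoulli}\!\bigl(\pi(s_i)\bigr),
  \qquad
  g\bigl(\pi(s_i)\bigr) \;=\; x(s_i)^{\top}\beta + w(s_i),
\]
\[
  w(\cdot) \;\sim\; \mathrm{GP}\bigl(0,\, C_{\theta}(\cdot,\cdot)\bigr).
\]
% RF-GP (introduced below) keeps this structure but replaces the linear term
% x(s_i)^T beta with a non-linear mean m(x(s_i)) estimated by random forests.
```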
Random Forests (RF) are increasingly being used for non-linear modeling of
spatial data, but current extensions of RF for binary spatial data depart from
the mixed model setup, relinquishing inference on the fixed effects and other
advantages of using GP. We propose RF-GP, using Random Forests for estimating
the non-linear covariate effect and Gaussian Processes for modeling the spatial
random effects directly within the generalized mixed model framework. We
observe and exploit the equivalence of the Gini impurity measure and the least squares loss (verified numerically after the abstract)
to propose an extension of RF for binary data that accounts for the spatial
dependence. We then propose a novel link inversion algorithm that leverages the
properties of GP to estimate the covariate effects and offer spatial
predictions. RF-GP outperforms existing RF methods for estimation and
prediction in both simulated and real-world data. We establish consistency of
RF-GP for a general class of $\beta$-mixing binary processes that includes
common choices like spatial Mat\'ern GP and autoregressive processes.
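The Gini/least-squares equivalence invoked in the abstract is easy to verify numerically: for 0/1 responses, a node's sum of squared errors around its mean $p$ is $n\,p(1-p)$, exactly half its size-weighted Gini impurity $n \cdot 2p(1-p)$, so the two split criteria rank every candidate split identically. A minimal check (NumPy only; the helper names are illustrative, and this is not the authors' implementation):

```python
import numpy as np

def sse(y):
    """Sum of squared errors around the node mean; for 0/1 labels equals n*p*(1-p)."""
    return float(np.sum((y - y.mean()) ** 2)) if len(y) else 0.0

def gini(y):
    """Size-weighted Gini impurity of a node with 0/1 labels: n * 2p(1-p)."""
    if len(y) == 0:
        return 0.0
    p = y.mean()
    return len(y) * 2.0 * p * (1.0 - p)

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200).astype(float)
x = rng.normal(size=200)

for threshold in np.quantile(x, [0.25, 0.5, 0.75]):
    left, right = y[x <= threshold], y[x > threshold]
    # Impurity decrease under each criterion for this candidate split.
    sse_dec = sse(y) - sse(left) - sse(right)
    gini_dec = gini(y) - gini(left) - gini(right)
    # The Gini decrease is exactly twice the squared-error decrease,
    # so both criteria select the same split.
    assert np.isclose(gini_dec, 2.0 * sse_dec)
    print(f"t={threshold:+.3f}  sse_dec={sse_dec:.4f}  gini_dec={gini_dec:.4f}")
```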
Related papers
- Robust Gaussian Processes via Relevance Pursuit [17.39376866275623]
We propose and study a GP model that achieves robustness against sparse outliers by inferring data-point-specific noise levels.
We show, surprisingly, that the model can be parameterized such that the associated log marginal likelihood is strongly concave in the data-point-specific noise variances.
arXiv Detail & Related papers (2024-10-31T17:59:56Z)
- Sparse Variational Contaminated Noise Gaussian Process Regression with Applications in Geomagnetic Perturbations Forecasting [4.675221539472143]
We propose a scalable inference algorithm for fitting sparse Gaussian process regression models with contaminated normal noise on large datasets.
We show that our approach yields shorter prediction intervals for similar coverage and accuracy when compared to an artificial dense neural network baseline.
arXiv Detail & Related papers (2024-02-27T15:08:57Z)
- Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network [59.79008107609297]
We propose in this paper to approximate the joint posterior over the structure of a Bayesian Network.
We use a single GFlowNet whose sampling policy follows a two-phase process.
Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models.
arXiv Detail & Related papers (2023-05-30T19:16:44Z)
- Neural networks for geospatial data [0.0]
NN-GLS is a new neural network estimation algorithm for the non-linear mean in GP models.
We show that NN-GLS admits a representation as a special type of graph neural network (GNN).
Theoretically, we show that NN-GLS is consistent for irregularly observed, spatially correlated data processes.
arXiv Detail & Related papers (2023-04-18T17:52:23Z)
- ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds for which we utilize the explicit nature of NFs, i.e. surface normals extracted from the gradient of the log-likelihood and the log-likelihood itself.
arXiv Detail & Related papers (2022-08-18T16:07:59Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z)
- Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Expert concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Random Forests for dependent data [1.5469452301122173]
We propose RF-GLS, a novel extension of RF for dependent error processes.
The key to this extension is the equivalent representation of the local decision-making in a regression tree as a global OLS optimization (see the sketch after this list).
We empirically demonstrate the improvement achieved by RF-GLS over RF for both estimation and prediction under dependence.
arXiv Detail & Related papers (2020-07-30T12:36:09Z)
- Deep Gaussian Markov Random Fields [17.31058900857327]
We establish a formal connection between GMRFs and convolutional neural networks (CNNs).
Common GMRFs are special cases of a generative model where the inverse mapping from data to latent variables is given by a 1-layer linear CNN.
We describe how well-established tools, such as autodiff and variational inference, can be used for simple and efficient inference and learning of the deep GMRF.
arXiv Detail & Related papers (2020-02-18T10:06:39Z)
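The "global OLS" representation mentioned in the Random Forests for dependent data entry above can be sketched as follows: with a tree partition held fixed, the leaf predictions are the OLS solution in the leaf-membership design matrix, and RF-GLS replaces OLS with GLS under a working spatial covariance. The sketch below is schematic under those assumptions (the matrix names and helpers are hypothetical, not the RF-GLS code):

```python
import numpy as np

def leaf_design(leaf_ids):
    """Binary membership matrix Z: Z[i, k] = 1 iff sample i falls in leaf k."""
    leaves = np.unique(leaf_ids)
    return (leaf_ids[:, None] == leaves[None, :]).astype(float)

def leaf_means_ols(Z, y):
    """OLS in Z recovers the usual leaf means of a regression tree."""
    return np.linalg.solve(Z.T @ Z, Z.T @ y)

def leaf_means_gls(Z, y, Q):
    """GLS with working covariance Q: the dependence-aware replacement for OLS."""
    Qinv = np.linalg.inv(Q)
    return np.linalg.solve(Z.T @ Qinv @ Z, Z.T @ Qinv @ y)

rng = np.random.default_rng(1)
n = 30
leaf_ids = rng.integers(0, 3, size=n)          # a fixed 3-leaf partition
y = rng.normal(size=n)
Z = leaf_design(leaf_ids)

# With Q = I, GLS reduces to OLS, i.e., to plain leaf averaging.
assert np.allclose(leaf_means_gls(Z, y, np.eye(n)), leaf_means_ols(Z, y))

# A working spatial covariance (exponential decay in 1-D "locations").
s = np.sort(rng.uniform(0, 10, size=n))
Q = np.exp(-np.abs(s[:, None] - s[None, :]))
print(leaf_means_gls(Z, y, Q))                 # dependence-adjusted leaf estimates
```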