Random forests for binary geospatial data
- URL: http://arxiv.org/abs/2302.13828v1
- Date: Mon, 27 Feb 2023 14:34:33 GMT
- Title: Random forests for binary geospatial data
- Authors: Arkajyoti Saha and Abhirup Datta
- Abstract summary: We propose RF-GP, using Random Forests for estimating the non-linear covariate effect and Gaussian Processes for modeling the spatial random effects.
RF-GP outperforms existing RF methods for estimation and prediction in both simulated and real-world data.
We establish consistency of RF-GP for a general class of $\beta$-mixing binary processes that includes common choices like spatial Mat\'ern GP and autoregressive processes.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary geospatial data is commonly analyzed with generalized linear mixed
models, specified with a linear fixed covariate effect and a Gaussian Process
(GP)-distributed spatial random effect, relating to the response via a link
function. The assumption of linear covariate effects is severely restrictive.
Random Forests (RF) are increasingly being used for non-linear modeling of
spatial data, but current extensions of RF for binary spatial data depart from the
mixed model setup, relinquishing inference on the fixed effects and other
advantages of using GP. We propose RF-GP, using Random Forests for estimating
the non-linear covariate effect and Gaussian Processes for modeling the spatial
random effects directly within the generalized mixed model framework. We
observe and exploit equivalence of Gini impurity measure and least squares loss
to propose an extension of RF for binary data that accounts for the spatial
dependence. We then propose a novel link inversion algorithm that leverages the
properties of GP to estimate the covariate effects and offer spatial
predictions. RF-GP outperforms existing RF methods for estimation and
prediction in both simulated and real-world data. We establish consistency of
RF-GP for a general class of $\beta$-mixing binary processes that includes
common choices like spatial Mat\'ern GP and autoregressive processes.
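The equivalence of the Gini impurity and the least-squares loss that the abstract exploits can be checked directly: for binary 0/1 responses in a node with success proportion p, the Gini impurity 1 - p^2 - (1-p)^2 = 2p(1-p) is exactly twice the variance (least-squares) impurity p(1-p), so both criteria rank candidate splits identically. A minimal numerical sketch of this identity (illustrative only, not the paper's implementation):

```python
# Sketch: for binary 0/1 labels, Gini impurity = 2 * least-squares (variance)
# impurity, so a split minimizing one criterion also minimizes the other.
# Illustrative only -- not the RF-GP code from the paper.

def gini(labels):
    """Gini impurity of a node with binary 0/1 labels: 2*p*(1-p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1.0 - p**2 - (1.0 - p)**2

def ls_impurity(labels):
    """Least-squares (variance) impurity of the same node: p*(1-p)."""
    if not labels:
        return 0.0
    m = sum(labels) / len(labels)
    return sum((y - m) ** 2 for y in labels) / len(labels)

def split_cost(left, right, impurity):
    """Sample-size-weighted impurity of a candidate split."""
    n = len(left) + len(right)
    return (len(left) * impurity(left) + len(right) * impurity(right)) / n

labels = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1]
for cut in range(1, len(labels)):
    left, right = labels[:cut], labels[cut:]
    g = split_cost(left, right, gini)
    v = split_cost(left, right, ls_impurity)
    assert abs(g - 2.0 * v) < 1e-12  # Gini cost is exactly twice the LS cost
```

Because the two costs differ only by a constant factor, CART-style split selection under Gini coincides with least-squares split selection on binary data, which is what lets the regression-tree machinery of RF-GLS carry over to the binary setting.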
Related papers
- Learning sparse generalized linear models with binary outcomes via iterative hard thresholding (arXiv, 2025-02-25)
In statistics, generalized linear models (GLMs) are widely used for modeling data.
In this work, we propose to use and analyze an iterative hard thresholding algorithm (projected gradient descent on the ReLU loss), called binary iterative hard thresholding (BIHT).
We establish that BIHT is statistically efficient and converges to the correct solution for parameter estimation in a general class of sparse binary GLMs.
- Robust Gaussian Processes via Relevance Pursuit (arXiv, 2024-10-31)
We propose and study a GP model that achieves robustness against sparse outliers by inferring data-point-specific noise levels.
We show, surprisingly, that the model can be parameterized such that the associated log marginal likelihood is strongly concave in the data-point-specific noise variances.
- On the Wasserstein Convergence and Straightness of Rectified Flow (arXiv, 2024-10-19)
Rectified Flow (RF) is a generative model that aims to learn straight flow trajectories from noise to data.
We provide a theoretical analysis of the Wasserstein distance between the sampling distribution of RF and the target distribution.
We present general conditions guaranteeing uniqueness and straightness of 1-RF, which is in line with previous empirical findings.
- Sparse Variational Contaminated Noise Gaussian Process Regression with Applications in Geomagnetic Perturbations Forecasting (arXiv, 2024-02-27)
We propose a scalable inference algorithm for fitting sparse Gaussian process regression models with contaminated normal noise on large datasets.
We show that our approach yields shorter prediction intervals for similar coverage and accuracy when compared to an artificial dense neural network baseline.
- Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network (arXiv, 2023-05-30)
We propose in this paper to approximate the joint posterior over the structure and parameters of a Bayesian network.
We use a single GFlowNet whose sampling policy follows a two-phase process.
Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models.
- Neural networks for geospatial data (arXiv, 2023-04-18)
NN-GLS is a new neural network estimation algorithm for the non-linear mean in GP models.
We show that NN-GLS admits a representation as a special type of graph neural network (GNN).
Theoretically, we show that NN-GLS will be consistent for irregularly observed spatially correlated data processes.
- ManiFlow: Implicitly Representing Manifolds with Normalizing Flows (arXiv, 2022-08-18)
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds, for which we utilize the explicit nature of NFs, i.e., surface normals extracted from the gradient of the log-likelihood and the log-likelihood itself.
- Gaussian Graphical Models as an Ensemble Method for Distributed Gaussian Processes (arXiv, 2022-02-07)
We propose a novel approach for aggregating the Gaussian experts' predictions via a Gaussian graphical model (GGM).
We first estimate the joint distribution of latent and observed variables using the Expectation-Maximization (EM) algorithm.
Our new method outperforms other state-of-the-art distributed GP (DGP) approaches.
- Non-Gaussian Gaussian Processes for Few-Shot Regression (arXiv, 2021-10-26)
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
- On the Double Descent of Random Features Models Trained with SGD (arXiv, 2021-10-13)
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds for RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
- Imputation-Free Learning from Incomplete Observations (arXiv, 2021-07-05)
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train inference from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
- Autoregressive Score Matching (arXiv, 2020-10-24)
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
- Identification of Probability weighted ARX models with arbitrary domains (arXiv, 2020-09-29)
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Experts concept, developed within the machine learning field.
- Improving predictions of Bayesian neural nets via local linearization (arXiv, 2020-08-19)
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as the "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
- Random Forests for dependent data (arXiv, 2020-07-30)
We propose RF-GLS, a novel extension of RF for dependent error processes.
The key to this extension is the equivalent representation of the local decision-making in a regression tree as a global OLS optimization.
We empirically demonstrate the improvement achieved by RF-GLS over RF for both estimation and prediction under dependence.
- Deep Gaussian Markov Random Fields (arXiv, 2020-02-18)
We establish a formal connection between GMRFs and convolutional neural networks (CNNs).
Common GMRFs are special cases of a generative model where the inverse mapping from data to latent variables is given by a 1-layer linear CNN.
We describe how well-established tools, such as autodiff and variational inference, can be used for simple and efficient inference and learning of the deep GMRF.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.