Binary AddiVortes: (Bayesian) Additive Voronoi Tessellations for Binary Classification with an application to Predicting Home Mortgage Application Outcomes
- URL: http://arxiv.org/abs/2503.21792v1
- Date: Tue, 18 Mar 2025 12:59:02 GMT
- Title: Binary AddiVortes: (Bayesian) Additive Voronoi Tessellations for Binary Classification with an application to Predicting Home Mortgage Application Outcomes
- Authors: Adam J. Stone, Emmanuel Ogundimu, John Paul Gosling,
- Abstract summary: The Additive Voronoi Tessellations (AddiVortes) model is a multivariate regression model.<n>We extend the AddiVortes framework to binary classification by incorporating a probit model with a latent variable formulation.<n>A comprehensive analysis is conducted using AddiVortes to predict an individual's likelihood of being approved for a home mortgage.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Additive Voronoi Tessellations (AddiVortes) model is a multivariate regression model that uses multiple Voronoi tessellations to partition the covariate space for an additive ensemble model. In this paper, the AddiVortes framework is extended to binary classification by incorporating a probit model with a latent variable formulation. Specifically, we utilise a data augmentation technique, where a latent variable is introduced and the binary response is determined via thresholding. In most cases, the AddiVortes model outperforms random forests, BART and other leading black-box regression models when compared using a range of metrics. A comprehensive analysis is conducted using AddiVortes to predict an individual's likelihood of being approved for a home mortgage, based on a range of covariates. This evaluation highlights the model's effectiveness in capturing complex relationships within the data and its potential for improving decision-making in mortgage approval processes.
Related papers
- Co-data Learning for Bayesian Additive Regression Trees [0.0]
We propose to incorporate co-data into a sum-of-trees prediction model.
The proposed method can handle multiple types of co-data simultaneously.
Co-data enhances prediction in an application to diffuse large B-cell lymphoma prognosis.
arXiv Detail & Related papers (2023-11-16T16:14:39Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Bayesian Additive Main Effects and Multiplicative Interaction Models
using Tensor Regression for Multi-environmental Trials [0.0]
We propose a Bayesian tensor regression model to accommodate the effect of multiple factors on phenotype prediction.
We adopt a set of prior distributions that resolve identifiability issues that may arise between the parameters in the model.
We explore the applicability of our model by analysing real-world data related to wheat production across Ireland from 2010 to 2019.
arXiv Detail & Related papers (2023-01-09T19:54:50Z) - Bayesian Neural Network Inference via Implicit Models and the Posterior
Predictive Distribution [0.8122270502556371]
We propose a novel approach to perform approximate Bayesian inference in complex models such as Bayesian neural networks.
The approach is more scalable to large data than Markov Chain Monte Carlo.
We see this being useful in applications such as surrogate and physics-based models.
arXiv Detail & Related papers (2022-09-06T02:43:19Z) - On Uncertainty Estimation by Tree-based Surrogate Models in Sequential
Model-based Optimization [13.52611859628841]
We revisit various ensembles of randomized trees to investigate their behavior in the perspective of prediction uncertainty estimation.
We propose a new way of constructing an ensemble of randomized trees, referred to as BwO forest, where bagging with oversampling is employed to construct bootstrapped samples.
Experimental results demonstrate the validity and good performance of BwO forest over existing tree-based models in various circumstances.
arXiv Detail & Related papers (2022-02-22T04:50:37Z) - Bayesian Regression Approach for Building and Stacking Predictive Models
in Time Series Analytics [0.0]
The paper describes the use of Bayesian regression for building time series models and stacking different predictive models for time series.
It makes it possible to estimate an uncertainty of time series prediction and calculate value at risk characteristics.
arXiv Detail & Related papers (2022-01-06T12:58:23Z) - X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To take the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z) - Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Two-step penalised logistic regression for multi-omic data with an
application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately.
Our approach should be preferred if the goal is to select as many relevant predictors as possible.
Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z) - Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.