Regression Trees Know Calculus
- URL: http://arxiv.org/abs/2405.13846v3
- Date: Fri, 31 Jan 2025 14:52:47 GMT
- Title: Regression Trees Know Calculus
- Authors: Nathan Wycoff
- Abstract summary: We find a simple estimate of the gradient which can be efficiently computed using quantities exposed by popular tree learning libraries.
This allows the tools developed in the context of differentiable algorithms, like neural nets and Gaussian processes, to be deployed to tree-based models.
- Score: 1.8130068086063336
- Abstract: Regression trees have emerged as a preeminent tool for solving real-world regression problems due to their ability to deal with nonlinearities, interaction effects and sharp discontinuities. In this article, we instead study regression trees applied to well-behaved, differentiable functions, and determine the relationship between node parameters and the local gradient of the function being approximated. We find a simple estimate of the gradient which can be efficiently computed using quantities exposed by popular tree learning libraries. This allows the tools developed in the context of differentiable algorithms, like neural nets and Gaussian processes, to be deployed to tree-based models. To demonstrate this, we study measures of model sensitivity defined in terms of integrals of gradients and demonstrate how to compute them for regression trees using the proposed gradient estimates. Quantitative and qualitative numerical experiments reveal the capability of gradients estimated by regression trees to improve predictive analysis, solve tasks in uncertainty quantification, and provide interpretation of model behavior.
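As a rough illustration of how node parameters can carry slope information, the sketch below fits a one-split regression tree (a stump) to samples of a smooth 1-D function and divides the difference in child-node means by the difference in child-node input centroids. This finite-difference construction is an assumption chosen for illustration, not the estimator proposed in the paper; the helper names `best_split` and `stump_slope` are hypothetical.

```python
def best_split(xs, ys):
    """Find the 1-D split that minimizes the summed squared error of the two children.

    Returns (threshold, left_indices, right_indices).
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        left, right = order[:k], order[k:]
        sse = 0.0
        for side in (left, right):
            side_mean = sum(ys[i] for i in side) / len(side)
            sse += sum((ys[i] - side_mean) ** 2 for i in side)
        if best is None or sse < best[0]:
            # Place the threshold midway between the two neighboring samples.
            threshold = 0.5 * (xs[order[k - 1]] + xs[order[k]])
            best = (sse, threshold, left, right)
    return best[1], best[2], best[3]


def mean(values):
    return sum(values) / len(values)


def stump_slope(xs, ys):
    """Illustrative slope estimate (an assumption, not the paper's formula):
    difference in child means divided by difference in child input centroids."""
    _, left, right = best_split(xs, ys)
    dy = mean([ys[i] for i in right]) - mean([ys[i] for i in left])
    dx = mean([xs[i] for i in right]) - mean([xs[i] for i in left])
    return dy / dx


if __name__ == "__main__":
    xs = [i / 10 for i in range(11)]
    ys = [3.0 * x + 1.0 for x in xs]  # smooth test function with known slope 3
    print(stump_slope(xs, ys))
```

For a linear function such as f(x) = 3x + 1 the ratio recovers the slope up to floating-point error; for curved functions it only gives an average slope over the node, which is why a finer tree would be needed for a local estimate.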
Related papers
- RieszBoost: Gradient Boosting for Riesz Regression [49.737777802061984]
We propose a novel gradient boosting algorithm to directly estimate the Riesz representer without requiring its explicit analytical form.
We show that our algorithm performs on par with or better than indirect estimation techniques across a range of functionals.
arXiv Detail & Related papers (2025-01-08T23:04:32Z)
- Reliable and scalable variable importance estimation via warm-start and early stopping [8.870380386952993]
We develop a scalable method to estimate variable importance for any algorithm that can be expressed as an iterative kernel update equation.
We provide theoretical guarantees by using the theory for early stopping of kernel-based methods for neural networks with sufficiently large (but not necessarily infinite) width and gradient-boosting decision trees.
arXiv Detail & Related papers (2024-12-02T04:45:10Z)
- Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms [88.74308282658133]
Reparameterization (RP) Policy Gradient Methods (PGMs) have been widely adopted for continuous control tasks in robotics and computer graphics.
Recent studies have revealed that, when applied to long-term reinforcement learning problems, model-based RP PGMs may experience chaotic and non-smooth optimization landscapes.
We propose a spectral normalization method to mitigate the exploding variance issue caused by long model unrolls.
arXiv Detail & Related papers (2023-10-30T18:43:21Z)
- Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
arXiv Detail & Related papers (2023-07-06T15:19:53Z)
- Learning to Simulate Tree-Branch Dynamics for Manipulation [26.808346972775368]
We propose to use a simulation driven inverse inference approach to model the dynamics of tree branches under manipulation.
We show that our model can predict deformation trajectories, quantify the estimation uncertainty, and perform better when baselined against other inference algorithms.
arXiv Detail & Related papers (2023-06-06T05:17:02Z)
- Adaptive LASSO estimation for functional hidden dynamic geostatistical model [69.10717733870575]
We propose a novel model selection algorithm based on a penalized maximum likelihood estimator (PMLE) for functional hidden dynamic geostatistical models (f-HDGM).
The algorithm is based on iterative optimisation and uses an adaptive least absolute shrinkage and selector operator (LASSO) penalty function, wherein the weights are obtained by the unpenalised f-HDGM maximum-likelihood estimators.
arXiv Detail & Related papers (2022-08-10T19:17:45Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
- Uncertainty in Gradient Boosting via Ensembles [37.808845398471874]
Ensembles of gradient boosting models successfully detect anomalous inputs while having limited ability to improve the predicted total uncertainty.
We propose a concept of a virtual ensemble to get the benefits of an ensemble via only one gradient boosting model, which significantly reduces complexity.
arXiv Detail & Related papers (2020-06-18T14:11:27Z)
- Multivariate Boosted Trees and Applications to Forecasting and Control [0.0]
Gradient boosted trees are non-parametric regressors that exploit sequential model fitting and gradient descent to minimize a specific loss function.
In this paper, we present a computationally efficient algorithm for fitting multivariate boosted trees.
arXiv Detail & Related papers (2020-03-08T19:26:59Z)