Accelerating Generalized Random Forests with Fixed-Point Trees
- URL: http://arxiv.org/abs/2306.11908v1
- Date: Tue, 20 Jun 2023 21:45:35 GMT
- Title: Accelerating Generalized Random Forests with Fixed-Point Trees
- Authors: David Fleischer and David A. Stephens and Archer Yang
- Abstract summary: Estimators are constructed by leveraging random forests as an adaptive kernel weighting algorithm.
We propose a new tree-growing rule for generalized random forests induced from a fixed-point iteration type of approximation.
- Score: 2.810283834703862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalized random forests arXiv:1610.01271 build upon the well-established
success of conventional forests (Breiman, 2001) to offer a flexible and
powerful non-parametric method for estimating local solutions of heterogeneous
estimating equations. Estimators are constructed by leveraging random forests
as an adaptive kernel weighting algorithm and implemented through a
gradient-based tree-growing procedure. By expressing this gradient-based
approximation as being induced from a single Newton-Raphson root-finding
iteration, and drawing upon the connection between estimating equations and
fixed-point problems arXiv:2110.11074, we propose a new tree-growing rule for
generalized random forests induced from a fixed-point iteration type of
approximation, enabling gradient-free optimization, and yielding substantial
time savings for tasks involving even modest dimensionality of the target
quantity (e.g. multiple/multi-level treatment effects). We develop an
asymptotic theory for estimators obtained from forests whose trees are grown
through the fixed-point splitting rule, and provide numerical simulations
demonstrating that the estimators obtained from such forests are comparable to
those obtained from the more costly gradient-based rule.
Related papers
- Enhanced Derivative-Free Optimization Using Adaptive Correlation-Induced Finite Difference Estimators [6.054123928890574]
We develop an algorithm designed to enhance DFO in terms of both gradient estimation efficiency and sample efficiency.
We establish the consistency of our proposed algorithm and demonstrate that, despite using a batch of samples per iteration, it achieves the same convergence rate as the KW and SPSA methods.
arXiv Detail & Related papers (2025-02-28T08:05:54Z) - Tighter sparse variational Gaussian processes [22.290236192353316]
Sparse variational Gaussian process (GP) approximations have become the de facto standard for scaling GPs to large datasets.
This paper introduces a provably tighter variational approximation by relaxing the standard assumption that the conditional approximate posterior given the inducing points must match that in the prior.
arXiv Detail & Related papers (2025-02-07T08:33:28Z) - HNCI: High-Dimensional Network Causal Inference [4.024850952459758]
We suggest a new method of high-dimensional network causal inference (HNCI) that provides both valid confidence interval on the average direct treatment effect on the treated (ADET) and valid confidence set for the neighborhood size for interference effect.
arXiv Detail & Related papers (2024-12-24T17:41:41Z) - Likelihood approximations via Gaussian approximate inference [3.4991031406102238]
We propose efficient schemes to approximate the effects of non-Gaussian likelihoods by Gaussian densities.
Our results attain good approximation quality for binary and multiclass classification in large-scale point-estimate and distributional inferential settings.
As a by-product, we show that the proposed approximate log-likelihoods are a superior alternative to least-squares on raw labels for neural network classification.
arXiv Detail & Related papers (2024-10-28T05:39:26Z) - Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation.
DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level.
To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z) - Statistical Advantages of Oblique Randomized Decision Trees and Forests [0.0]
Generalization error and convergence rates are obtained for the flexible dimension reduction model class of ridge functions.
A lower bound on the risk of axis-aligned Mondrian trees is obtained proving that these estimators are suboptimal for these linear dimension reduction models.
arXiv Detail & Related papers (2024-07-02T17:35:22Z) - Ensembles of Probabilistic Regression Trees [46.53457774230618]
Tree-based ensemble methods have been successfully used for regression problems in many applications and research studies.
We study ensemble versions of probabilisticregression trees that provide smooth approximations of the objective function by assigningeach observation to each region with respect to a probability distribution.
arXiv Detail & Related papers (2024-06-20T06:51:51Z) - Adaptive Split Balancing for Optimal Random Forest [8.916614661563893]
We propose a new random forest algorithm that constructs the trees using a novel adaptive split-balancing method.
Our method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data.
arXiv Detail & Related papers (2024-02-17T09:10:40Z) - Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT)
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z) - Random forests for binary geospatial data [0.0]
Existing implementations of random forests for binary data cannot explicitly account for data correlation common in geospatial and time-series settings.
Recent work has extended random forests (RF) to RF-GLS that incorporate spatial covariance using the generalized least squares (GLS) loss.
We show that for binary data, the GLS loss is also an extension of the Gini impurity measure, as the latter is exactly equivalent to the ordinary least squares (OLS) loss.
We propose a novel link-inversion technique that embeds the RF-GLS estimate of the mean function from the first step within the generalized
arXiv Detail & Related papers (2023-02-27T14:34:33Z) - On the Pointwise Behavior of Recursive Partitioning and Its Implications
for Heterogeneous Causal Effect Estimation [8.394633341978007]
Decision tree learning is increasingly being used for pointwise inference.
We show that adaptive decision trees can fail to achieve convergence rates of convergence in the norm with non-vanishing probability.
We show that random forests can remedy the situation, turning poor performing trees into nearly optimal procedures.
arXiv Detail & Related papers (2022-11-19T21:28:30Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the emphStochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Modelling hetegeneous treatment effects by quantitle local polynomial
decision tree and forest [0.0]
This paper builds on Breiman's 2001 random forest tree (RFT) and Wager et al.'s causal tree to parameterize the nonparametric problem.
We propose a decision tree using quantile classification according to fixed rules combined with classical estimation of local samples, which we call the quantile local linear causal tree (QLPRT) and forest (QLPRF)
arXiv Detail & Related papers (2021-11-30T12:02:16Z) - Minimax Rates for High-Dimensional Random Tessellation Forests [0.0]
Mondrian forests is the first class of random forests for which minimax rates were obtained in arbitrary dimension.
We show that a large class of random forests with general split directions also achieve minimax optimal convergence rates in arbitrary dimension.
arXiv Detail & Related papers (2021-09-22T06:47:38Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - Efficient Semi-Implicit Variational Inference [65.07058307271329]
We propose an efficient and scalable semi-implicit extrapolational (SIVI)
Our method maps SIVI's evidence to a rigorous inference of lower gradient values.
arXiv Detail & Related papers (2021-01-15T11:39:09Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on the iteration of zero-th-order (ZO) optimization which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity as well as as function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Convex Polytope Trees [57.56078843831244]
convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z) - Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of evaluations gradient independent of training set size and number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z) - Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of OptimisticOA algorithm for nonconcave minmax problems.
Our experiments show that adaptive GAN non-adaptive gradient algorithms can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.