Fast Optimization of Weighted Sparse Decision Trees for use in Optimal
Treatment Regimes and Optimal Policy Design
- URL: http://arxiv.org/abs/2210.06825v1
- Date: Thu, 13 Oct 2022 08:16:03 GMT
- Title: Fast Optimization of Weighted Sparse Decision Trees for use in Optimal
Treatment Regimes and Optimal Policy Design
- Authors: Ali Behrouz, Mathias Lecuyer, Cynthia Rudin, Margo Seltzer
- Abstract summary: We present three algorithms for efficient sparse weighted decision tree optimization.
The first approach directly optimizes the weighted loss function; however, it tends to be computationally inefficient for large datasets.
The second approach, which scales more efficiently, transforms the weights to integer values and uses data duplication to turn the weighted decision tree optimization problem into an unweighted (but larger) counterpart.
The third algorithm, which scales to much larger datasets, uses a randomized procedure that samples each data point with a probability proportional to its weight.
- Score: 16.512942230284576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse decision trees are one of the most common forms of interpretable
models. While recent advances have produced algorithms that fully optimize
sparse decision trees for prediction, that work does not address policy design,
because the algorithms cannot handle weighted data samples. Specifically, they
rely on the discreteness of the loss function, which means that real-valued
weights cannot be directly used. For example, none of the existing techniques
produce policies that incorporate inverse propensity weighting on individual
data points. We present three algorithms for efficient sparse weighted decision
tree optimization. The first approach directly optimizes the weighted loss
function; however, it tends to be computationally inefficient for large
datasets. Our second approach, which scales more efficiently, transforms
weights to integer values and uses data duplication to transform the weighted
decision tree optimization problem into an unweighted (but larger) counterpart.
Our third algorithm, which scales to much larger datasets, uses a randomized
procedure that samples each data point with a probability proportional to its
weight. We present theoretical bounds on the error of the two fast methods and
show experimentally that these methods can be two orders of magnitude faster
than the direct optimization of the weighted loss, without losing significant
accuracy.
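The two fast methods are simple to picture in code. Below is a minimal sketch in Python, assuming access to some unweighted optimal-sparse-tree solver; the function name `fit_unweighted_tree`, the rounding scheme, and the sample size are placeholders for illustration, not the paper's actual implementation.

```python
import numpy as np

def duplicate_by_weight(X, y, w, scale=100):
    """Second approach (sketch): round scaled weights to integer counts and
    duplicate each row that many times, producing a larger unweighted dataset."""
    counts = np.maximum(np.rint(w * scale / w.max()).astype(int), 1)  # assumed rounding scheme
    idx = np.repeat(np.arange(len(y)), counts)
    return X[idx], y[idx]

def sample_by_weight(X, y, w, m, seed=0):
    """Third approach (sketch): draw m points with probability proportional
    to their weights, then train an unweighted tree on the sample."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(y), size=m, replace=True, p=w / w.sum())
    return X[idx], y[idx]

# Either reduction feeds a standard unweighted solver (hypothetical name):
# Xr, yr = duplicate_by_weight(X, y, w)
# tree = fit_unweighted_tree(Xr, yr)
```

Both reductions introduce approximation error relative to the exact weighted loss, which is what the paper's theoretical bounds control.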
Related papers
- A Closed-form Solution for Weight Optimization in Fully-connected Feed-forward Neural Networks [2.1301560294088318]
This work addresses the weight optimization problem for fully-connected feed-forward neural networks.
The proposed approach offers a closed-form solution for weight optimization by means of the least squares (LS) methodology.
Our simulation and empirical results show that the proposed scheme, BPLS, is competitive with existing methods in terms of accuracy while significantly surpassing them in running time.
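As a rough illustration of the closed-form idea, a single layer's weights can be obtained by ordinary least squares given that layer's inputs and targets; the layer-by-layer framing and the random data below are assumptions for the sketch, not the paper's exact BPLS procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))   # inputs to the layer
T = rng.normal(size=(200, 1))    # targets for the layer
# Closed-form weights: W = argmin_W ||X W - T||_2^2, no iterative training loop
W, *_ = np.linalg.lstsq(X, T, rcond=None)
pred = X @ W
```

Because the solve is a single linear-algebra call, the running-time advantage over iterative training is immediate.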
arXiv Detail & Related papers (2024-01-12T17:03:55Z)
- Efficient Bi-Level Optimization for Recommendation Denoising [31.968068788022403]
Implicit feedback possesses a high degree of noise, which significantly undermines recommendation quality.
We model recommendation denoising as a bi-level optimization problem.
The inner optimization aims to derive an effective model for recommendation and to guide the weight determination.
We employ a weight generator to avoid the storage of weights and a one-step gradient-matching-based loss to significantly reduce computational time.
arXiv Detail & Related papers (2022-10-19T06:36:21Z)
- On multivariate randomized classification trees: $l_0$-based sparsity, VC dimension and decomposition methods [0.9346127431927981]
We investigate the nonlinear continuous optimization formulation proposed in Blanquero et al.
We first consider alternative methods to sparsify such trees based on concave approximations of the $l_0$ norm.
We propose a general decomposition scheme and an efficient version of it. Experiments on larger datasets show that the proposed decomposition method is able to significantly reduce the training times without compromising the accuracy.
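For concreteness, one standard concave surrogate of the $l_0$ norm looks as follows; this exponential form is an illustrative choice, not necessarily the approximation used in the paper.

$$\|w\|_0 = \sum_j \mathbb{1}[w_j \neq 0] \;\approx\; \sum_j \left(1 - e^{-\alpha |w_j|}\right), \qquad \alpha > 0.$$

The surrogate is concave in $|w_j|$ and recovers the exact count as $\alpha \to \infty$, keeping the formulation continuous while still driving weights to exact zeros.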
arXiv Detail & Related papers (2021-12-09T22:49:08Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
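Since the abstract does not spell out the objective, the display below shows only the typical shape of such a formulation, with symbols chosen here for illustration: a self-reconstruction term plus a row-sparse $l_{2,p}$ penalty.

$$\min_{W} \; \|X - X W W^{\top}\|_F^2 + \lambda \|W\|_{2,p}^p, \qquad \|W\|_{2,p} = \Big(\sum_i \|w^i\|_2^p\Big)^{1/p},$$

where $w^i$ is the $i$-th row of $W$ and $0 < p \le 1$; driving entire rows of $W$ to zero deselects the corresponding features.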
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
- Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework [40.635566748735386]
Most neural network pruning methods prune the network along a single dimension (depth, width, or resolution) to meet a computational budget.
We argue that pruning should be conducted along three dimensions comprehensively.
Our proposed algorithm surpasses state-of-the-art pruning algorithms and even neural architecture search-based algorithms.
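Read as a single optimization problem (a paraphrase, with symbols chosen here for illustration), comprehensive pruning searches all three dimensions jointly:

$$\max_{d,\,w,\,r} \;\mathrm{Acc}(d, w, r) \quad \text{s.t.} \quad \mathrm{Cost}(d, w, r) \le B,$$

where $d$, $w$, $r$ are depth, width, and input resolution and $B$ is the computational budget; pruning along one dimension fixes the other two and can only reach a subset of the feasible configurations.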
arXiv Detail & Related papers (2020-10-10T02:30:47Z)
- Boosting Data Reduction for the Maximum Weight Independent Set Problem Using Increasing Transformations [59.84561168501493]
We introduce new generalized data reduction and transformation rules for the maximum weight independent set problem.
Surprisingly, these so-called increasing transformations can simplify the problem and also open up the reduction space to yield even smaller irreducible graphs later in the algorithm.
Our algorithm computes significantly smaller irreducible graphs on all except one instance, solves more instances to optimality than previously possible, is up to two orders of magnitude faster than the best state-of-the-art solver, and finds higher-quality solutions than solvers DynWVC and HILS.
arXiv Detail & Related papers (2020-08-12T08:52:50Z)
- New Oracle-Efficient Algorithms for Private Synthetic Data Release [52.33506193761153]
We present three new algorithms for constructing differentially private synthetic data.
The algorithms satisfy differential privacy even in the worst case.
Compared to the state-of-the-art High-Dimensional Matrix Mechanism (McKennaMHM18), our algorithms provide better accuracy on large workloads.
arXiv Detail & Related papers (2020-07-10T15:46:05Z)
- Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
- Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms, both achieving the optimal bound on excess loss while using $O(\min\{n, n^2/d\})$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)
- Optimal Sparse Decision Trees [25.043477914272046]
This work introduces the first practical algorithm for optimal decision trees for binary variables.
The algorithm is a co-design of analytical bounds that reduce the search space and modern systems techniques.
Our experiments highlight advantages in scalability, speed, and proof of optimality.
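The flavor of such bound-based pruning can be shown with a toy search over single-split trees; everything below, including the per-leaf penalty $\lambda$ and the random data, is a simplified illustration rather than the paper's actual algorithm.

```python
import numpy as np

def objective(errors, leaves, lam, n):
    # regularized objective: misclassification rate plus a per-leaf penalty
    return errors / n + lam * leaves

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(100, 8))   # binary features
y = rng.integers(0, 2, size=100)        # binary labels
n, lam = len(y), 0.05

# incumbent: the single-leaf tree predicting the majority class
best = objective(min(y.sum(), n - y.sum()), 1, lam, n)

# analytical bound: even a *perfect* split pays 2*lam, so if that cannot
# beat the incumbent, the entire space of splits is pruned unexamined
if objective(0, 2, lam, n) < best:
    for j in range(X.shape[1]):
        left, right = y[X[:, j] == 0], y[X[:, j] == 1]
        errs = (min(left.sum(), len(left) - left.sum())
                + min(right.sum(), len(right) - right.sum()))
        best = min(best, objective(errs, 2, lam, n))
```

Real solvers apply much tighter bounds recursively over partial trees, but the principle is the same: a cheap lower bound discards whole regions of the search space without enumerating them.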
arXiv Detail & Related papers (2019-04-29T17:56:34Z)