Des-q: a quantum algorithm to provably speedup retraining of decision trees
- URL: http://arxiv.org/abs/2309.09976v4
- Date: Thu, 23 May 2024 19:09:22 GMT
- Title: Des-q: a quantum algorithm to provably speedup retraining of decision trees
- Authors: Niraj Kumar, Romina Yalovetzky, Changhao Li, Pierre Minssen, Marco Pistoia
- Abstract summary: We introduce Des-q, a novel quantum algorithm to construct and retrain decision trees for regression and binary classification tasks.
We benchmark the simulated version of Des-q against the state-of-the-art classical methods on multiple data sets.
Our algorithm exhibits similar performance to the state-of-the-art decision trees while significantly speeding up the periodic tree retraining.
- Score: 2.7262923206583136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision trees are widely adopted machine learning models due to their simplicity and explainability. However, as training data size grows, standard methods become increasingly slow, scaling polynomially with the number of training examples. In this work, we introduce Des-q, a novel quantum algorithm to construct and retrain decision trees for regression and binary classification tasks. Assuming the data stream produces small, periodic increments of new training examples, Des-q significantly reduces the tree retraining time, achieving a logarithmic complexity in the combined total number of old and new examples, even accounting for the time needed to load the new samples into quantum-accessible memory. Our approach to growing the tree from any given node involves performing piecewise linear splits to generate multiple hyperplanes, thus partitioning the input feature space into distinct regions. To determine the suitable anchor points for these splits, we develop an efficient quantum-supervised clustering method, building upon the q-means algorithm introduced by Kerenidis et al. We benchmark the simulated version of Des-q against the state-of-the-art classical methods on multiple data sets and observe that our algorithm exhibits similar performance to the state-of-the-art decision trees while significantly speeding up the periodic tree retraining.
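The split procedure described in the abstract lends itself to a compact classical analogue. The sketch below mimics one Des-q node split with ordinary k-means standing in for the quantum q-means subroutine: examples are clustered on their features augmented with a scaled copy of the label, and each example is routed to the child of its nearest centroid. The function and parameter names (`supervised_split`, `label_weight`) are illustrative, not from the paper.

```python
# Classical stand-in for one Des-q node split: supervised clustering of
# (feature, label) pairs partitions the node's examples into k regions.
# The quantum q-means subroutine is replaced here by sklearn's KMeans.
import numpy as np
from sklearn.cluster import KMeans

def supervised_split(X, y, k=2, label_weight=1.0):
    """Cluster examples on features augmented with the (scaled) label,
    so the induced partition is informed by the supervision signal."""
    Z = np.hstack([X, label_weight * y.reshape(-1, 1)])  # append label column
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Z)
    # Anchor points in feature space: centroids with the label column dropped.
    anchors = km.cluster_centers_[:, :-1]
    # Each example goes to the child of its nearest anchor, a piecewise
    # linear partition of the feature space as in the paper's splits.
    child = np.argmin(((X[:, None, :] - anchors[None]) ** 2).sum(-1), axis=1)
    return anchors, child

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)
anchors, child = supervised_split(X, y, k=2)
print(anchors.shape, np.bincount(child))  # (2, 3) and the two child sizes
```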
Related papers
- QC-Forest: a Classical-Quantum Algorithm to Provably Speedup Retraining of Random Forest [2.6436521007616114]
Random Forest (RF) is a popular tree-ensemble method for supervised learning, prized for its ease of use and flexibility.
Online RF models must account for new training data to maintain model accuracy.
We propose QC-Forest, a classical-quantum algorithm designed to time-efficiently retrain RF models in the streaming setting (a classical mock-up of this setting follows the entry below).
arXiv Detail & Related papers (2024-06-17T18:21:03Z)
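A minimal classical mock-up of the streaming setting QC-Forest targets, assuming small periodic batches and a naive full refit each period. It only illustrates why retraining cost grows with the accumulated data; the paper's classical-quantum method itself is not reproduced here.

```python
# Streaming retraining workload: small batches arrive, the forest is
# refit on all accumulated data each period (the naive baseline).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X_all = np.empty((0, 4)); y_all = np.empty((0,), dtype=int)
for t in range(5):  # five small periodic increments
    X_new = rng.normal(size=(50, 4))
    y_new = (X_new[:, 0] > 0).astype(int)
    X_all = np.vstack([X_all, X_new]); y_all = np.concatenate([y_all, y_new])
    # Full refit: cost grows with len(X_all) every period.
    model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_all, y_all)
    print(t, len(X_all), model.score(X_all, y_all))
```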
- Tractable Bounding of Counterfactual Queries by Knowledge Compilation [51.47174989680976]
We discuss the problem of bounding partially identifiable queries, such as counterfactuals, in Pearlian structural causal models.
A recently proposed iterated EM scheme yields an inner approximation of those bounds by sampling the initialisation parameters (a toy version follows this entry).
We show how a single symbolic knowledge compilation allows us to obtain the circuit structure, whose symbolic parameters can then be replaced by their actual values.
arXiv Detail & Related papers (2023-10-05T07:10:40Z)
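A toy illustration of the inner-approximation idea above, with an invented model and query: local fits from many random initialisations each land on parameters consistent with the data, and the spread of query values across restarts is an inner approximation of the query's true bounds.

```python
# Partially identifiable toy: two parameters (a, b) are only jointly
# constrained by the observed marginal p_obs = 0.4, so a query over them
# has an interval of attainable values. Random restarts of a local
# (EM-style) fit give an INNER approximation of that interval.
import numpy as np

rng = np.random.default_rng(2)

def fit_from(init, steps=200, lr=0.1):
    """Toy 'EM': gradient steps matching the observed marginal 0.5*(a+b)."""
    a, b = init
    for _ in range(steps):
        g = 0.5 * (a + b) - 0.4            # residual of the marginal fit
        a = np.clip(a - lr * g * 0.5, 0, 1)
        b = np.clip(b - lr * g * 0.5, 0, 1)
    return a, b

queries = []
for _ in range(100):                        # random restarts
    a, b = fit_from(rng.uniform(0, 1, size=2))
    queries.append(a * (1 - b))             # an invented counterfactual-style query
print("inner approximation of bounds:", min(queries), max(queries))
```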
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields iterations similar to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations (a hand-weighted toy follows this entry).
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
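A hand-weighted sketch of the superstructure idea, assuming fixed rather than learned stage weights: a Runge-Kutta-like cell x ← x + Σᵢ bᵢkᵢ, applied recurrently to the residual r(x) = b − Ax, behaves like a classical iterative linear solver. `rk_step` and the weights are illustrative choices, not the paper's trained model.

```python
import numpy as np

def rk_step(x, residual, bs):
    """One RK-like cell: stages k_i probe the residual at weighted points,
    then x <- x + sum_i b_i * k_i. Weights are fixed here, learned in R2N2."""
    k, ks = residual(x), []
    for b_i in bs:
        ks.append(k)
        k = residual(x + b_i * k)      # next stage reuses the previous one
    return x + sum(b_i * k_i for b_i, k_i in zip(bs, ks))

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 0.0])
residual = lambda x: b - A @ x         # linear-system residual r(x) = b - Ax
x = np.zeros(2)
for _ in range(30):                    # recurrent application of the same cell
    x = rk_step(x, residual, bs=[0.3, 0.3])
print(x, np.linalg.norm(b - A @ x))    # x -> A^{-1} b, residual -> 0
```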
- Constructing Optimal Contraction Trees for Tensor Network Quantum Circuit Simulation [1.2704529528199062]
One of the key problems in quantum circuit simulation is the construction of a contraction tree.
We introduce a novel algorithm for constructing an optimal contraction tree.
We show that our method achieves superior results on a majority of tested quantum circuits (why tree choice matters is illustrated after this entry).
arXiv Detail & Related papers (2022-09-07T02:50:30Z)
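To see why the contraction tree matters, the toy below prices two contraction orders for the same matrix chain under the usual m·k·n cost model. The shapes are invented; the cost gap between trees is the point.

```python
def matmul_cost(dims, order):
    """dims: boundary sizes d0..dn of a matrix chain (matrix i is d_i x d_{i+1});
    `order` gives, at each step, the position of the pair to contract.
    Returns total scalar multiplications under the m*k*n cost model."""
    dims, cost = list(dims), 0
    for i in order:
        m, k, n = dims[i], dims[i + 1], dims[i + 2]
        cost += m * k * n
        del dims[i + 1]              # the contracted dimension disappears
    return cost

d = [100, 2, 100, 2, 100]            # (100x2)(2x100)(100x2)(2x100)
print(matmul_cost(d, [0, 0, 0]))     # left-to-right tree: 60000
print(matmul_cost(d, [1, 0, 0]))     # contract the middle pair first: 20800
```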
- Discrete Tree Flows via Tree-Structured Permutations [5.929956715430168]
Discrete flow-based models cannot be straightforwardly optimized with conventional deep learning methods because gradients of discrete functions are undefined or zero.
Our approach seeks to reduce the computational burden and remove the need for pseudo-gradients by developing a discrete flow based on decision trees (the core trick is sketched after this entry).
arXiv Detail & Related papers (2022-07-04T23:11:04Z)
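The core trick, in miniature and with invented tables: a decision rule on a context coordinate selects a permutation applied to a symbol coordinate. Permutations are exactly invertible and need no gradients, so the transform is a discrete bijection.

```python
import numpy as np

PERMS = {0: np.array([1, 2, 0]),   # leaf 0: cyclic shift of symbols {0,1,2}
         1: np.array([0, 2, 1])}   # leaf 1: swap symbols 1 and 2
INVS = {k: np.argsort(v) for k, v in PERMS.items()}  # exact inverse tables

def apply_flow(x, tables):
    """x: integer array of (context, symbol) rows; a depth-1 'tree' on the
    context picks which permutation table rewrites the symbol."""
    leaf = (x[:, 0] > 0).astype(int)
    sym = np.where(leaf == 0, tables[0][x[:, 1]], tables[1][x[:, 1]])
    return np.stack([x[:, 0], sym], axis=1)

x = np.array([[-1, 0], [-1, 2], [3, 1], [5, 2]])
z = apply_flow(x, PERMS)                      # encode
x_back = apply_flow(z, INVS)                  # decode with inverse tables
print(z.tolist(), bool((x == x_back).all()))  # exact round trip: True
```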
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly (a sampled forward recursion is sketched below).
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
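One plausible reading of the randomization, sketched on the HMM forward recursion: each step sums over a uniform sample of K of the N states, reweighted by N/K so the likelihood estimate stays unbiased. The model is random and this specific estimator is an assumption, not necessarily the paper's RDP family.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, K = 1000, 20, 100
A = rng.dirichlet(np.ones(N), size=N)       # row-stochastic transition matrix
emit = rng.uniform(0.1, 1.0, size=(T, N))   # per-step emission likelihoods

alpha = np.full(N, 1.0 / N) * emit[0]
alpha_hat = alpha.copy()
for t in range(1, T):
    alpha = (alpha @ A) * emit[t]                          # exact DP: O(N^2) per step
    S = rng.choice(N, size=K, replace=False)               # sampled source states
    alpha_hat = ((N / K) * alpha_hat[S] @ A[S]) * emit[t]  # randomized DP: O(K*N)
print(alpha.sum(), alpha_hat.sum())                        # exact vs. estimated likelihood
```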
- Robustifying Algorithms of Learning Latent Trees with Vector Variables [92.18777020401484]
We present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG).
We robustify RG, CLRG, Neighbor Joining (NJ) and Spectral NJ (SNJ) by using the truncated inner product (sketched in isolation after this entry).
We derive the first known instance-dependent impossibility result for structure learning of latent trees.
arXiv Detail & Related papers (2021-06-02T01:37:52Z)
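The robustification device named above, in isolation: clipping each coordinate-wise product at a threshold tau keeps a few corrupted coordinates from dominating the statistic. The threshold choice here is ad hoc, for illustration only.

```python
import numpy as np

def truncated_inner(u, v, tau):
    """Inner product with each summand clipped to [-tau, tau]."""
    return np.clip(u * v, -tau, tau).sum()

rng = np.random.default_rng(4)
u, v = rng.normal(size=500), rng.normal(size=500)
v_corrupt = v.copy(); v_corrupt[:3] += 1e3        # corrupt 3 coordinates
print(np.dot(u, v_corrupt))                        # blown up by the corruption
print(truncated_inner(u, v_corrupt, tau=2.0))      # stays near np.dot(u, v)
```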
- Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
Our algorithms exploit a fundamental connection between gradients and expectations, which we use to derive efficient inference procedures (made concrete after this entry).
arXiv Detail & Related papers (2020-08-29T14:58:26Z)
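The gradient-expectation connection can be made concrete with the Matrix-Tree theorem: for an edge-factored spanning tree distribution, log Z is a log-determinant, and its gradient with respect to the log edge weights is exactly the vector of edge marginals. A small PyTorch autograd sketch on an undirected 4-node graph (the paper treats non-projective dependency trees more generally):

```python
import torch

torch.manual_seed(0)
n = 4
theta = torch.randn(n, n, requires_grad=True)   # log edge weights
W = torch.triu(torch.exp(theta), diagonal=1)    # upper triangle = undirected edges
W = W + W.T                                     # symmetric, zero diagonal
L = torch.diag(W.sum(dim=1)) - W                # graph Laplacian
logZ = torch.logdet(L[1:, 1:])                  # Matrix-Tree theorem: log sum over trees
logZ.backward()
marginals = theta.grad                          # d logZ / d log w_e = P(edge e in tree)
print(marginals)                                # nonzero only in the upper triangle
print(marginals.sum())                          # = n - 1: a tree has exactly n-1 edges
```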
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances (a toy of the underlying search follows this entry).
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
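A toy of the paradigm, not MurTree's actual algorithm: exact search for an optimal depth-limited tree over binary features, memoized on the (instance subset, depth) subproblem. MurTree adds specialized DP and much stronger bounds; this is only the skeleton of "optimal trees via DP and search".

```python
from functools import lru_cache
import numpy as np

rng = np.random.default_rng(5)
X = rng.integers(0, 2, size=(60, 5))          # binary features
y = (X[:, 0] ^ X[:, 2]).astype(int)           # XOR target: needs depth 2

def leaf_cost(idx):
    labels = y[list(idx)]
    return min((labels == 0).sum(), (labels == 1).sum())

@lru_cache(maxsize=None)
def best(idx, depth):
    """Fewest misclassifications on instance set idx with a tree of <= depth."""
    cost = leaf_cost(idx)
    if depth == 0 or cost == 0:
        return cost
    for f in range(X.shape[1]):               # try every split feature
        left = tuple(i for i in idx if X[i, f] == 0)
        right = tuple(i for i in idx if X[i, f] == 1)
        if left and right:
            cost = min(cost, best(left, depth - 1) + best(right, depth - 1))
    return cost

print(best(tuple(range(len(X))), 2))          # 0: XOR is solvable at depth 2
```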
- Quantum Ensemble for Classification [2.064612766965483]
A powerful way to improve performance in machine learning is to construct an ensemble that combines the predictions of multiple models.
We propose a new quantum algorithm that exploits quantum superposition, entanglement and interference to build an ensemble of classification models.
arXiv Detail & Related papers (2020-07-02T11:26:54Z)
- Stochastic tree ensembles for regularized nonlinear regression [0.913755431537592]
This paper develops a novel tree ensemble method for nonlinear regression, which we refer to as XBART.
By combining regularization and search strategies from Bayesian modeling with computationally efficient techniques, the new method attains state-of-the-art performance.
arXiv Detail & Related papers (2020-02-09T14:37:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.