Single MCMC Chain Parallelisation on Decision Trees
- URL: http://arxiv.org/abs/2207.12688v1
- Date: Tue, 26 Jul 2022 07:07:51 GMT
- Title: Single MCMC Chain Parallelisation on Decision Trees
- Authors: Efthyvoulos Drousiotis, Paul G. Spirakis
- Abstract summary: We propose a method to parallelise a single MCMC decision tree chain on an average laptop or personal computer.
Experiments showed that we could achieve an 18 times faster running time while the serial and parallel implementations remain statistically identical.
- Score: 0.9137554315375919
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decision trees are widely used in machine learning and often achieve
state-of-the-art performance. Despite that, well-known variants like CART, ID3,
random forest, and boosted trees lack a probabilistic version that encodes prior
assumptions about tree structures and shares statistical strength between node
parameters. Existing work on Bayesian decision trees depends on Markov Chain
Monte Carlo (MCMC), which can be computationally slow, especially on high
dimensional data and with expensive proposals. In this study, we propose a method
to parallelise a single MCMC decision tree chain on an average laptop or personal
computer, reducing its run time through multi-core processing while keeping the
results statistically identical to the conventional sequential implementation. We
also calculate the theoretical and practical reduction in run time that can be
obtained by applying our method on multi-processor architectures. Experiments
showed that we could achieve an 18 times faster running time while the serial and
parallel implementations remain statistically identical.
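The abstract does not spell out the parallelisation scheme, but one way a single chain can exploit multiple cores without changing its statistics is to keep the Metropolis-Hastings accept/reject step serial and farm out only the expensive likelihood evaluation over shards of the data. The Python sketch below illustrates that general idea under stated assumptions; the `propose`, `tree.log_likelihood`, and `tree.log_prior` hooks are hypothetical placeholders, and this is not the authors' implementation.

```python
# Illustrative sketch only (not the paper's code): a single Metropolis-Hastings
# chain over decision trees in which the per-shard likelihood terms are computed
# by worker processes. The accept/reject logic stays serial, so the chain
# targets the same posterior as a fully sequential run.
import numpy as np
from multiprocessing import Pool

def shard_log_likelihood(args):
    tree, shard = args
    # Hypothetical per-shard likelihood of a Bayesian decision tree.
    return tree.log_likelihood(shard)

def log_posterior(tree, shards, pool):
    # Sum the shard likelihoods (computed in parallel) and add the tree prior.
    parts = pool.map(shard_log_likelihood, [(tree, s) for s in shards])
    return sum(parts) + tree.log_prior()

def run_chain(init_tree, propose, data, n_iters, n_workers=4, seed=0):
    rng = np.random.default_rng(seed)
    shards = np.array_split(data, n_workers)
    tree, samples = init_tree, []
    with Pool(n_workers) as pool:
        current_lp = log_posterior(tree, shards, pool)
        for _ in range(n_iters):
            candidate = propose(tree, rng)  # e.g. grow/prune/change/swap move
            candidate_lp = log_posterior(candidate, shards, pool)
            # A symmetric proposal is assumed for brevity; tree moves are usually
            # asymmetric, so a real sampler must include the proposal ratio.
            if np.log(rng.uniform()) < candidate_lp - current_lp:
                tree, current_lp = candidate, candidate_lp
            samples.append(tree)
    return samples
```

Because only likelihood arithmetic is distributed and every accept/reject decision is taken exactly as in the serial sampler, a scheme of this kind keeps the parallel and serial runs statistically identical; the paper's actual mechanism and its 18 times speed-up may rely on a different decomposition.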
Related papers
- Terminating Differentiable Tree Experts [77.2443883991608]
We propose a neuro-symbolic Differentiable Tree Machine that learns tree operations using a combination of transformers and Representation Products.
We first remove a series of different transformer layers that are used in every step by introducing a mixture of experts.
We additionally propose a new termination algorithm that gives the model the power to choose automatically how many steps to take.
arXiv Detail & Related papers (2024-07-02T08:45:38Z)
- TREE: Tree Regularization for Efficient Execution [4.205565040528205]
We present a method to reduce path lengths by rewarding uneven probability distributions during the training of decision trees.
Specifically, we regularize the impurity of the CART algorithm in order to favor not only low impurity but also highly asymmetric distributions for the evaluation of split criteria (an illustrative sketch of such a regularised split score appears after this list).
arXiv Detail & Related papers (2024-06-18T12:01:06Z)
- Des-q: a quantum algorithm to provably speedup retraining of decision trees [2.7262923206583136]
We introduce Des-q, a novel quantum algorithm to construct and retrain decision trees for regression and binary classification tasks.
We benchmark the simulated version of Des-q against the state-of-the-art classical methods on multiple data sets.
Our algorithm exhibits similar performance to the state-of-the-art decision trees while significantly speeding up the periodic tree retraining.
arXiv Detail & Related papers (2023-09-18T17:56:08Z)
- Bayesian Decision Trees Inspired from Evolutionary Algorithms [64.80360020499555]
We propose a replacement of Markov Chain Monte Carlo (MCMC) with an inherently parallel algorithm, Sequential Monte Carlo (SMC).
Experiments show that SMC combined with Evolutionary Algorithms (EA) can produce more accurate results than MCMC in 100 times fewer iterations (a generic SMC skeleton is sketched after this list).
arXiv Detail & Related papers (2023-05-30T06:17:35Z)
- Parallel Tree Kernel Computation [0.0]
We devise a parallel implementation of the sequential algorithm for the computation of some tree kernels of two finite sets of trees.
Results show that the proposed parallel algorithm outperforms the sequential one in terms of latency.
arXiv Detail & Related papers (2023-05-12T18:16:45Z)
- Parallel Approaches to Accelerate Bayesian Decision Trees [1.9728521995447947]
We propose two methods for exploiting parallelism in the MCMC.
In the first, we replace the MCMC with another numerical Bayesian approach.
In the second, we consider data partitioning.
arXiv Detail & Related papers (2023-01-22T09:56:26Z)
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- On Effective Parallelization of Monte Carlo Tree Search [51.15940034629022]
Monte Carlo Tree Search (MCTS) is computationally expensive as it requires a substantial number of rollouts to construct the search tree.
How to design effective parallel MCTS algorithms has not been systematically studied and remains poorly understood.
We demonstrate how the proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.
arXiv Detail & Related papers (2020-06-15T21:36:00Z)
- Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits [99.59941892183454]
We propose Einsum Networks (EiNets), a novel implementation design for PCs.
At their core, EiNets combine a large number of arithmetic operations in a single monolithic einsum operation (a toy einsum fusion example appears after this list).
We show that the implementation of Expectation-Maximization (EM) can be simplified for PCs, by leveraging automatic differentiation.
arXiv Detail & Related papers (2020-04-13T23:09:15Z)
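For the TREE entry above: the summary describes regularising the CART impurity so that splits sending an uneven share of samples to one side are rewarded. The minimal sketch below is one hypothetical reading of that idea; the Gini impurity base and the `lam` weight are assumptions, not the regulariser actually used in the paper.

```python
import numpy as np

def gini(labels):
    # Standard Gini impurity of a label vector.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def regularised_split_score(left_labels, right_labels, lam=0.1):
    # Weighted child impurity of a candidate split, minus a reward for sending
    # an uneven share of samples to one side (shorter expected paths at
    # inference time). `lam` is an assumed trade-off weight.
    n_left, n_right = len(left_labels), len(right_labels)
    n = n_left + n_right
    impurity = (n_left / n) * gini(left_labels) + (n_right / n) * gini(right_labels)
    asymmetry = abs(n_left - n_right) / n  # 0 for a 50/50 split, near 1 if very uneven
    return impurity - lam * asymmetry      # lower is better, as in CART
```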
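For the Bayesian Decision Trees Inspired from Evolutionary Algorithms entry: the summary proposes replacing a single MCMC chain with Sequential Monte Carlo over a population of trees. The skeleton below shows the standard SMC reweight-resample-mutate pattern under assumed `log_likelihood` and `mutate` hooks; it is a generic sketch, not the paper's algorithm.

```python
import numpy as np

def smc_over_trees(initial_trees, mutate, log_likelihood, data_batches, rng):
    # Generic SMC skeleton over a population of tree "particles" (illustrative,
    # not the paper's algorithm): reweight on each data batch, resample, mutate.
    particles = list(initial_trees)
    n = len(particles)
    for batch in data_batches:
        # Reweight particles by how well they explain the new batch; this loop
        # is embarrassingly parallel, which is the appeal over a single chain.
        log_w = np.array([log_likelihood(t, batch) for t in particles])
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Multinomial resampling keeps promising trees and drops poor ones.
        idx = rng.choice(n, size=n, p=w)
        particles = [particles[i] for i in idx]
        # Rejuvenation step, e.g. EA-style grow/prune/swap mutations.
        particles = [mutate(t, rng) for t in particles]
    return particles
```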
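For the Einsum Networks entry: the summary's point is that many per-node arithmetic operations can be fused into a single einsum. The toy example below expresses one layer of mixtures over products of two children as a single `np.einsum` call; the shapes and weight layout are illustrative assumptions, not the EiNets layer definition.

```python
import numpy as np

# Hypothetical shapes: N samples, two children per product node, K input
# densities per child, M mixture components in the output layer.
N, K, M = 32, 4, 3
rng = np.random.default_rng(0)

# Log-densities of the K distributions attached to the left and right child.
left_log_p = rng.standard_normal((N, K))
right_log_p = rng.standard_normal((N, K))

# One weight tensor for the whole layer: M mixtures over all K*K child pairs.
weights = rng.random((M, K, K))
weights /= weights.sum(axis=(1, 2), keepdims=True)

# Products of children and all M mixtures are evaluated for every sample in one
# fused einsum instead of looping over individual sum/product nodes.
# (Real implementations typically stay in log-space for numerical stability.)
layer_out = np.einsum('ni,nj,mij->nm', np.exp(left_log_p), np.exp(right_log_p), weights)
print(layer_out.shape)  # (32, 3)
```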
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.