Parallel Approaches to Accelerate Bayesian Decision Trees
- URL: http://arxiv.org/abs/2301.09090v1
- Date: Sun, 22 Jan 2023 09:56:26 GMT
- Title: Parallel Approaches to Accelerate Bayesian Decision Trees
- Authors: Efthyvoulos Drousiotis, Paul G. Spirakis, and Simon Maskell
- Abstract summary: We propose two methods for exploiting parallelism in the MCMC.
In the first, we replace the MCMC with another numerical Bayesian approach.
In the second, we consider data partitioning.
- Score: 1.9728521995447947
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Markov Chain Monte Carlo (MCMC) is a well-established family of algorithms
primarily used in Bayesian statistics to sample from a target distribution when
direct sampling is challenging. Existing work on Bayesian decision trees uses
MCMC. Unfortunately, this can be slow, especially when considering large
volumes of data. It is hard to parallelise the accept-reject component of the
MCMC. Nonetheless, we propose two methods for exploiting parallelism in the
MCMC: in the first, we replace the MCMC with another numerical Bayesian
approach, the Sequential Monte Carlo (SMC) sampler, which has the appealing
property that it is an inherently parallel algorithm; in the second, we
consider data partitioning. Both methods use multi-core processing with a
High-Performance Computing (HPC) resource. We test the two methods in various
study settings to determine which method is the most beneficial for each test
case. Experiments show that data partitioning has limited utility in the
settings we consider and that the use of the SMC sampler can improve run-time
(compared to the sequential implementation) by up to a factor of 343.
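The abstract's key point is that an SMC sampler is inherently parallel: each particle's importance weight depends only on that particle, so the weighting step is an embarrassingly parallel map, with only the normalisation and resampling steps requiring synchronisation. The following is a minimal sketch of one such SMC weighting-and-resampling step, using a standard normal as a stand-in target (the paper's actual target is a Bayesian decision-tree posterior); the function names and the multiprocessing setup are illustrative assumptions, not the authors' implementation.

```python
import math
import random
from multiprocessing import Pool

def log_target(x):
    # Hypothetical target: standard normal log-density, standing in for
    # the decision-tree posterior used in the paper.
    return -0.5 * x * x - 0.5 * math.log(2 * math.pi)

def log_proposal(x):
    # Wider normal proposal (sigma = 2) from which particles were drawn.
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0) - 0.5 * math.log(2 * math.pi)

def importance_weight(x):
    # Each particle's weight is independent of all others, so this map
    # parallelises trivially -- the property the abstract highlights.
    return math.exp(log_target(x) - log_proposal(x))

def smc_step(particles, n_workers=4):
    # Parallel weighting across worker processes.
    with Pool(n_workers) as pool:
        weights = pool.map(importance_weight, particles)
    # Normalisation and resampling are the only synchronisation points.
    total = sum(weights)
    norm = [w / total for w in weights]
    resampled = random.choices(particles, weights=norm, k=len(particles))
    return resampled, norm

if __name__ == "__main__":
    random.seed(0)
    particles = [random.gauss(0.0, 2.0) for _ in range(1000)]
    resampled, norm = smc_step(particles)
    print(len(resampled), abs(sum(norm) - 1.0) < 1e-9)
```

Because only the reduce (normalise) and resample steps need all particles, adding workers speeds up the dominant weighting cost almost linearly, which is consistent with the large speed-ups the abstract reports on an HPC resource.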
Related papers
- SpreadNUTS -- Moderate Dynamic Extension of Paths for No-U-Turn Sampling
& Partitioning Visited Regions [0.0]
This paper introduces modifications to a specific Hamiltonian Monte Carlo (HMC) algorithm known as the no-U-turn sampler (NUTS)
SpreadNUTS aims to explore the sample space faster than NUTS, yielding a sampler that converges to the true distribution faster than NUTS.
arXiv Detail & Related papers (2023-07-09T05:00:25Z) - Bayesian Decision Trees Inspired from Evolutionary Algorithms [64.80360020499555]
We propose a replacement of the Markov Chain Monte Carlo (MCMC) with an inherently parallel algorithm, the Sequential Monte Carlo (SMC)
Experiments show that SMC combined with the Evolutionary Algorithms (EA) can produce more accurate results compared to MCMC in 100 times fewer iterations.
arXiv Detail & Related papers (2023-05-30T06:17:35Z) - Learning Hidden Markov Models Using Conditional Samples [72.20944611510198]
This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM)
In this paper, we consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs.
Specifically, we obtain efficient algorithms for learning HMMs in settings where we have query access to the exact conditional probabilities.
arXiv Detail & Related papers (2023-02-28T16:53:41Z) - Single MCMC Chain Parallelisation on Decision Trees [0.9137554315375919]
We propose a method to parallelise a single MCMC decision tree chain on an average laptop or personal computer.
Experiments showed that we could achieve a running time up to 18 times faster while the serial and the parallel implementations remain statistically identical.
arXiv Detail & Related papers (2022-07-26T07:07:51Z) - Knowledge Removal in Sampling-based Bayesian Inference [86.14397783398711]
When even a single data-deletion request arrives, companies may need to delete entire models learned with massive resources.
Existing works propose methods to remove knowledge learned from data for explicitly parameterized models.
In this paper, we propose the first machine unlearning algorithm for MCMC.
arXiv Detail & Related papers (2022-03-24T10:03:01Z) - Compressed Monte Carlo with application in particle filtering [11.84836209560411]
We introduce the theory and practice of a Compressed MC (C-MC) scheme to compress the statistical information contained in a set of random samples.
C-MC is useful within particle filtering and adaptive IS algorithms, as shown by three novel schemes introduced in this work.
arXiv Detail & Related papers (2021-07-18T14:32:04Z) - DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm [21.128416842467132]
We derive a user-friendly centralised distributed MCMC algorithm with provable scaling in high-dimensional settings.
We illustrate the relevance of the proposed methodology on both synthetic and real data experiments.
arXiv Detail & Related papers (2021-06-11T10:37:14Z) - Variational Combinatorial Sequential Monte Carlo Methods for Bayesian
Phylogenetic Inference [4.339931151475307]
We introduce Variational Combinatorial Sequential Monte Carlo (VCSMC), a powerful framework that uses variational search to learn distributions over intricate structures.
We show that VCSMC and CSMC are efficient and explore higher probability spaces than existing methods on a range of tasks.
arXiv Detail & Related papers (2021-05-31T19:44:24Z) - CoCoMoT: Conformance Checking of Multi-Perspective Processes via SMT
(Extended Version) [62.96267257163426]
We introduce the CoCoMoT (Computing Conformance Modulo Theories) framework.
First, we show how SAT-based encodings studied in the pure control-flow setting can be lifted to our data-aware case.
Second, we introduce a novel preprocessing technique based on a notion of property-preserving clustering.
arXiv Detail & Related papers (2021-03-18T20:22:50Z) - Kernel learning approaches for summarising and combining posterior
similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
arXiv Detail & Related papers (2020-09-27T14:16:14Z) - Involutive MCMC: a Unifying Framework [64.46316409766764]
We describe a wide range of MCMC algorithms in terms of iMCMC.
We formulate a number of "tricks" which one can use as design principles for developing new MCMC algorithms.
We demonstrate the latter with two examples where we transform known reversible MCMC algorithms into more efficient irreversible ones.
arXiv Detail & Related papers (2020-06-30T10:21:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.