Stochastic geometry to generalize the Mondrian Process
- URL: http://arxiv.org/abs/2002.00797v3
- Date: Mon, 13 Sep 2021 18:04:55 GMT
- Title: Stochastic geometry to generalize the Mondrian Process
- Authors: Eliza O'Reilly and Ngoc Tran
- Abstract summary: We show that a STIT process with cut directions drawn from a discrete distribution can be efficiently simulated by lifting to a higher dimensional axis-aligned Mondrian process.
We also characterize all possible kernels that stationary STIT processes and their mixtures can approximate.
This allows for precise comparisons between the Mondrian forest, the Mondrian kernel and the Laplace kernel in density estimation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The stable under iterated tessellation (STIT) process is a stochastic process
that produces a recursive partition of space with cut directions drawn
independently from a distribution over the sphere. The case of random
axis-aligned cuts is known as the Mondrian process. Random forests and Laplace
kernel approximations built from the Mondrian process have led to efficient
online learning methods and Bayesian optimization. In this work, we utilize
tools from stochastic geometry to resolve some fundamental questions concerning
STIT processes in machine learning. First, we show that a STIT process with cut
directions drawn from a discrete distribution can be efficiently simulated by
lifting to a higher dimensional axis-aligned Mondrian process. Second, we
characterize all possible kernels that stationary STIT processes and their
mixtures can approximate. We also give a uniform convergence rate for the
approximation error of the STIT kernels to the targeted kernels, generalizing
the work of [3] for the Mondrian case. Third, we obtain consistency results for
STIT forests in density estimation and regression. Finally, we give a formula
for the density estimator arising from an infinite STIT random forest. This
allows for precise comparisons between the Mondrian forest, the Mondrian kernel
and the Laplace kernel in density estimation. Our paper calls for further
developments at the novel intersection of stochastic geometry and machine
learning.
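As a concrete illustration of the axis-aligned (Mondrian) special case discussed above, the following minimal Python sketch, which is not taken from the paper, estimates by Monte Carlo the probability that two points fall in the same cell of a Mondrian process with lifetime lambda, and checks it against the Laplace kernel exp(-lambda * ||x - y||_1) that Mondrian partitions are known to approximate (the result of [3] that this paper generalizes to STIT processes). The helper names same_cell and mondrian_kernel_mc, the unit-box domain, and the parameter values are illustrative assumptions.

import numpy as np

def same_cell(x, y, low, high, budget, rng):
    # True if x and y share a cell of a Mondrian process on the box
    # [low, high] run until the remaining lifetime `budget` expires.
    lengths = high - low
    total = lengths.sum()
    t = rng.exponential(1.0 / total)  # first cut time: rate = linear dimension
    if t > budget:
        return True  # lifetime expires before any further cut
    # Cut dimension chosen proportionally to side length; position uniform.
    j = rng.choice(len(lengths), p=lengths / total)
    c = rng.uniform(low[j], high[j])
    if (x[j] < c) != (y[j] < c):
        return False  # this cut separates the two points
    low, high = low.copy(), high.copy()
    if x[j] < c:  # recurse into the half-box containing both points
        high[j] = c
    else:
        low[j] = c
    return same_cell(x, y, low, high, budget - t, rng)

def mondrian_kernel_mc(x, y, lam, n_samples=20000, seed=0):
    # Monte Carlo estimate of Pr[x and y share a cell] on the unit box.
    rng = np.random.default_rng(seed)
    low, high = np.zeros_like(x), np.ones_like(x)
    hits = sum(same_cell(x, y, low, high, lam, rng) for _ in range(n_samples))
    return hits / n_samples

x, y, lam = np.array([0.2, 0.7]), np.array([0.5, 0.4]), 2.0
print("Monte Carlo estimate:", mondrian_kernel_mc(x, y, lam))
print("Laplace kernel:      ", np.exp(-lam * np.abs(x - y).sum()))

By the paper's first result, a STIT process whose cut directions come from a discrete distribution could in principle be simulated by running an axis-aligned sampler like this one in a lifted, higher-dimensional coordinate system, though that lifting is not implemented in this sketch.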
Related papers
- Score matching for sub-Riemannian bridge sampling [2.048226951354646]
Recent machine learning techniques can be adapted to train score approximators on sub-Riemannian gradients.
We perform numerical experiments exemplifying samples from the bridge process on the Heisenberg group and the concentration of this process for small time.
arXiv Detail & Related papers (2024-04-23T17:45:53Z) - Posterior Contraction Rates for Matérn Gaussian Processes on Riemannian Manifolds [51.68005047958965]
We show that intrinsic Gaussian processes can achieve better performance in practice.
Finer-grained analyses are needed to distinguish between different levels of data efficiency.
arXiv Detail & Related papers (2023-09-19T20:30:58Z) - Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z) - Gaussian Processes and Statistical Decision-making in Non-Euclidean Spaces [96.53463532832939]
We develop techniques for broadening the applicability of Gaussian processes.
We introduce a wide class of efficient approximations built from this viewpoint.
We develop a collection of Gaussian process models over non-Euclidean spaces.
arXiv Detail & Related papers (2022-02-22T01:42:57Z) - Shape Modeling with Spline Partitions [3.222802562733787]
We propose a novel parallelized Bayesian nonparametric approach to partitioning a domain with curves, enabling complex data shapes to be captured.
We apply our method to an HIV-1-infected human macrophage image dataset, as well as to simulated datasets, to illustrate our approach.
arXiv Detail & Related papers (2021-08-05T10:33:05Z) - A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z) - Sparse Algorithms for Markovian Gaussian Processes [18.999495374836584]
Sparse Markovian Gaussian processes combine the use of inducing variables with efficient Kalman-filter-like recursions.
We derive a general site-based approach to approximate the non-Gaussian likelihood with local Gaussian terms, called sites.
Our approach results in a suite of novel sparse extensions to algorithms from both the machine learning and signal processing literatures, including variational inference, expectation propagation, and classical nonlinear Kalman smoothers.
The derived methods are suited to spatio-temporal data, where the model has separate inducing points in both time and space.
arXiv Detail & Related papers (2021-03-19T09:50:53Z) - Pathwise Conditioning of Gaussian Processes [72.61885354624604]
Conventional approaches for simulating Gaussian process posteriors view samples as draws from marginal distributions of process values at finite sets of input locations.
This distribution-centric characterization leads to generative strategies that scale cubically in the size of the desired random vector.
We show how this pathwise interpretation of conditioning gives rise to a general family of approximations that lend themselves to efficiently sampling Gaussian process posteriors.
arXiv Detail & Related papers (2020-11-08T17:09:37Z) - Schoenberg-Rao distances: Entropy-based and geometry-aware statistical Hilbert distances [12.729120803225065]
We study a class of statistical Hilbert distances that we term the Schoenberg-Rao distances.
We derive novel closed-form distances between mixtures of Gaussian distributions.
Our method constitutes a practical alternative to Wasserstein distances and we illustrate its efficiency on a broad range of machine learning tasks.
arXiv Detail & Related papers (2020-02-19T18:48:33Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)