Determinantal Point Processes in Randomized Numerical Linear Algebra
- URL: http://arxiv.org/abs/2005.03185v1
- Date: Thu, 7 May 2020 00:39:52 GMT
- Title: Determinantal Point Processes in Randomized Numerical Linear Algebra
- Authors: Michał Dereziński and Michael W. Mahoney
- Abstract summary: Randomized Numerical Linear Algebra (RandNLA) uses randomness to develop improved algorithms for matrix problems that arise in scientific computing, data science, machine learning, etc.
Recent work has uncovered deep and fruitful connections between DPPs and RandNLA which lead to new guarantees and improved algorithms.
- Score: 80.27102478796613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Randomized Numerical Linear Algebra (RandNLA) uses randomness to develop
improved algorithms for matrix problems that arise in scientific computing,
data science, machine learning, etc. Determinantal Point Processes (DPPs), a
seemingly unrelated topic in pure and applied mathematics, are a class of
stochastic point processes with probability distributions characterized by
sub-determinants of a kernel matrix. Recent work has uncovered deep and
fruitful connections between DPPs and RandNLA which lead to new guarantees and
improved algorithms that are of interest to both areas. We provide an overview
of this exciting new line of research, including brief introductions to RandNLA
and DPPs, as well as applications of DPPs to classical linear algebra tasks
such as least squares regression, low-rank approximation and the Nyström
method. For example, random sampling with a DPP leads to new kinds of unbiased
estimators for least squares, enabling more refined statistical and inferential
understanding of these algorithms; a DPP is, in some sense, an optimal
randomized algorithm for the Nyström method; and a RandNLA technique called
leverage score sampling can be derived as the marginal distribution of a DPP.
We also discuss recent algorithmic developments, illustrating that, while not
quite as efficient as standard RandNLA techniques, DPP-based algorithms are
only moderately more expensive.
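As a minimal illustrative sketch of the leverage-score connection mentioned in the abstract (a toy example constructed here, not code from the paper): for the projection DPP over the rows of a tall matrix whose marginal kernel is the hat matrix, the marginal inclusion probability of row i is exactly its statistical leverage score.

```python
# Toy check: leverage scores of a tall matrix A equal the diagonal of
# the hat matrix H = A (A^T A)^{-1} A^T, which is the marginal kernel
# of a projection DPP over A's rows.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 3
A = rng.standard_normal((n, d))

# Leverage scores via a thin QR factorization: l_i = ||Q[i, :]||^2.
Q, _ = np.linalg.qr(A)
leverage = (Q ** 2).sum(axis=1)

# The same quantities as the diagonal of the hat matrix,
# i.e. the DPP marginals P(i in S) = H_ii.
H = A @ np.linalg.solve(A.T @ A, A.T)
assert np.allclose(leverage, np.diag(H))

# Leverage scores sum to rank(A) = d, the expected DPP sample size.
assert np.isclose(leverage.sum(), d)
```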
Related papers
- Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning [49.0767291348921]
Randomized Numerical Linear Algebra (RandNLA) is an area which uses randomness to develop improved algorithms for ubiquitous matrix problems.
This article provides a self-contained overview of RandNLA, in light of these developments.
arXiv Detail & Related papers (2024-06-17T02:30:55Z) - Online Learning Under A Separable Stochastic Approximation Framework [20.26530917721778]
We propose an online learning algorithm for a class of machine learning models under a separable approximation framework.
We show that the proposed algorithm produces more robust training and test performance when compared to other popular learning algorithms.
arXiv Detail & Related papers (2023-05-12T13:53:03Z) - Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes [45.40646054226403]
A determinantal point process (DPP) assigns a probability to every subset of $n$ items.
Recent work has studied an approximate Markov chain Monte Carlo (MCMC) sampling algorithm for nonsymmetric DPPs (NDPPs) restricted to size-$k$ subsets.
We develop a scalable MCMC sampling algorithm for $k$-NDPPs with low-rank kernels.
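The subset probabilities mentioned above can be checked by brute force for a tiny L-ensemble DPP (a hypothetical toy example, not the paper's scalable algorithm): each subset $S$ gets probability $\det(L_S)/\det(L+I)$, so the probabilities over all $2^n$ subsets sum to 1.

```python
# Toy brute-force verification of the L-ensemble DPP normalization:
# P(S) = det(L_S) / det(L + I), summed over all subsets, equals 1.
import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 5
B = rng.standard_normal((n, n))
L = B @ B.T  # a symmetric positive semidefinite kernel

Z = np.linalg.det(L + np.eye(n))  # normalization constant det(L + I)
probs = []
for k in range(n + 1):
    for S in itertools.combinations(range(n), k):
        idx = np.array(S, dtype=int)
        sub = L[np.ix_(idx, idx)]  # principal submatrix L_S
        # NumPy's det of the 0x0 matrix is 1, matching P(empty) = 1/Z.
        probs.append(np.linalg.det(sub) / Z)

assert np.isclose(sum(probs), 1.0)
```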
arXiv Detail & Related papers (2022-07-01T15:22:12Z) - Generalization Bounds for Data-Driven Numerical Linear Algebra [24.961270871124217]
Data-driven algorithms can adapt their internal structure or parameters to inputs from unknown application-specific distributions, by learning from a training sample of inputs.
Several recent works have applied this approach to problems in numerical linear algebra, obtaining significant empirical gains in performance.
In this work we prove generalization bounds for those algorithms, within the PAC-learning framework for data-driven algorithm selection proposed by Gupta and Roughgarden.
arXiv Detail & Related papers (2022-06-16T02:23:45Z) - Photonic co-processors in HPC: using LightOn OPUs for Randomized
Numerical Linear Algebra [53.13961454500934]
We show that the randomization step for dimensionality reduction may itself become the computational bottleneck on traditional hardware.
We show that randomization can be significantly accelerated, at negligible precision loss, in a wide range of important RandNLA algorithms.
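The randomization step in question can be sketched in a few lines (an assumed generic setup, not tied to the LightOn OPU hardware): classical sketch-and-solve least squares applies a random Gaussian projection to compress an $n \times d$ problem to $m \times d$ rows before solving.

```python
# Sketch-and-solve least squares with a dense Gaussian random
# projection; this dense matrix multiply S @ A is the randomization
# step that can dominate the cost on traditional hardware.
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 2000, 10, 200
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

# Exact least-squares solution for reference.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Gaussian sketch: compress to m rows, then solve the small problem.
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

# The sketched solution's residual on the ORIGINAL problem is within
# a small factor of the optimal residual.
r_exact = np.linalg.norm(A @ x_exact - b)
r_sketch = np.linalg.norm(A @ x_sketch - b)
assert r_sketch <= 2.0 * r_exact
```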
arXiv Detail & Related papers (2021-04-29T15:48:52Z) - A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise [55.45544638410858]
We study the problem of PAC learning homogeneous halfspaces in the presence of Tsybakov noise.
Our algorithm learns the true halfspace within any accuracy $\epsilon$.
arXiv Detail & Related papers (2020-10-04T22:19:06Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning.
Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms.
We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient methods.
arXiv Detail & Related papers (2020-07-01T18:43:32Z) - Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point
Processes [29.56615511158156]
We develop a learning algorithm with space and time requirements linear in the number of items $M$ by introducing a new NDPP kernel decomposition.
We also derive a linear-complexity NDPP maximum a posteriori (MAP) inference algorithm that applies not only to our new kernel but also to that of prior work.
arXiv Detail & Related papers (2020-06-17T13:42:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.