A Matrix Chernoff Bound for Markov Chains and Its Application to
Co-occurrence Matrices
- URL: http://arxiv.org/abs/2008.02464v2
- Date: Thu, 29 Oct 2020 14:01:27 GMT
- Title: A Matrix Chernoff Bound for Markov Chains and Its Application to
Co-occurrence Matrices
- Authors: Jiezhong Qiu, Chi Wang, Ben Liao, Richard Peng, Jie Tang
- Abstract summary: We prove a Chernoff-type bound for sums of matrix-valued random variables sampled via a regular (aperiodic and irreducible) finite Markov chain.
Our proof is based on the matrix expander (regular undirected graph) Chernoff bound [Garg et al. STOC '18] and scalar Chernoff-Hoeffding bounds for Markov chains.
Our matrix Chernoff bound for Markov chains can be applied to analyze the behavior of co-occurrence statistics for sequential data.
- Score: 30.49132891333353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove a Chernoff-type bound for sums of matrix-valued random variables
sampled via a regular (aperiodic and irreducible) finite Markov chain.
Specifically, consider a random walk on a regular Markov chain and a Hermitian
matrix-valued function on its state space. Our result gives exponentially
decreasing bounds on the tail distributions of the extreme eigenvalues of the
sample mean matrix. Our proof is based on the matrix expander (regular
undirected graph) Chernoff bound [Garg et al. STOC '18] and scalar
Chernoff-Hoeffding bounds for Markov chains [Chung et al. STACS '12].
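Schematically, for the stationary distribution $\pi$, a Hermitian matrix-valued
function $f$ satisfying $\mathbb{E}_{v \sim \pi} f(v) = 0$ and $\|f(v)\| \le 1$,
and a walk $v_1, \dots, v_k$ on a chain with mixing time $\tau$, the bound has
the following shape for $\epsilon \in (0,1)$ (absolute constants and the exact
polynomial dependence on the matrix dimension $d$ are omitted here; see the
paper for the precise statement):

$$\Pr\left[\lambda_{\max}\left(\frac{1}{k}\sum_{j=1}^{k} f(v_j)\right) \ge \epsilon\right] \le d \cdot \exp\left(-\Omega\left(\frac{\epsilon^2 k}{\tau}\right)\right),$$

and symmetrically for $\lambda_{\min}$.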
Our matrix Chernoff bound for Markov chains can be applied to analyze the
behavior of co-occurrence statistics for sequential data, which are common
and important data signals in machine learning. We show that, given a
regular Markov chain with $n$ states and mixing time $\tau$, a trajectory of
length $O(\tau (\log{(n)}+\log{(\tau)})/\epsilon^2)$ suffices to achieve an
estimator of the co-occurrence matrix with error bound $\epsilon$. We
conduct several experiments, and the results are consistent with the
exponentially fast convergence rate predicted by the theoretical analysis.
Our result gives the first bound on the convergence rate of estimating the
co-occurrence matrix and the first sample complexity analysis in graph
representation learning.
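To make the co-occurrence application concrete, below is a minimal,
hypothetical Python sketch (not the authors' code). It assumes the
DeepWalk-style co-occurrence matrix $C = \frac{1}{2T}\sum_{r=1}^{T}\left(\Pi P^r + (P^r)^\top \Pi\right)$
with $\Pi = \mathrm{diag}(\pi)$ and window size $T$, estimates $C$ from a
single trajectory, and reports the spectral-norm error.

import numpy as np

def stationary(P):
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    return pi / pi.sum()

def true_cooccurrence(P, T):
    # C = (1/2T) * sum_{r=1..T} (Pi P^r + (P^r)^T Pi), Pi = diag(pi).
    # (Assumed DeepWalk-style definition; see the paper for the exact object.)
    n = P.shape[0]
    Pi = np.diag(stationary(P))
    C, Pr = np.zeros((n, n)), np.eye(n)
    for _ in range(T):
        Pr = Pr @ P
        C += Pi @ Pr + Pr.T @ Pi
    return C / (2 * T)

def empirical_cooccurrence(walk, n, T):
    # Symmetrized co-occurrence counts over window offsets 1..T,
    # normalized so the expectation matches true_cooccurrence.
    C, L = np.zeros((n, n)), len(walk)
    for i in range(L - T):
        for r in range(1, T + 1):
            C[walk[i], walk[i + r]] += 1.0
            C[walk[i + r], walk[i]] += 1.0
    return C / (2 * T * (L - T))

# Demo on a small regular (irreducible, aperiodic) 3-state chain.
rng = np.random.default_rng(0)
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])
T, L = 2, 100_000
walk = [rng.choice(3, p=stationary(P))]
for _ in range(L - 1):
    walk.append(rng.choice(3, p=P[walk[-1]]))
err = np.linalg.norm(empirical_cooccurrence(walk, 3, T) - true_cooccurrence(P, T), 2)
print(f"spectral-norm error with L={L}: {err:.4f}")

Since the error should scale like $1/\sqrt{L}$ in the trajectory length $L$
(matching the $1/\epsilon^2$ dependence above), quadrupling $L$ should roughly
halve the reported error.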
Related papers
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
arXiv Detail & Related papers (2024-05-23T12:26:25Z) - Re-Analyze Gauss: Bounds for Private Matrix Approximation via Dyson
Brownian Motion [28.431572772564518]
Given a symmetric matrix $M$ and a vector $\lambda$, we present new bounds on the Frobenius-distance utility of the Gaussian mechanism for approximating $M$ by a matrix whose spectrum is $\lambda$.
Our bounds depend on both $\lambda$ and the gaps in the eigenvalues of $M$, and hold whenever the top $k+1$ eigenvalues of $M$ have sufficiently large gaps.
arXiv Detail & Related papers (2022-11-11T18:54:01Z) - A rapidly mixing Markov chain from any gapped quantum many-body system [2.321323878201932]
We consider the problem of sampling a bit string $x$ from a distribution $\pi(x)=|\langle x|\psi\rangle|^2$, where $\psi$ is the unique ground state of a local Hamiltonian $H$.
Our main result describes a direct link between the inverse spectral gap of $H$ and the mixing time of an associated continuous-time Markov Chain with steady state $pi$.
arXiv Detail & Related papers (2022-07-14T16:38:42Z) - An Equivalence Principle for the Spectrum of Random Inner-Product Kernel
Matrices with Polynomial Scalings [21.727073594338297]
This study is motivated by applications in machine learning and statistics.
We establish the weak limit of the empirical spectral distribution of these random matrices in a scaling regime.
This limit can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law.
arXiv Detail & Related papers (2022-05-12T18:50:21Z) - Leverage Score Sampling for Tensor Product Matrices in Input Sparsity
Time [54.65688986250061]
We give an input sparsity time sampling algorithm for approximating the Gram matrix corresponding to the $q$-fold column-wise tensor product of $q$ matrices.
Our sampling technique relies on a collection of $q$ partially correlated random projections which can be simultaneously applied to a dataset $X$ in input sparsity time.
arXiv Detail & Related papers (2022-02-09T15:26:03Z) - When Random Tensors meet Random Matrices [50.568841545067144]
This paper studies asymmetric order-$d$ spiked tensor models with Gaussian noise.
We show that the analysis of the considered model boils down to the analysis of an equivalent spiked symmetric block-wise random matrix.
arXiv Detail & Related papers (2021-12-23T04:05:01Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - On the Stability of Random Matrix Product with Markovian Noise:
Application to Linear Stochastic Approximation and TD Learning [33.24726879325671]
This paper studies the exponential stability of random matrix products driven by a general (possibly) state space Markov chain.
Such exponential stability is a cornerstone in the analysis of algorithms in machine learning.
arXiv Detail & Related papers (2021-01-30T08:39:38Z) - Linear-Sample Learning of Low-Rank Distributions [56.59844655107251]
We show that learning $k\times k$, rank-$r$ matrices to normalized $L_1$ distance requires $\Omega(\frac{kr}{\epsilon^2})$ samples.
We propose an algorithm that uses $\mathcal{O}(\frac{kr}{\epsilon^2}\log^2\frac{r}{\epsilon})$ samples, a number linear in the high dimension and nearly linear in the typically low rank.
arXiv Detail & Related papers (2020-09-30T19:10:32Z) - A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian
Kernel, a Precise Phase Transition, and the Corresponding Double Descent [85.77233010209368]
This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of the feature space $N$ are all large and comparable.
This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$.
arXiv Detail & Related papers (2020-06-09T02:05:40Z) - Asymptotic errors for convex penalized linear regression beyond Gaussian
matrices [23.15629681360836]
We consider the problem of learning a coefficient vector $x_0$ in $\mathbb{R}^N$ from noisy linear observations.
We provide a rigorous derivation of an explicit formula for the mean squared error.
We show that our predictions agree remarkably well with numerics even for very moderate sizes.
arXiv Detail & Related papers (2020-02-11T13:43:32Z)