Computationally Efficient Approximations for Matrix-based Renyi's
Entropy
- URL: http://arxiv.org/abs/2112.13720v1
- Date: Mon, 27 Dec 2021 14:59:52 GMT
- Title: Computationally Efficient Approximations for Matrix-based Renyi's
Entropy
- Authors: Tieliang Gong and Yuxin Dong and Shujian Yu and Hong Chen and Bo Dong
and Chen Li and Qinghua Zheng
- Abstract summary: The recently developed matrix-based Renyi's entropy enables measurement of the information in data.
Computing this quantity involves the trace of a PSD matrix $G$ raised to the power $\alpha$, i.e., $tr(G^\alpha)$.
We present computationally efficient approximations to this new entropy functional that can reduce its complexity to significantly less than $O(n^2)$.
- Score: 33.72108955447222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recently developed matrix-based Renyi's entropy enables measurement of
information in data simply using the eigenspectrum of symmetric positive
semi-definite (PSD) matrices in a reproducing kernel Hilbert space, without
estimation of the underlying data distribution. This intriguing property makes the new
information measurement widely adopted in multiple statistical inference and
learning tasks. However, the computation of such quantity involves the trace
operator on a PSD matrix $G$ to the power $\alpha$ (i.e., $tr(G^\alpha)$), with a
normal complexity of nearly $O(n^3)$, which severely hampers its practical
usage when the number of samples (i.e., $n$) is large. In this work, we present
computationally efficient approximations to this new entropy functional that
can reduce its complexity to even significantly less than $O(n^2)$. To this
end, we first develop randomized approximations to $tr(G^\alpha)$ that
transform the trace estimation into a matrix-vector multiplication problem. We
extend this strategy to arbitrary values of $\alpha$ (integer or non-integer).
We then establish the connection between the matrix-based Renyi's entropy and
PSD matrix approximation, which enables us to exploit both clustering and block
low-rank structure of $G$ to further reduce the computational cost. We
theoretically provide approximation accuracy guarantees and illustrate the
properties of different approximations. Large-scale experimental evaluations on
both synthetic and real-world data corroborate our theoretical findings,
showing promising speedup with negligible loss in accuracy.
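To make the quantities concrete, below is a minimal NumPy/SciPy sketch (not the authors' implementation) of the exact eigenspectrum-based entropy and of plain randomized trace estimators in the spirit the abstract describes: a Hutchinson-type estimator for integer $\alpha$ built from matrix-vector products, and a truncated Taylor series for non-integer $\alpha$. The Gaussian kernel, the function names, and the probe/truncation parameters are illustrative assumptions, and the sketch does not reproduce the paper's accuracy guarantees.

```python
import numpy as np
from scipy.special import binom  # generalized binomial, handles real alpha

def gram_matrix(X, sigma=1.0):
    """Trace-normalized Gaussian-kernel Gram matrix (eigenvalues sum to 1)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    return K / np.trace(K)

def renyi_entropy_exact(G, alpha):
    """O(n^3) baseline: S_alpha(G) = log2(sum_i lambda_i^alpha) / (1 - alpha)."""
    lam = np.clip(np.linalg.eigvalsh(G), 0.0, None)  # guard tiny negative eigenvalues
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def trace_power_int(G, alpha, num_probes=50, rng=None):
    """Randomized tr(G^alpha) for integer alpha >= 1: each Rademacher probe
    costs alpha matrix-vector products, so s probes cost O(alpha * s * n^2)."""
    rng = np.random.default_rng(rng)
    est = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=G.shape[0])
        w = v
        for _ in range(alpha):
            w = G @ w                      # after the loop, w = G^alpha v
        est += v @ w                       # E[v^T G^alpha v] = tr(G^alpha)
    return est / num_probes

def trace_power_series(G, alpha, terms=30, num_probes=50, rng=None):
    """Non-integer alpha: truncate x^alpha = sum_k binom(alpha, k) (x - 1)^k,
    usable here because the eigenvalues of trace-normalized G lie in [0, 1]."""
    rng = np.random.default_rng(rng)
    est = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=G.shape[0])
        w, acc = v, 0.0
        for k in range(terms + 1):
            acc += binom(alpha, k) * (v @ w)
            w = G @ w - w                  # w <- (G - I) w
        est += acc
    return est / num_probes
```

As a sanity check, for `G = gram_matrix(X)` and integer `alpha = 2`, `np.log2(trace_power_int(G, 2)) / (1 - 2)` should track `renyi_entropy_exact(G, 2)` up to the Monte-Carlo error of the probes.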
Related papers
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, within both the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Effective Minkowski Dimension of Deep Nonparametric Regression: Function
Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that the input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- the effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z) - Robust and Fast Measure of Information via Low-rank Representation [20.1194871649588]
We propose a novel measure of information, termed low-rank matrix-based Rényi's entropy.
The low-rank variant is more sensitive to informative perturbations induced by changes in the underlying distributions.
We conduct large-scale experiments to evaluate the effectiveness of this new information measure.
arXiv Detail & Related papers (2022-11-30T06:49:56Z) - Optimal Randomized Approximations for Matrix based Renyi's Entropy [16.651155375441796]
We develop random approximations for integer-order $\alpha$ cases and series approximations for non-integer $\alpha$ cases.
Large-scale simulations and real-world applications validate the effectiveness of the developed approximations.
arXiv Detail & Related papers (2022-05-16T02:24:52Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ on the robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Fast and Accurate Pseudoinverse with Sparse Matrix Reordering and
Incremental Approach [4.710916891482697]
The pseudoinverse is a generalization of the matrix inverse that has been extensively utilized in machine learning.
FastPI is a novel incremental singular value decomposition (SVD) based pseudoinverse method for sparse matrices.
We show that FastPI computes the pseudoinverse faster than other approximate methods without loss of accuracy.
arXiv Detail & Related papers (2020-11-09T07:47:10Z) - Hutch++: Optimal Stochastic Trace Estimation [75.45968495410048]
We introduce a new randomized algorithm, Hutch++, which computes a $(1 \pm \epsilon)$ approximation to $tr(A)$ for any positive semidefinite (PSD) $A$ using only matrix-vector products (a minimal sketch of the idea follows this list).
We show that it significantly outperforms Hutchinson's method in experiments.
arXiv Detail & Related papers (2020-10-19T16:45:37Z) - Compressed sensing of low-rank plus sparse matrices [3.8073142980733]
This manuscript develops similar guarantees, showing that an $m \times n$ matrix that can be expressed as the sum of a rank-$r$ matrix and an $s$-sparse matrix can be recovered by computationally tractable methods.
Results are shown for synthetic problems, dynamic-foreground/static separation, multispectral imaging, and Robust PCA.
arXiv Detail & Related papers (2020-07-18T15:36:11Z) - Phase retrieval in high dimensions: Statistical and computational phase
transitions [27.437775143419987]
We consider the problem of reconstructing a signal $\mathbf{X}^\star$ from $m$ (possibly noisy) observations.
In particular, the information-theoretic transition to perfect recovery for full-rank matrices appears at $\alpha=1$ and $\alpha=2$.
Our work provides an extensive classification of the statistical and algorithmic thresholds in high-dimensional phase retrieval.
arXiv Detail & Related papers (2020-06-09T13:03:29Z)
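Since the Hutch++ entry above is the trace-estimation primitive closest to the randomized approximations in the main paper, here is a minimal NumPy sketch of its idea (our naming and budget split, not the authors' code): spend part of the matrix-vector budget sketching the top range of $A$, trace that subspace exactly, and run plain Hutchinson only on the residual, whose trace is much smaller for PSD matrices with decaying spectra.

```python
import numpy as np

def hutchpp(matvec, n, m, rng=None):
    """Hutch++-style trace estimate for a PSD matrix A available only
    through v -> A v, using roughly m matrix-vector products in total."""
    rng = np.random.default_rng(rng)
    k = max(m // 3, 1)
    S = rng.choice([-1.0, 1.0], size=(n, k))   # sketch for the top range
    Gp = rng.choice([-1.0, 1.0], size=(n, k))  # Hutchinson probes
    AS = np.column_stack([matvec(S[:, i]) for i in range(k)])
    Q, _ = np.linalg.qr(AS)                    # orthonormal basis of range(A S)
    AQ = np.column_stack([matvec(Q[:, i]) for i in range(Q.shape[1])])
    t_top = np.trace(Q.T @ AQ)                 # exact trace on the captured subspace
    R = Gp - Q @ (Q.T @ Gp)                    # project probes off that subspace
    AR = np.column_stack([matvec(R[:, i]) for i in range(k)])
    t_rest = np.trace(R.T @ AR) / k            # Hutchinson on the residual
    return t_top + t_rest

# Hypothetical usage: estimate tr(B B^T) and compare against the exact value.
rng = np.random.default_rng(0)
B = rng.standard_normal((300, 30))
A = B @ B.T
print(hutchpp(lambda v: A @ v, n=300, m=60), np.trace(A))
```

The split mirrors the decomposition $tr(A) = tr(Q^\top A Q) + tr\big((I-QQ^\top)A(I-QQ^\top)\big)$, which holds exactly for any orthonormal $Q$; the randomness enters only through how well $Q$ captures the top eigenspace and through the residual Hutchinson estimate.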
This list is automatically generated from the titles and abstracts of the papers in this site.