Computationally Efficient Approximations for Matrix-based Renyi's
Entropy
- URL: http://arxiv.org/abs/2112.13720v1
- Date: Mon, 27 Dec 2021 14:59:52 GMT
- Title: Computationally Efficient Approximations for Matrix-based Renyi's
Entropy
- Authors: Tieliang Gong and Yuxin Dong and Shujian Yu and Hong Chen and Bo Dong
and Chen Li and Qinghua Zheng
- Abstract summary: The recently developed matrix-based Renyi's entropy enables measurement of the information in data.
Computing this quantity involves the trace of a PSD matrix $G$ raised to the power $\alpha$, i.e., $tr(G^\alpha)$.
We present computationally efficient approximations to this new entropy functional that can reduce its complexity to significantly less than $O(n^2)$.
- Score: 33.72108955447222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recently developed matrix-based Renyi's entropy enables measurement of
information in data simply using the eigenspectrum of symmetric positive
semi-definite (PSD) matrices in a reproducing kernel Hilbert space, without
estimation of the underlying data distribution. This intriguing property makes the new
information measurement widely adopted in multiple statistical inference and
learning tasks. However, the computation of such quantity involves the trace
operator on a PSD matrix $G$ to the power $\alpha$ (i.e., $tr(G^\alpha)$), with a
normal complexity of nearly $O(n^3)$, which severely hampers its practical
usage when the number of samples (i.e., $n$) is large. In this work, we present
computationally efficient approximations to this new entropy functional that
can reduce its complexity to even significantly less than $O(n^2)$. To this
end, we first develop randomized approximations to $tr(G^\alpha)$ that
transform the trace estimation into a matrix-vector multiplication problem. We
extend this strategy to arbitrary values of $\alpha$ (integer or non-integer).
We then establish the connection between the matrix-based Renyi's entropy and
PSD matrix approximation, which enables us to exploit both clustering and block
low-rank structure of $G$ to further reduce the computational cost. We
theoretically provide approximation accuracy guarantees and illustrate the
properties of different approximations. Large-scale experimental evaluations on
both synthetic and real-world data corroborate our theoretical findings,
showing promising speedup with negligible loss in accuracy.
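To make the quantities concrete, below is a minimal NumPy/SciPy sketch (not the authors' implementation) of the exact eigenspectrum-based entropy and of plain randomized trace estimators in the spirit the abstract describes: a Hutchinson-type estimator for integer $\alpha$ built from matrix-vector products, and a truncated Taylor series for non-integer $\alpha$. The Gaussian kernel, the function names, and the probe/truncation parameters are illustrative assumptions, and the sketch does not reproduce the paper's accuracy guarantees.

```python
import numpy as np
from scipy.special import binom  # generalized binomial, handles real alpha

def gram_matrix(X, sigma=1.0):
    """Trace-normalized Gaussian-kernel Gram matrix (eigenvalues sum to 1)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    return K / np.trace(K)

def renyi_entropy_exact(G, alpha):
    """O(n^3) baseline: S_alpha(G) = log2(sum_i lambda_i^alpha) / (1 - alpha)."""
    lam = np.clip(np.linalg.eigvalsh(G), 0.0, None)  # guard tiny negative eigenvalues
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def trace_power_int(G, alpha, num_probes=50, rng=None):
    """Randomized tr(G^alpha) for integer alpha >= 1: each Rademacher probe
    costs alpha matrix-vector products, so s probes cost O(alpha * s * n^2)."""
    rng = np.random.default_rng(rng)
    est = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=G.shape[0])
        w = v
        for _ in range(alpha):
            w = G @ w                      # after the loop, w = G^alpha v
        est += v @ w                       # E[v^T G^alpha v] = tr(G^alpha)
    return est / num_probes

def trace_power_series(G, alpha, terms=30, num_probes=50, rng=None):
    """Non-integer alpha: truncate x^alpha = sum_k binom(alpha, k) (x - 1)^k,
    usable here because the eigenvalues of trace-normalized G lie in [0, 1]."""
    rng = np.random.default_rng(rng)
    est = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=G.shape[0])
        w, acc = v, 0.0
        for k in range(terms + 1):
            acc += binom(alpha, k) * (v @ w)
            w = G @ w - w                  # w <- (G - I) w
        est += acc
    return est / num_probes
```

As a sanity check, for `G = gram_matrix(X)` and integer `alpha = 2`, `np.log2(trace_power_int(G, 2)) / (1 - 2)` should track `renyi_entropy_exact(G, 2)` up to the Monte-Carlo error of the probes.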
Related papers
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, within both the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Effective Minkowski Dimension of Deep Nonparametric Regression: Function
Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that the input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- the effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z) - Robust and Fast Measure of Information via Low-rank Representation [20.1194871649588]
We propose a novel measure of information, termed low-rank matrix-based Rényi's entropy.
The low-rank variant is more sensitive to informative perturbations induced by changes in the underlying distributions.
We conduct large-scale experiments to evaluate the effectiveness of this new information measure.
arXiv Detail & Related papers (2022-11-30T06:49:56Z) - Optimal Randomized Approximations for Matrix based Renyi's Entropy [16.651155375441796]
We develop random approximations for integer-order $\alpha$ cases and series approximations for non-integer $\alpha$ cases.
Large-scale simulations and real-world applications validate the effectiveness of the developed approximations.
arXiv Detail & Related papers (2022-05-16T02:24:52Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ on the robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Fast and Accurate Pseudoinverse with Sparse Matrix Reordering and
Incremental Approach [4.710916891482697]
The pseudoinverse is a generalization of the matrix inverse that has been extensively utilized in machine learning.
FastPI is a novel incremental singular value decomposition (SVD) based pseudoinverse method for sparse matrices.
We show that FastPI computes the pseudoinverse faster than other approximate methods without loss of accuracy.
arXiv Detail & Related papers (2020-11-09T07:47:10Z) - Hutch++: Optimal Stochastic Trace Estimation [75.45968495410048]
We introduce a new randomized algorithm, Hutch++, which computes a $(1 \pm \epsilon)$ approximation to $tr(A)$ for any positive semidefinite (PSD) $A$ using only matrix-vector products (a minimal sketch of the idea follows this list).
We show that it significantly outperforms Hutchinson's method in experiments.
arXiv Detail & Related papers (2020-10-19T16:45:37Z) - Compressed sensing of low-rank plus sparse matrices [3.8073142980733]
This manuscript develops similar guarantees, showing that an $m \times n$ matrix that can be expressed as the sum of a rank-$r$ matrix and an $s$-sparse matrix can be recovered by computationally tractable methods.
Results are shown for synthetic problems, dynamic-foreground/static separation, multispectral imaging, and Robust PCA.
arXiv Detail & Related papers (2020-07-18T15:36:11Z) - Phase retrieval in high dimensions: Statistical and computational phase
transitions [27.437775143419987]
We consider the problem of reconstructing a signal $\mathbf{X}^\star$ from $m$ (possibly noisy) observations.
In particular, the information-theoretic transition to perfect recovery for full-rank matrices appears at $\alpha=1$ and $\alpha=2$.
Our work provides an extensive classification of the statistical and algorithmic thresholds in high-dimensional phase retrieval.
arXiv Detail & Related papers (2020-06-09T13:03:29Z)
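Since the Hutch++ entry above is the trace-estimation primitive closest to the randomized approximations in the main paper, here is a minimal NumPy sketch of its idea (our naming and budget split, not the authors' code): spend part of the matrix-vector budget sketching the top range of $A$, trace that subspace exactly, and run plain Hutchinson only on the residual, whose trace is much smaller for PSD matrices with decaying spectra.

```python
import numpy as np

def hutchpp(matvec, n, m, rng=None):
    """Hutch++-style trace estimate for a PSD matrix A available only
    through v -> A v, using roughly m matrix-vector products in total."""
    rng = np.random.default_rng(rng)
    k = max(m // 3, 1)
    S = rng.choice([-1.0, 1.0], size=(n, k))   # sketch for the top range
    Gp = rng.choice([-1.0, 1.0], size=(n, k))  # Hutchinson probes
    AS = np.column_stack([matvec(S[:, i]) for i in range(k)])
    Q, _ = np.linalg.qr(AS)                    # orthonormal basis of range(A S)
    AQ = np.column_stack([matvec(Q[:, i]) for i in range(Q.shape[1])])
    t_top = np.trace(Q.T @ AQ)                 # exact trace on the captured subspace
    R = Gp - Q @ (Q.T @ Gp)                    # project probes off that subspace
    AR = np.column_stack([matvec(R[:, i]) for i in range(k)])
    t_rest = np.trace(R.T @ AR) / k            # Hutchinson on the residual
    return t_top + t_rest

# Hypothetical usage: estimate tr(B B^T) and compare against the exact value.
rng = np.random.default_rng(0)
B = rng.standard_normal((300, 30))
A = B @ B.T
print(hutchpp(lambda v: A @ v, n=300, m=60), np.trace(A))
```

The split mirrors the decomposition $tr(A) = tr(Q^\top A Q) + tr\big((I-QQ^\top)A(I-QQ^\top)\big)$, which holds exactly for any orthonormal $Q$; the randomness enters only through how well $Q$ captures the top eigenspace and through the residual Hutchinson estimate.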
This list is automatically generated from the titles and abstracts of the papers in this site.