Metric Transforms and Low Rank Matrices via Representation Theory of the
Real Hyperrectangle
- URL: http://arxiv.org/abs/2011.11503v2
- Date: Thu, 5 Aug 2021 03:52:41 GMT
- Title: Metric Transforms and Low Rank Matrices via Representation Theory of the
Real Hyperrectangle
- Authors: Josh Alman, Timothy Chu, Gary Miller, Shyam Narayanan, Mark Sellke,
Zhao Song
- Abstract summary: We show how to compute the eigenvectors and eigenvalues of certain matrices arising from hyperrectangles.
We then use our new technique along with these connections to prove several new structural results.
- Score: 17.808087068037985
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we develop a new technique which we call representation theory
of the real hyperrectangle, which describes how to compute the eigenvectors and
eigenvalues of certain matrices arising from hyperrectangles. We show that
these matrices arise naturally when analyzing a number of different algorithmic
tasks such as kernel methods, neural network training, natural language
processing, and the design of algorithms using the polynomial method. We then
use our new technique along with these connections to prove several new
structural results in these areas, including:
$\bullet$ A function is a positive definite Manhattan kernel if and only if
it is a completely monotone function. These kernels are widely used across
machine learning; one example is the Laplace kernel, which is popular in
machine learning for chemistry (see the numerical sketch after the abstract).
$\bullet$ A function transforms Manhattan distances to Manhattan distances if
and only if it is a Bernstein function. This completes the theory of Manhattan
to Manhattan metric transforms initiated by Assouad in 1980.
$\bullet$ A function applied entry-wise to any square matrix of rank $r$
always results in a matrix of rank $< 2^{r-1}$ if and only if it is a
polynomial of sufficiently low degree. This gives a converse to a key lemma
used by the polynomial method in algorithm design.
Our work includes a sophisticated combination of techniques from different
fields, including metric embeddings, the polynomial method, and group
representation theory.
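As a quick numerical illustration of the first and third bullets, here is a minimal sketch (assuming numpy; the point set, dimension, rank, and polynomial are arbitrary choices, not from the paper). It checks that the Laplace kernel built on Manhattan distances is positive semi-definite, and that an entry-wise polynomial keeps the rank of a low-rank matrix bounded (here at most $r^2 + r + 1$, independent of $n$) while an entry-wise exponential does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Bullet 1: exp(-t) is completely monotone, so the Laplace kernel
# exp(-||x - y||_1) should be a positive definite Manhattan kernel.
n, d = 50, 3
X = rng.normal(size=(n, d))
D1 = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)  # pairwise L1 distances
K = np.exp(-D1)                                         # Laplace kernel matrix
print("min eigenvalue:", np.linalg.eigvalsh(K).min())   # >= 0 up to roundoff

# Bullet 3: a low-degree polynomial applied entry-wise to a rank-r matrix
# keeps the rank small (rank(A∘A) <= r^2), while a non-polynomial does not.
r = 2
A = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))   # rank-r matrix
poly = A**2 + A + 1.0                                   # entry-wise degree-2 polynomial
print("rank after entry-wise polynomial:", np.linalg.matrix_rank(poly))  # <= 7
print("rank after entry-wise exp:", np.linalg.matrix_rank(np.exp(A)))    # ~ n
```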
Related papers
- Bauer's Spectral Factorization Method for Low Order Multiwavelet Filter
Design [0.6138671548064355]
We introduce a fast method for matrix spectral factorization based on Bauer's method.
We convert Bauer's method into a nonlinear matrix equation (NME).
The NME is solved by two different numerical algorithms.
arXiv Detail & Related papers (2023-12-09T00:26:52Z)
- Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles [39.10180309328293]
In this paper we revisit linear MDPs from the perspective of feature selection.
Our main result is the first computationally efficient algorithm for this problem.
We show that they do exist and can be computed efficiently via convex programming.
arXiv Detail & Related papers (2023-09-18T03:35:48Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Data-Driven Linear Complexity Low-Rank Approximation of General Kernel Matrices: A Geometric Approach [0.9453554184019107]
A kernel matrix may be defined as $K_{ij} = \kappa(x_i, y_j)$ where $\kappa(x,y)$ is a kernel function.
We seek a low-rank approximation to a kernel matrix where the sets of points $X$ and $Y$ are large.
arXiv Detail & Related papers (2022-12-24T07:15:00Z)
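For a concrete feel of the task, here is a generic randomized low-rank approximation of a kernel matrix (a Halko-style range finder), not the paper's geometric method; the Gaussian kernel, point-set sizes, and target rank are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_kernel(X, Y, gamma=0.5):
    """K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

X = rng.normal(size=(300, 2))   # point sets kept small so the demo runs fast
Y = rng.normal(size=(400, 2))
K = gaussian_kernel(X, Y)

# Randomized range finder: K ~= Q (Q^T K) with Q spanning K's numerical range.
k = 20
Q, _ = np.linalg.qr(K @ rng.normal(size=(K.shape[1], k)))
K_approx = Q @ (Q.T @ K)
print("rank-%d relative error: %.2e"
      % (k, np.linalg.norm(K - K_approx) / np.linalg.norm(K)))
```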
- Scalable First-Order Bayesian Optimization via Structured Automatic Differentiation [4.061135251278187]
We show that a wide range of kernels gives rise to structured matrices, enabling an exact $\mathcal{O}(n^2 d)$ matrix-vector multiply for gradient observations and $\mathcal{O}(n^2 d^2)$ for Hessian observations.
Our methods apply to virtually all canonical kernels and automatically extend to complex kernels, like the neural network, radial basis function network, and spectral mixture kernels.
arXiv Detail & Related papers (2022-06-16T17:59:48Z)
- Block-encoding dense and full-rank kernels using hierarchical matrices: applications in quantum numerical linear algebra [6.338178373376447]
We propose a block-encoding scheme of the hierarchical matrix structure on a quantum computer.
Our method can improve the runtime of solving quantum linear systems of dimension $N$ to $O(\kappa\,\operatorname{polylog}(\frac{N}{\varepsilon}))$.
arXiv Detail & Related papers (2022-01-27T05:24:02Z)
- Fast Differentiable Matrix Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root.
For the forward propagation, one method is to use the Matrix Taylor Polynomial (MTP).
The other method is to use Matrix Padé Approximants (MPA).
arXiv Detail & Related papers (2022-01-21T12:18:06Z)
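As context for this entry, a minimal iterative (hence differentiable) matrix square root can be sketched with the classical Newton-Schulz iteration, assuming a symmetric positive definite input; the paper's MTP and MPA forward rules are different, more efficient variants.

```python
import numpy as np

def sqrtm_newton_schulz(A, iters=20):
    """Newton-Schulz iteration for sqrt(A); converges for SPD A after
    Frobenius normalization, since then ||I - A/||A||_F|| < 1."""
    norm = np.linalg.norm(A)                 # Frobenius norm
    Y, Z = A / norm, np.eye(A.shape[0])
    for _ in range(iters):
        T = 0.5 * (3.0 * np.eye(A.shape[0]) - Z @ Y)
        Y, Z = Y @ T, T @ Z
    return Y * np.sqrt(norm)

rng = np.random.default_rng(2)
B = rng.normal(size=(8, 8))
A = B @ B.T + 8 * np.eye(8)                  # symmetric positive definite
S = sqrtm_newton_schulz(A)
print("||S @ S - A|| =", np.linalg.norm(S @ S - A))   # ~ 0
```

Since every step is a matrix product, the whole iteration is differentiable end to end, which is the property the paper's faster variants preserve.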
- A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups [115.58550697886987]
We provide a completely general algorithm for solving for the equivariant layers of matrix groups.
In addition to recovering solutions from other works as special cases, we construct multilayer perceptrons equivariant to multiple groups that have never been tackled before.
Our approach outperforms non-equivariant baselines, with applications to particle physics and dynamical systems.
arXiv Detail & Related papers (2021-04-19T17:21:54Z)
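The "solve for the equivariant layers" recipe can be sketched as a null-space computation; the cyclic-shift group below is an illustrative choice (its equivariant linear layers are exactly the circulant matrices), not an example taken from the paper.

```python
import numpy as np

n = 5
P = np.roll(np.eye(n), 1, axis=0)   # cyclic shift permutation matrix

# Row-major vec identity: (A @ W @ B).flatten() == kron(A, B.T) @ W.flatten(),
# so the equivariance constraint P W = W P becomes a linear system in vec(W):
C = np.kron(P, np.eye(n)) - np.kron(np.eye(n), P.T)
_, s, Vt = np.linalg.svd(C)
basis = Vt[np.sum(s > 1e-10):]      # null-space basis = all equivariant layers
print("equivariant layer space dimension:", basis.shape[0])   # -> 5 (circulants)

W = basis[0].reshape(n, n)          # one equivariant layer
print("W commutes with P:", np.allclose(W @ P, P @ W))
```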
- Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$.
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z)
- Quantum algorithms for spectral sums [50.045011844765185]
We propose new quantum algorithms for estimating spectral sums of positive semi-definite (PSD) matrices.
We show how the algorithms and techniques used in this work can be applied to three problems in spectral graph theory.
arXiv Detail & Related papers (2020-11-12T16:29:45Z)
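For orientation, the classical baseline for spectral-sum estimation is a stochastic trace estimator; the sketch below estimates $\operatorname{Tr}(A^3)$ with Hutchinson probes using only matrix-vector products (the matrix, probe count, and choice of $f(x) = x^3$ are arbitrary, and the paper's algorithms are quantum, not this).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
B = rng.normal(size=(n, n))
A = B @ B.T / n                              # PSD test matrix

# Hutchinson: E[z^T M z] = Tr(M) for Rademacher z, applied here to M = A^3.
m = 2000                                     # number of random probes
z = rng.choice([-1.0, 1.0], size=(n, m))
est = np.einsum('im,im->', z, A @ (A @ (A @ z))) / m
print("Hutchinson estimate:", est)
print("exact Tr(A^3):     ", np.trace(A @ A @ A))
```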
- Sketching Transformed Matrices with Applications to Natural Language Processing [76.6222695417524]
We propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix.
We show that our approach obtains small error and is efficient in both space and time.
arXiv Detail & Related papers (2020-02-23T03:07:31Z)
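A minimal classical stand-in for the idea, not the paper's algorithm: approximate $B \cdot f(A)$ by importance-sampling terms of the outer-product expansion, so only the sampled rows of the transformed matrix $f(A)$ are ever formed ($f$, the matrix sizes, and the sample count are assumptions for the demo).

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 10, 5000
B = rng.normal(size=(d, n))                  # given small matrix
A = rng.normal(size=(n, 300))                # large matrix, transformed entry-wise
f = np.tanh                                  # entry-wise transform (arbitrary)

# B @ f(A) = sum_j B[:, j] f(A)[j, :]; sample k terms with importance weights.
k = 1000
p = (B * B).sum(axis=0); p /= p.sum()        # sample by squared column norm of B
idx = rng.choice(n, size=k, p=p)
approx = (B[:, idx] / (k * p[idx])) @ f(A[idx, :])  # only k rows of f(A) formed
exact = B @ f(A)                             # dense reference, for the error only
print("relative error:",
      np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```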