Permanents of matrix ensembles: computation, distribution, and geometry
- URL: http://arxiv.org/abs/2602.10141v3
- Date: Mon, 16 Feb 2026 18:03:39 GMT
- Title: Permanents of matrix ensembles: computation, distribution, and geometry
- Authors: Igor Rivin
- Abstract summary: We use the GPU to greatly accelerate the computation of permanents over $\mathbb{C}$, $\mathbb{R}$, $\mathbb{F}_p$, and $\mathbb{Q}$. We study the permanent along geodesics on the unitary group. For the geodesic from the identity to the $n$-cycle permutation matrix, we find a universal scaling function $f(t)=\frac{1}{n}\ln|\perm(\gamma(t))|$ that is independent of $n$ in the large-$n$ limit.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We report on a computational and experimental study of permanents. On the computational side, we use the GPU to greatly accelerate the computation of permanents over $\mathbb{C},$ $\mathbb{R},$ $\mathbb{F}_p$ and $\mathbb{Q}.$ First, for Haar-distributed unitary matrices~$U$, the permanent $\perm(U)$ follows a circularly-symmetric complex Gaussian distribution $\mathcal{CN}(0,\sigma^2)$ -- we confirm this via a number of tests for $n$ up to~23 with $50{,}000$ samples. The DFT matrix permanent is an extreme outlier for every prime $n\ge 7$. In contrast, for Haar-random \emph{orthogonal} matrices~$O$, the permanent $\perm(O)$ is approximately real Gaussian but with positive excess kurtosis that decays as~$O(1/n)$, indicating slower convergence. For matrices with Gaussian entries (GUE, GOE, Ginibre), the permanent follows an $\alpha$-stable distribution with stability index $\alpha\approx 1.0$--$1.4$, well below the Gaussian value $\alpha=2$. We test Aaronson's conjecture that $|\perm(X)|^2$ is asymptotically lognormal for Gaussian~$X$: it is plausible for the complex Ginibre and GOE ensembles, but appears to fail for GUE and real Ginibre, where the $\alpha$-stable tails prevent convergence. Anti-concentration, however, holds for all Gaussian ensembles and is more robust than for Haar unitaries. Second, we study the permanent along geodesics on the unitary group. For the geodesic from the identity to the $n$-cycle permutation matrix, we find a universal scaling function $f(t)=\frac{1}{n}\ln|\perm(\gamma(t))|$ that is independent of~$n$ in the large-$n$ limit, with a midpoint value \[ \perm(\gamma({\textstyle\frac12})) = (-1)^{(n-1)/2}\cdot 2e^{-n}\bigl(1+\tfrac{1}{3n}+O(n^{-2})\bigr) \] for odd~$n$ and zero for even~$n$. We also study the geodesic from the identity to the DFT matrix.
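The two experiments in the abstract (the distribution of $\perm(U)$ for Haar-random $U$, and $f(t)$ along the $n$-cycle geodesic) can be sketched in plain NumPy. This is a minimal CPU illustration, not the authors' GPU code: the function names `perm_ryser`, `haar_unitary`, and `cycle_geodesic` are ours, the permanent is computed by the standard exponential-time Ryser inclusion-exclusion formula, Haar sampling uses the usual QR-of-Ginibre recipe with a phase correction, and the geodesic $\gamma(t)=\exp(t\log P)$ is built from the DFT eigenbasis of the $n$-cycle $P$ with the principal matrix logarithm.

```python
import numpy as np

def perm_ryser(A):
    """Permanent of a square matrix via Ryser's inclusion-exclusion formula:
    perm(A) = (-1)^n * sum over nonempty column subsets S of
              (-1)^{|S|} * prod_i sum_{j in S} A[i, j]."""
    n = A.shape[0]
    total = 0j
    for S in range(1, 1 << n):
        cols = [j for j in range(n) if (S >> j) & 1]
        total += (-1) ** len(cols) * np.prod(A[:, cols].sum(axis=1))
    return (-1) ** n * total

def haar_unitary(n, rng):
    """Haar-random unitary: QR of a complex Ginibre matrix, with each column
    of Q rescaled by the phase of the corresponding diagonal entry of R."""
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))  # scales column j by the unit phase d_j/|d_j|

def cycle_geodesic(n, t):
    """gamma(t) = exp(t log P) for the n-cycle P[i, j] = [i == j+1 mod n].
    P is diagonalized by the DFT matrix F; column k has eigenvalue
    exp(-2*pi*i*k/n), whose principal-log angle is mapped into (-pi, pi]."""
    k = np.arange(n)
    F = np.exp(2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)
    theta = np.mod(-2 * np.pi * k / n + np.pi, 2 * np.pi) - np.pi
    return (F * np.exp(1j * t * theta)) @ F.conj().T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 9  # Ryser is O(2^n poly(n)), so keep n small on the CPU
    U = haar_unitary(n, rng)
    print(abs(perm_ryser(U)))                 # one sample of |perm(U)|
    g = cycle_geodesic(n, 0.5)
    print(np.log(abs(perm_ryser(g))) / n)     # f(1/2) for the n-cycle geodesic
```

Repeating the first computation over many samples and normal-testing the real and imaginary parts reproduces the $\mathcal{CN}(0,\sigma^2)$ experiment at small $n$; sweeping $t$ in the second traces the scaling function $f(t)$.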
Related papers
- Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination [65.37519531362157]
We show that any efficient Statistical Query algorithm for this task requires VSTAT complexity at least $\tilde{\Omega}(d^{1/2}/\alpha^2)$.
arXiv Detail & Related papers (2025-10-12T15:42:44Z) - Eigenvalue distribution of the Neural Tangent Kernel in the quadratic scaling [5.142160533428576]
We compute the eigenvalue distribution of the neural tangent kernel of a two-layer neural network under a specific scaling of dimension. We describe the distribution as the free multiplicative convolution of the Marchenko--Pastur distribution with a deterministic distribution depending on $\sigma$ and $D$.
arXiv Detail & Related papers (2025-08-27T16:41:01Z) - MLPs at the EOC: Spectrum of the NTK [7.826806223782053]
We study the properties of the Neural Tangent Kernel (NTK) $\overset{\scriptscriptstyle\infty}{K}$. $\Delta_\phi = \frac{b^2}{a^2+b^2}$ determines the rate at which the condition number of the NTK matrix converges to its limit as depth increases.
arXiv Detail & Related papers (2025-01-22T21:12:51Z) - The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$.
As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $\Omega(k^{2/p}n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Provably learning a multi-head attention layer [55.2904547651831]
The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z) - Universality for the global spectrum of random inner-product kernel matrices in the polynomial regime [12.221087476416056]
In this paper, we show that this phenomenon is universal, holding as soon as $X$ has i.i.d. entries with all finite moments.
In the case of non-integer $\ell$, the Marchenko--Pastur term disappears.
arXiv Detail & Related papers (2023-10-27T17:15:55Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random-feature technique, called Ternary Random Features (TRF).
The computation of the proposed random features requires no multiplication and a factor of $b$ fewer bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $\mathrm{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A\in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_*/\sqrt{n})$.
Here, we show that $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F$
arXiv Detail & Related papers (2021-09-06T14:21:43Z) - Kernel Thinning [26.25415159542831]
Kernel thinning is a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning.
We derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels.
arXiv Detail & Related papers (2021-05-12T17:56:42Z) - Algorithms and Hardness for Linear Algebra on Geometric Graphs [14.822517769254352]
We show that the exponential dependence on the dimension $d$ in the celebrated fast multipole method of Greengard and Rokhlin cannot be improved.
This is the first formal limitation proven about fast multipole methods.
arXiv Detail & Related papers (2020-11-04T18:35:02Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.